Project Hierarchy in GCP (Org/Folder/Project Name):

  • prj-data-dq-selfservice-test
  • prj-data-dq-selfservice-dev
  • prj-data-dq-selfservice-ppd
  • prj-data-dq-selfservice-prod
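
The four projects follow a single naming pattern, so fully qualified BigQuery references can be derived from the environment suffix. The following is a minimal sketch under two assumptions that this page does not confirm: that the Project ID equals the Project Name in every environment (the page states this only for dev), and that each environment contains the same datasets. The helper names `project_id` and `table_ref` are hypothetical.

```python
# Illustrative only: helper names are hypothetical, not part of the project.
BASE_NAME = "prj-data-dq-selfservice"
ENVIRONMENTS = ("test", "dev", "ppd", "prod")


def project_id(env: str) -> str:
    """Build the project ID for an environment suffix.

    Assumes the Project ID equals the Project Name in every environment.
    """
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{BASE_NAME}-{env}"


def table_ref(env: str, dataset: str, table: str) -> str:
    """Return a backtick-quoted `project.dataset.table` reference.

    Quoting the whole path is needed because the project ID contains hyphens.
    """
    return f"`{project_id(env)}.{dataset}.{table}`"


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(table_ref(env, "DM", "FACT_data_quality"))
```

Quoting the full path also covers object names that contain hyphens, such as some of the profile scan tables.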

GCP Project Overview:

  • Project Name: prj-data-dq-selfservice-dev
  • Project ID: prj-data-dq-selfservice-dev
  • Project Description:
  • Project Owner: 
  • Project purpose: Provide data quality KPI dashboards, insights, and failed-record details for the different scopes and domains across Solvay (Human Resources, Marketing and Sales, Finance, Structured and Shared, Procurement, Supply Chain).
  • Project support teams (services scope):
    • Ahmed Elsayed - Data Architect
    • Maria Joao Pimenta - Data Engineer
    • Ram Atirajyam - Data Engineer
    • Ibrahim Mansey - Visualization Engineer
    • Mohamed Hazem - Visualization Engineer
    • Rawan Shehab - Functional Analyst


GCP Resources/Services – Used in project:

  • Dataplex
  • BigQuery
  • Cloud Storage (GCS buckets)
  • Query Scheduler

Google Groups:

  1. Google Groups having access to this project:
  2. Google Groups newly created for this project:
      • None

Service Accounts used by this project:

GCP Buckets:   

  • Bucket name, Folder, Objects – Path etc.
  • Retention period
  • Access policies

BigQuery:

  • Datasets:
    • DM
    • DPL
    • DataOcean_dataquality_kpi
    • Dataplex_profiles_scans
  • Tables:
    • DM:
      • DIM_date
      • DIM_domain
      • DIM_kpi_dimension
      • DIM_quality_rule
      • Dataplex_quality
      • FACT_data_quality
      • FACT_failed_records
    • DPL: (Views)
    • DataOcean_dataquality_kpi: (Views)
    • Dataplex_profiles_scans:
      • EmpBusiness-scan
      • PositionJobInfo
      • businessunit
      • empJobPositionJoin
      • empLocGroup_scan
      • empcomp-scan
      • empjob_profile
      • emploc
  • SQL Queries/views – Logic.
    • DM: (Tables)
    • DPL:
      • V_DIM_DATE
      • V_DIM_DOMAIN
      • V_DIM_KPI_DIMENSION
      • V_FACT_QUALITY
      • V_RULE_QUALITY
      • V_data_quality_metrics_dev
    • DataOcean_dataquality_kpi:
      • V_EmpJobRelationships
      • V_EmpWorkPermit
      • V_FOLocation
      • V_LocationGroup
      • V_PositionJobInfo
      • V_User
      • V_businessunit
      • V_company
      • V_costCenter
      • V_empJobCC
      • V_empLocGroup
      • V_emp_compensation_job
      • V_position
      • V_ActiveEmployeeInActiveLegalEntity
      • V_BusinessUnit
      • V_CcHrFin
      • V_Company
      • V_EmpBusiness
      • V_EmpCompPay
      • V_EmpCompensation
      • V_EmpJob
      • V_EmpJobCompPay
    • Dataplex_profiles_scans: (Partition Tables)


  • Routines (Stored Procedures):
    • DM.RT_DPtoDMmapping_Datespecific: the main mapping routine. It populates the data model (DM) from the latest weekly runs.

  -- Populate DIM_date table
  -- Populate DIM_quality_rule table
  -- Populate FACT_data_quality table
  -- Populate FACT_failed_records table

    • RT_DPtoDMmapping_specific: maps a specific rule when an on-demand run is triggered.

  -- Populate DIM_date table
  -- Populate DIM_quality_rule table
  -- Populate FACT_data_quality table
  -- Populate FACT_failed_records table
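
The choice between the two routines can be sketched as a small lookup keyed on the run type. This is an illustration only: the run-type labels `weekly` and `on_demand`, the helper name `routine_reference`, and the placement of RT_DPtoDMmapping_specific in the DM dataset are assumptions. The routines' parameter lists are not documented on this page, so the sketch stops at the routine reference and does not build a full CALL statement.

```python
# Illustrative only: run-type labels and the helper name are hypothetical.

# Tables populated by both routines, in the order listed on this page.
POPULATE_ORDER = [
    "DIM_date",
    "DIM_quality_rule",
    "FACT_data_quality",
    "FACT_failed_records",
]

# Assumption: both routines live in the DM dataset
# (the page states this only for RT_DPtoDMmapping_Datespecific).
ROUTINES = {
    "weekly": "RT_DPtoDMmapping_Datespecific",
    "on_demand": "RT_DPtoDMmapping_specific",
}


def routine_reference(project: str, run_type: str) -> str:
    """Return the backtick-quoted reference of the routine for a run type."""
    if run_type not in ROUTINES:
        raise ValueError(f"unknown run type: {run_type!r}")
    return f"`{project}.DM.{ROUTINES[run_type]}`"


if __name__ == "__main__":
    print(routine_reference("prj-data-dq-selfservice-dev", "weekly"))
    for table in POPULATE_ORDER:
        print("  populates", table)
```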


Data flow:

    • Data Quality Check: Sources → Talend → Data Ocean → prj-data-dq-selfservice-*** → DataOcean_dataquality_kpi (dataset) → DataOcean_dataquality_kpi views (source views) → Dataplex → Dataplex_quality (table) → RT_DPtoDMmapping_specific (stored procedure) → DIM_date, DIM_domain, DIM_kpi_dimension, DIM_quality_rule, Dataplex_quality, FACT_data_quality, FACT_failed_records (DM tables) → DPL views → QlikSense
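
For impact analysis, the flow above can be held as an ordered list so that everything downstream of a given stage can be looked up. This is a minimal sketch: the stage labels are taken from the flow, the seven DM tables are collapsed into a single "DM tables" stage for brevity, and the helper name `downstream_of` is hypothetical.

```python
# Illustrative only: an ordered model of the data flow described above.
STAGES = [
    "Sources",
    "Talend",
    "Data Ocean",
    "prj-data-dq-selfservice-***",
    "DataOcean_dataquality_kpi (dataset)",
    "DataOcean_dataquality_kpi views",
    "Dataplex",
    "Dataplex_quality (table)",
    "RT_DPtoDMmapping_specific (stored procedure)",
    "DM tables",
    "DPL views",
    "QlikSense",
]


def downstream_of(stage: str) -> list[str]:
    """Return every stage that runs after the given one.

    Raises ValueError if the stage is not part of the flow.
    """
    idx = STAGES.index(stage)
    return STAGES[idx + 1:]


if __name__ == "__main__":
    # Example: what is affected if a Dataplex scan fails?
    for name in downstream_of("Dataplex"):
        print(name)
```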


ProjectName:

  • prj-data-dq-selfservice-test
  • prj-data-dq-selfservice-dev
  • prj-data-dq-selfservice-ppd
  • prj-data-dq-selfservice-prod


prj-data-dq-selfservice-prod

STG Schemas:

STG Schema1:

STG Schema2:

STG Schema3:

Tablelist:


Data Ocean Schemas:

DS_xxx_yyy1 

DS_xxx_yyy2 

DS_xxx_yyy3


Reporting Schemas: