This document provides an overview of the two main Google Cloud Platform (GCP) services leveraged in this project: BigQuery and Dataplex. These services work together to enable efficient data storage, processing, and quality monitoring.

Project Hierarchy

The project structure within GCP has been organized to ensure clear separation between domains and environments, while supporting an efficient data ingestion and data quality validation process.

Domain-specific Projects

For each domain within the Data Quality Monitoring Tool (DQMT), a dedicated GCP project exists for every environment: Development, Testing, Pre-Production, and Production.

Each project serves as the location where:

Domain Projects List

| Domain \ Environment | Development | Testing | Pre-Production | Production |
|---|---|---|---|---|
| Human Resources | prj-data-dm-hr-dev | prj-data-dm-hr-test | prj-data-dm-hr-ppd | prj-data-dm-hr-prod |
| Structure & Shared | prj-data-dm-structure-dev | prj-data-dm-structure-test | prj-data-dm-structure-ppd | prj-data-dm-structure-prod |
| Finance | prj-data-dm-finance-dev | prj-data-dm-finance-test | prj-data-dm-finance-ppd | prj-data-dm-finance-prod |
| Marketing | prj-data-dm-marketing-dev | prj-data-dm-marketing-test | prj-data-dm-marketing-ppd | prj-data-dm-marketing-prod |
| Procurement | prj-data-dm-procurement-dev | prj-data-dm-procurement-test | prj-data-dm-procurement-ppd | prj-data-dm-procurement-prod |
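
The project IDs listed for the domains follow a regular pattern, `prj-data-dm-<domain>-<environment>`. The Python sketch below reproduces that pattern. It is inferred from the listed IDs rather than from a documented naming standard, and the function name `domain_project_id` and both dictionaries are hypothetical helpers introduced only for illustration.

```python
# Domain and environment slugs as they appear in the listed project IDs.
DOMAIN_SLUGS = {
    "Human Resources": "hr",
    "Structure & Shared": "structure",
    "Finance": "finance",
    "Marketing": "marketing",
    "Procurement": "procurement",
}
ENV_SLUGS = {
    "Development": "dev",
    "Testing": "test",
    "Pre-Production": "ppd",
    "Production": "prod",
}


def domain_project_id(domain: str, environment: str) -> str:
    """Build a domain project ID (hypothetical helper; pattern inferred from the list)."""
    return f"prj-data-dm-{DOMAIN_SLUGS[domain]}-{ENV_SLUGS[environment]}"


print(domain_project_id("Human Resources", "Production"))  # prj-data-dm-hr-prod
```

An unknown domain or environment raises `KeyError`, which keeps a typo from silently producing an ID for a project that does not exist.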

Data Quality & Final Views Projects

In addition to the domain-specific projects, a separate set of projects is used to:

| Environment | Project |
|---|---|
| Development | prj-data-dq-selfservice-dev |
| Testing | prj-data-dq-selfservice-test |
| Pre-Production | prj-data-dq-selfservice-ppd |
| Production | prj-data-dq-selfservice-prod |

Project Support Teams

The implementation and maintenance of the DQMT solution is supported by a dedicated team, with each member contributing specialized skills in a different area of the project:

| Name | Role | Scope |
|---|---|---|
| Ahmed Elsayed | Data Architect | Architecture and design of data pipelines and models |
| Maria João Pimenta | Data Engineer | Data ingestion, transformation, and automation |
| Ram Atirajyam | Data Engineer | Data ingestion, transformation, and automation |
| Ibrahim Mansey | Visualization Engineer | Data visualization and dashboard development |
| Mohamed Hazem | Visualization Engineer | Data visualization and dashboard development |
| Rawan Shehab | Functional Analyst | Business analysis and functional requirements |

Project Access and Service Accounts

Google Groups

The following Google Groups have access to the DQMT GCP projects, organized by role:

| Group | Purpose | Email |
|---|---|---|
| Data Architects Group | Access for Data Architects | gcp-da-prj-data-dq-selfservice-nonprod@solvay.com |
| Data Engineers Group | Access for Data Engineers | gcp-de-prj-data-dq-selfservice-nonprod@solvay.com |
| Data Analysts / Business Analysts | No specific group | |
| Data Developers Group | Access for Data Developers | gcp-dv-prj-data-dq-selfservice@solvay.com |

Note: No new Google Groups were specifically created for this project.


Service Accounts

The following Service Accounts are used within the DQMT project for process automation and integration:

| Service Account | Description |
|---|---|
| sbs-is-appli-qlikview.support@solvay.com | QlikView integration and support |
| sa-talend@prj-data-dq-selfservice-dev.iam.gserviceaccount.com | Talend jobs execution |
| sa-cloudfunction@prj-data-dq-selfservice-dev.iam.gserviceaccount.com | Cloud Functions automation |
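
User-managed GCP service accounts use the email form `NAME@PROJECT_ID.iam.gserviceaccount.com`, so the owning project can be read directly from the address. The sketch below shows this for the accounts listed in this section; the helper name `service_account_project` is hypothetical and is not part of any Google library.

```python
from typing import Optional

# Domain suffix used by user-managed GCP service accounts.
SA_SUFFIX = ".iam.gserviceaccount.com"


def service_account_project(email: str) -> Optional[str]:
    """Return the owning project ID, or None if the email is not in service-account form."""
    name, _, domain = email.partition("@")
    if name and domain.endswith(SA_SUFFIX):
        return domain[: -len(SA_SUFFIX)]
    return None


for account in (
    "sbs-is-appli-qlikview.support@solvay.com",
    "sa-talend@prj-data-dq-selfservice-dev.iam.gserviceaccount.com",
    "sa-cloudfunction@prj-data-dq-selfservice-dev.iam.gserviceaccount.com",
):
    print(account, "->", service_account_project(account))
```

For the listed accounts, the Talend and Cloud Functions addresses both resolve to `prj-data-dq-selfservice-dev`, while the QlikView address is an ordinary `solvay.com` address and yields `None`.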

BigQuery in the Data Quality Monitoring Project

BigQuery:


  -- Populate DIM_date table
  -- Populate DIM_quality_rule table
  -- Populate FACT_data_quality table
  -- Populate FACT_failed_records table


Data flow:


ProjectName:


prj-data-dq-selfservice-prod

STG Schemas:

STG Schema1:

STG Schema2:

STG Schema3:

Table list:


Data Ocean Schemas:

| GCP Project | STG Table | ODS Table |
|---|---|---|
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpJob | ODS_SFC_0000_F001_F_D_EmpJob |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpEmployment | ODS_SFC_0000_F001_F_D_EmpEmployment |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpEmploymentTermination | ODS_SFC_0000_F001_F_D_EmpEmploymentTermination |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpPayCompRecurring | ODS_SFC_0000_F001_F_D_EmpPayCompRecurring |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_EmpCompensation | ODS_SFC_0000_F001_F_W_EmpCompensation |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_FOPayComponent | ODS_SFC_0000_F001_F_D_FOPayComponent |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpJobRelationships | ODS_SFC_0000_F001_F_D_EmpJobRelationships |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerEmail | ODS_SFC_0000_F001_F_D_PerEmail |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerPerson | ODS_SFC_0000_F001_F_D_PerPerson |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerPersonal | ODS_SFC_0000_F001_F_D_PerPersonal |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOEventReason | ODS_SFC_0000_F001_F_W_FOEventReason |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_UserAccount | ODS_SFC_0000_F001_F_W_UserAccount |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOLocationGroup | ODS_SFC_0000_F001_F_W_FOLocationGroup |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOCompany | ODS_SFC_0000_F001_F_W_FOCompany |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOCostCenter | ODS_SFC_0000_F001_F_W_FOCostCenter |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOBusinessUnit | ODS_SFC_0000_F001_F_W_FOBusinessUnit |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_PickListValueV2 | ODS_SFC_0000_F001_F_W_PickListValueV2 |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_Position | ODS_SFC_0000_F001_F_W_Position |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_EmpWorkPermit | ODS_SFC_0000_F001_F_W_EmpWorkPermit |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOLocation | ODS_SFC_0000_F001_F_W_FOLocation |
| prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_User | ODS_SFC_0000_F001_F_W_User |
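
In every STG/ODS pair listed above, the ODS name equals the STG name with the `STG` prefix swapped for `ODS` and one of the two `0000` segments dropped. The Python sketch below encodes that observation. It is an assumption drawn only from the listed pairs, not a documented naming rule, and `ods_table_name` is a hypothetical helper. Because the two `0000` segments are identical, the choice of which one to drop (here, the fourth segment) is arbitrary.

```python
def ods_table_name(stg_table: str) -> str:
    """Derive the ODS table name from an STG table name (rule inferred from the listed pairs)."""
    parts = stg_table.split("_")
    if parts[0] != "STG" or len(parts) < 5:
        raise ValueError(f"Unexpected STG table name: {stg_table!r}")
    # Keep segments 2-3, skip the fourth segment, keep the rest, and swap the prefix.
    return "_".join(["ODS", *parts[1:3], *parts[4:]])


print(ods_table_name("STG_SFC_0000_0000_F001_F_D_EmpJob"))
# ODS_SFC_0000_F001_F_D_EmpJob
```

A check like this can flag table pairs that drift from the pattern, for example when a new STG table is added without its ODS counterpart.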



Reporting Schemas: