This document provides an overview of the two main Google Cloud Platform (GCP) services leveraged in this project: BigQuery and Dataplex. These services work together to enable efficient data storage, processing, and quality monitoring.

  • BigQuery is GCP's fully managed, serverless data warehouse designed for fast SQL-based analysis on large datasets. It is used to store, organize, and process data ingested from various source systems. BigQuery serves as the central data repository for this project.
  • Dataplex is GCP's intelligent data fabric solution, which allows for unified data management, governance, and quality control across distributed data. In this project, Dataplex is responsible for the execution of data quality rules, ensuring the consistency and reliability of data processed and stored in BigQuery.

Project Hierarchy

The project structure within GCP has been organized to ensure clear separation between domains and environments, while supporting an efficient data ingestion and data quality validation process.

Domain-specific Projects

For each domain within the Data Quality Monitoring Tool (DQMT), a dedicated GCP project exists for every environment:

  • Development (dev)

  • Testing (test)

  • Pre-Production (ppd)

  • Production (prod)

Each project serves as the location where:

  • The data is ingested from source systems.
  • The initial views used for data processing and quality checks are created.

Domain Projects List

| Domain \ Environment | Development | Testing | Pre-Production | Production |
|---|---|---|---|---|
| Human Resources | prj-data-dm-hr-dev | prj-data-dm-hr-test | prj-data-dm-hr-ppd | prj-data-dm-hr-prod |
| Structure & Shared | prj-data-dm-structure-dev | prj-data-dm-structure-test | prj-data-dm-structure-ppd | prj-data-dm-structure-prod |
| Finance | prj-data-dm-finance-dev | prj-data-dm-finance-test | prj-data-dm-finance-ppd | prj-data-dm-finance-prod |
| Marketing | prj-data-dm-marketing-dev | prj-data-dm-marketing-test | prj-data-dm-marketing-ppd | prj-data-dm-marketing-prod |
| Procurement | prj-data-dm-procurement-dev | prj-data-dm-procurement-test | prj-data-dm-procurement-ppd | prj-data-dm-procurement-prod |
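
The project IDs listed above follow a regular pattern, prj-data-dm-<domain>-<environment>. As a minimal sketch, the pattern could be captured in a small helper so that scripts do not hard-code each ID. The function and constant names below are hypothetical and not part of the project. The domain slugs and environment suffixes are taken from the list above.

```python
# Environment suffixes and domain slugs as they appear in the project list.
ENVIRONMENTS = ("dev", "test", "ppd", "prod")
DOMAINS = {
    "Human Resources": "hr",
    "Structure & Shared": "structure",
    "Finance": "finance",
    "Marketing": "marketing",
    "Procurement": "procurement",
}


def domain_project_id(domain, env):
    """Build a domain project ID (hypothetical helper, not an official API)."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"prj-data-dm-{DOMAINS[domain]}-{env}"


def all_domain_projects():
    """List every domain project ID, one per domain and environment."""
    return [domain_project_id(d, e) for d in DOMAINS for e in ENVIRONMENTS]
```

For example, domain_project_id("Finance", "ppd") yields the Finance Pre-Production project ID shown in the list. Unknown domains or environments raise an error instead of producing a plausible-looking but nonexistent ID.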

Data Quality & Final Views Projects

In addition to the domain-specific projects, a separate set of projects is used to:

  • Import the final views generated by the domain projects.

  • Define and execute data quality rules through Dataplex.

| Environment | Project |
|---|---|
| Development | prj-data-dq-selfservice-dev |
| Testing | prj-data-dq-selfservice-test |
| Pre-Production | prj-data-dq-selfservice-ppd |
| Production | prj-data-dq-selfservice-prod |
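
Because the domain projects and the data quality projects share the same environment suffixes, the data quality project that matches a given domain project can be derived from its ID. The following is a minimal sketch under that assumption. The helper names are hypothetical, and the sketch assumes that final views are imported into the data quality project of the same environment.

```python
# Prefix and environment suffixes as listed for the data quality projects.
DQ_PROJECT_PREFIX = "prj-data-dq-selfservice"
ENVIRONMENTS = ("dev", "test", "ppd", "prod")


def dq_project_id(env):
    """Build the data quality project ID for an environment (hypothetical helper)."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{DQ_PROJECT_PREFIX}-{env}"


def dq_project_for_domain_project(domain_project):
    """Map a domain project ID to the data quality project of the same environment.

    Assumes the environment is the last dash-separated segment of the ID.
    """
    env = domain_project.rsplit("-", 1)[-1]
    return dq_project_id(env)
```

For example, dq_project_for_domain_project("prj-data-dm-hr-test") returns "prj-data-dq-selfservice-test".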

Project Support Teams

The implementation and maintenance of the DQMT solution are supported by a dedicated team whose members each contribute specialized skills in a different area of the project:

| Name | Role | Scope |
|---|---|---|
| Ahmed Elsayed | Data Architect | Architecture and design of data pipelines and models |
| Maria João Pimenta | Data Engineer | Data ingestion, transformation, and automation |
| Ram Atirajyam | Data Engineer | Data ingestion, transformation, and automation |
| Ibrahim Mansey | Visualization Engineer | Data visualization and dashboard development |
| Mohamed Hazem | Visualization Engineer | Data visualization and dashboard development |
| Rawan Shehab | Functional Analyst | Business analysis and functional requirements |

Project Access and Service Accounts

Google Groups

The following Google Groups have access to the DQMT GCP projects, organized by role:

| Group | Purpose | Email |
|---|---|---|
| Data Architects Group | Access for Data Architects | gcp-da-prj-data-dq-selfservice-nonprod@solvay.com |
| Data Engineers Group | Access for Data Engineers | gcp-de-prj-data-dq-selfservice-nonprod@solvay.com |
| Data Analysts / Business Analysts | No specific group | |
| Data Developers Group | Access for Data Developers | gcp-dv-prj-data-dq-selfservice@solvay.com |

Note: No new Google Groups were specifically created for this project.


Service Accounts

The following Service Accounts are used within the DQMT project for process automation and integration:

BigQuery