Below is the high-level architecture for the Data Quality KPIs monitoring tool.

A Master Talend job orchestrates the entire data processing pipeline:
1. Data Ingestion
The process begins with the ingestion of data for each domain from various SAP source systems:
- SSR data is sourced from SAP WP1 and SAP PF1
- FIN data is sourced from SAP BW, SAP WP1, and SAP PF1
- MRK data is sourced from SAP BW, SAP WP1, and SAP PF1
Each table from each domain has its own dedicated Talend job responsible for ingesting and loading the data into the GCP BigQuery Data Ocean, specifically the following datasets:
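The one-job-per-table ingestion pattern above can be sketched as follows. This is a minimal illustration only: the `DOMAIN_SOURCES` mapping restates the source systems listed above, but the `JB_INGEST_*` job-naming convention is hypothetical, not the actual Talend job names.

```python
# Domain -> SAP source systems, as described in the ingestion section.
DOMAIN_SOURCES = {
    "SSR": ["SAP WP1", "SAP PF1"],
    "FIN": ["SAP BW", "SAP WP1", "SAP PF1"],
    "MRK": ["SAP BW", "SAP WP1", "SAP PF1"],
}

def ingestion_jobs(domain: str, tables: list[str]) -> list[str]:
    """One dedicated ingestion job per table of a domain.

    The JB_INGEST_<domain>_<table> naming scheme is hypothetical.
    """
    return [f"JB_INGEST_{domain}_{table}" for table in tables]
```

Each generated job name stands for one Talend job that loads its table into the GCP BigQuery Data Ocean.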
2. Data Processing and Transformation
After ingestion, two routines are executed to populate the Data Model (DM) Dimension tables:
- Routine prj-data-dq-selfservice-[env].DM.insert_DIM_Domain populates the prj-data-dq-selfservice-[env].DM.DIM_domain table
- Routine prj-data-dq-selfservice-[env].DM.insert_DIM_kpi_dimension populates the prj-data-dq-selfservice-[env].DM.DIM_kpi_dimension table
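Invoking these two routines amounts to issuing BigQuery `CALL` statements against the environment-specific project. The sketch below only builds the statement strings (it assumes the routines take no arguments; executing them would additionally require an authenticated BigQuery client):

```python
ENVS = ("dev", "test", "ppd", "prod")

def dim_routine_calls(env: str) -> list[str]:
    """Build the CALL statements for the two DM dimension routines.

    Assumes the routines take no arguments.
    """
    if env not in ENVS:
        raise ValueError(f"unknown env: {env}")
    project = f"prj-data-dq-selfservice-{env}"
    return [
        f"CALL `{project}.DM.insert_DIM_Domain`();",
        f"CALL `{project}.DM.insert_DIM_kpi_dimension`();",
    ]
```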
Also, views including only the necessary data are created in the following datasets:
These views are the sole source for the quality checks performed by Dataplex.
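A view of this kind is just a column-restricted projection over the ingested table. The helper below sketches the DDL; the dataset, view, and table names passed in are hypothetical placeholders, not the actual objects:

```python
def view_ddl(env: str, dataset: str, view: str,
             source_table: str, columns: list[str]) -> str:
    """CREATE OR REPLACE VIEW exposing only the columns needed for DQ checks."""
    project = f"prj-data-dq-selfservice-{env}"
    cols = ", ".join(columns)
    return (
        f"CREATE OR REPLACE VIEW `{project}.{dataset}.{view}` AS\n"
        f"SELECT {cols} FROM `{project}.{dataset}.{source_table}`"
    )
```

Restricting the views to the necessary columns keeps the Dataplex scans cheap and decouples rule definitions from the full source schema.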
3. Data Quality Execution in Dataplex
Once the views are created, the data quality rules are executed using GCP Dataplex Service and the validation results are stored in the following BigQuery table:
4. Data Model Population
A routine is executed to populate the DM Fact tables:
Routine prj-data-dq-selfservice-[env].DM.RT_DPtoDMmapping_Datespecific populates the following tables:
- prj-data-dq-selfservice-[env].DM.DIM_DATE
- prj-data-dq-selfservice-[env].DM.DIM_quality_rule
- prj-data-dq-selfservice-[env].DM.FACT_data_quality
- prj-data-dq-selfservice-[env].DM.FACT_failed_records
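The fully-qualified IDs of the four target tables can be derived from the environment. A small sketch (the function name is illustrative; the table names are the ones listed above):

```python
# Tables populated by RT_DPtoDMmapping_Datespecific, per the list above.
ROUTINE_TARGETS = [
    "DIM_DATE",
    "DIM_quality_rule",
    "FACT_data_quality",
    "FACT_failed_records",
]

def routine_target_ids(env: str) -> list[str]:
    """Fully-qualified BigQuery table IDs for the given environment."""
    project = f"prj-data-dq-selfservice-{env}"
    return [f"{project}.DM.{table}" for table in ROUTINE_TARGETS]
```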
5. Failed Records Handling & Export
A final Talend job - PL_DQ_BQ_to_Gshet_Selfservice - handles failed records:
[env] is one of the following: dev, test, ppd, prod
6. Visualization in Qlik Sense
The processed and validated data is available for visualization and analysis in Qlik Sense.

The following processes are scheduled on a weekly basis.
1. Talend Ingestion Jobs
The Ingestion Jobs are scheduled to run within Talend every week, at the beginning of the process.
2. Data Quality Scans
The scans were initially run "On Demand" for testing purposes and are now "Scheduled" to run every week within Dataplex.
3. Routines Execution
The three routines are triggered by weekly scheduled queries within BigQuery.
4. Talend Report Job
The Talend Job PL_DQ_BQ_to_Gshet_Selfservice is scheduled to run within Talend every week, at the end of the process.
5. QlikSense Refresh
The QlikSense refresh schedule is set by the Visualization Engineer within QlikSense.
Below is a table summarizing the processes, their frequency, duration window, and average duration.
| Process | Frequency | Duration Window | Average Duration (min) |
| --- | --- | --- | --- |
| Talend Ingestion Jobs | Every Sunday | 21:00 CET | |
| Dataplex Data Quality Scans | Every Monday | 4:00 - 5:00 CET | 1 |
| BigQuery Routine insert_DIM_Domain | Every Monday | 5:00 - 5:05 CET | 1 |
| BigQuery Routine insert_DIM_kpi_dimension | Every Monday | 5:05 - 5:10 CET | 1 |
| BigQuery Routine RT_DPtoDMmapping_Datespecific | Every Monday | 5:10 - 5:15 CET | 1 |
| Talend Report Job PL_DQ_BQ_to_Gshet_Selfservice | Every Monday | 6:00 - 7:00 CET | 5 |
| QlikSense Refresh | Every Monday | 8:00 CET | 1 |
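The weekly cadence in the table above can be expressed as cron-style expressions for reference. These are illustrative only: the actual scheduling lives in Talend, Dataplex, BigQuery scheduled queries, and QlikSense, not in a single crontab, and all times are CET.

```python
# Illustrative cron expressions (minute hour day-of-month month day-of-week),
# restating the schedule table; day-of-week 0 = Sunday, 1 = Monday.
SCHEDULE = {
    "talend_ingestion_jobs": "0 21 * * 0",          # Sunday 21:00 CET
    "dataplex_dq_scans": "0 4 * * 1",               # Monday 04:00 CET
    "rt_insert_DIM_Domain": "0 5 * * 1",            # Monday 05:00 CET
    "rt_insert_DIM_kpi_dimension": "5 5 * * 1",     # Monday 05:05 CET
    "rt_RT_DPtoDMmapping_Datespecific": "10 5 * * 1",  # Monday 05:10 CET
    "talend_report_job": "0 6 * * 1",               # Monday 06:00 CET
}
```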
GCP Monitoring tools:
No Identified Bugs.