
Architecture:



Data Flow

wip

Sources Definition

Identify the different data sources and how to ingest either raw data or DQ metrics, depending on the source.

Identified Sources

SAP BW, SuccessFactors, BigQuery (Data Ocean)

ETL and Preparation

Talend is used to load and ingest tables into BigQuery, using its automation and orchestration capabilities to run the recurring jobs.

Source for DQ KPIs Monitoring Project

BigQuery Data Ocean is the sole source for quality checks once tables have been loaded from their original sources.

Data Residing in BigQuery

Data residing in BigQuery is brought into the prj-data-dq-selfservice-dev project through the necessary views so that it can be scanned and checked by Dataplex. The generated metrics are also stored in BigQuery (Dataplex_quality).

Data Quality Metrics Ingestion

Populate the BigQuery fact table with the DQ metrics obtained from the previously mentioned sources by mapping the resulting table to the developed data model (DM), using a stored procedure.

Big Query Views

Create the views needed to answer business requirements. The views give business users a degree of flexibility and more control over the data they need to query as requirements change, moving toward self-service.
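
The mapping performed in the Data Quality Metrics Ingestion step can be pictured with a small Python sketch. This is only an illustration of the idea, not the stored procedure itself: the field names on both sides (`table_name`, `rule_name`, `rows_evaluated`, `rows_passed`, `pass_ratio`, `failed_rows`) and the way `quality_rule_key` is derived are assumptions made for the example and are not taken from the actual Dataplex output or data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanResultRow:
    """Hypothetical shape of one row in the scan results table."""
    table_name: str
    rule_name: str
    rows_evaluated: int
    rows_passed: int


@dataclass(frozen=True)
class FactDqMetric:
    """Hypothetical shape of one row in the DQ fact table."""
    quality_rule_key: str
    table_name: str
    pass_ratio: Optional[float]
    failed_rows: int


def to_fact(row: ScanResultRow) -> FactDqMetric:
    """Map one scan result row to one fact row."""
    # Assumption: the rule key is built from table and rule name.
    rule_key = f"{row.table_name}:{row.rule_name}"
    # Leave the ratio empty when nothing was evaluated, to avoid dividing by zero.
    if row.rows_evaluated > 0:
        ratio = row.rows_passed / row.rows_evaluated
    else:
        ratio = None
    return FactDqMetric(
        quality_rule_key=rule_key,
        table_name=row.table_name,
        pass_ratio=ratio,
        failed_rows=row.rows_evaluated - row.rows_passed,
    )
```

In practice the same mapping would be written in SQL inside the stored procedure; the sketch only makes the source-to-model relationship explicit.
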
Visualization

Qlik Sense connects to the BigQuery views to ingest data. It is used to visualize and present the DQ KPIs and failed records, and to add alerting capabilities, across the different business domains, KPIs, and business rules.

Google Drive

Used to store failed-records data. The URL of each sheet is then inserted into FACT_failed_records, associated with its quality_rule_key.
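
The link between a failed-records sheet and its rule could be built along these lines. This is a minimal sketch under stated assumptions: `quality_rule_key` and FACT_failed_records come from the description above, while the column name `failed_records_url`, the helper name, and the URL check are hypothetical and only serve the illustration.

```python
from urllib.parse import urlparse


def build_failed_records_row(quality_rule_key: str, sheet_url: str) -> dict:
    """Build one row for FACT_failed_records from a rule key and a sheet URL."""
    parsed = urlparse(sheet_url)
    # Reject values that are clearly not an HTTPS link before storing them.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"not a valid HTTPS URL: {sheet_url!r}")
    return {
        "quality_rule_key": quality_rule_key,
        # Hypothetical column name for the stored sheet link.
        "failed_records_url": sheet_url,
    }
```

Validating the URL before insertion keeps broken links out of the fact table, so that the dashboard does not surface a dead reference next to a failed rule.
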


Data Model


Data Mapping:



Procedures

 Procedure Name: 


Scheduling:

Time of Runs and Duration Window:


4:00 - 5:00 CET

5:00 - 6:00 CET

6:00 - 9:00 CET

9:00 CET
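
The run windows listed above can be expressed as data, which makes it easy to check which window a given moment falls into. The sketch below is illustrative only: it treats CET as a fixed UTC+1 offset with no daylight saving adjustment, treats each window's end as exclusive, and leaves out the single 9:00 CET entry because it is a point in time rather than a window. All names in the code are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: CET is a fixed UTC+1 offset (no daylight saving handling).
CET = timezone(timedelta(hours=1), "CET")

# (start, end) pairs copied from the schedule above; end is exclusive.
RUN_WINDOWS = [
    (time(4, 0), time(5, 0)),
    (time(5, 0), time(6, 0)),
    (time(6, 0), time(9, 0)),
]


def window_for(moment: datetime):
    """Return the (start, end) window containing a timezone-aware moment, or None."""
    local = moment.astimezone(CET).time()
    for start, end in RUN_WINDOWS:
        if start <= local < end:
            return (start, end)
    return None
```

A check like this could be used by monitoring to flag a job that is still running outside its expected window.
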


Monitoring

Error Handling

Known Bugs

No identified bugs.

Roadmap