...

The scans are initially set to "On Demand" for testing purposes, and then to "Scheduled" to run every week within Dataplex.

3. Routines Execution -  Data Model Dimension Tables

The two routines are triggered using scheduled queries on a weekly basis within BigQuery:

  • prj-data-dq-selfservice-[env].DM.insert_DIM_Domain
  • prj-data-dq-selfservice-[env].DM.insert_DIM_kpi_dimension

[env] is one of the following: dev, test, ppd, prod
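The naming convention above can be captured in a small helper. This is an illustrative sketch only: the project ID pattern, dataset, and routine names are taken from this page, but the helper itself is hypothetical.

```python
# Illustrative helper for the naming convention on this page:
# projects follow prj-data-dq-selfservice-[env], and the dimension
# routines live in the DM dataset.

ENVS = ["dev", "test", "ppd", "prod"]
ROUTINES = ["insert_DIM_Domain", "insert_DIM_kpi_dimension"]

def routine_ids(env: str) -> list[str]:
    """Return the fully-qualified routine IDs for one environment."""
    if env not in ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    project = f"prj-data-dq-selfservice-{env}"
    return [f"{project}.DM.{name}" for name in ROUTINES]
```

For example, `routine_ids("dev")` yields the two dev routine IDs listed above.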

4. Routines Execution - Data Model Fact Tables

The routine is triggered using a scheduled query on a weekly basis within BigQuery:

  • prj-data-dq-selfservice-[env].DM.RT_DPtoDMmapping_Datespecific

[env] is one of the following: dev, test, ppd, prod

5. Talend Report Job

The Talend Job PL_DQ_BQ_to_Gshet_Selfservice is scheduled to run within Talend every week, at the end of the process.

6. QlikSense Refresh

The QlikSense refresh schedule is set by the Visualization Engineer within QlikSense.

Process Scheduling Details

Below is a table that summarizes the processes, their frequency, duration window, and average duration.

...

To maintain the reliability of the data quality pipeline, a structured error handling procedure is in place for each scheduled process. The following section outlines the recommended actions in case of failure for each of the weekly scheduled steps.

In the event of a failure, it is crucial not only to resolve and rerun the failed step, but also to re-execute all subsequent steps in the pipeline, as they may have run on incomplete or outdated data.
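The rerun rule can be sketched as a simple ordered list: when a step fails, that step and everything after it must run again. The step names below mirror the numbered sections of this procedure; the code itself is a minimal illustration, not an automation.

```python
# Minimal sketch of the rerun rule: a failed step and all downstream
# steps must be re-executed, in pipeline order.

PIPELINE = [
    "Talend Ingestion Jobs",
    "Data Quality Scans (Dataplex)",
    "Routines - Dimension Tables",
    "Routines - Fact Tables",
    "Talend Report Job",
    "QlikSense Refresh",
]

def steps_to_rerun(failed_step: str) -> list[str]:
    """Return the failed step plus every downstream step, in order."""
    if failed_step not in PIPELINE:
        raise ValueError(f"unknown step: {failed_step!r}")
    return PIPELINE[PIPELINE.index(failed_step):]
```

So a failure in the Talend Report Job requires rerunning only that job and the QlikSense Refresh, while a failure in the ingestion jobs requires rerunning the whole pipeline.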

For a full overview of how the data and processes flow together, please refer to the Architecture & Data Flow Diagram.

1. Talend Ingestion Jobs

What to check:

  • Verify the Talend execution logs to identify the root cause.
  • Confirm SAP source system connectivity.
  • Check for schema changes in source systems that may have caused mapping errors.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Rerun the failed Talend job manually after resolving the issue.
  • Re-execute all downstream processes: Data Quality Scans, Routines, Report Job, and QlikSense Refresh.
  • Inform the Data Engineer in case further support is needed.

2. Data Quality Scans (Dataplex)

What to check:

  • Access Dataplex logs to locate the failed rule or asset.
  • Ensure that the source views in BigQuery are available and not empty.
  • Confirm rule syntax and metadata configurations.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Re-execute the failed scan via the Dataplex UI or using a scheduled query.
  • Re-run subsequent routines and the Talend Report Job to reflect updated quality results.
  • If multiple rules fail, check if a shared dependency is broken.
  • Contact the Data Architect for review if the failure is rule-related.
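The "shared dependency" check above can be done by grouping the failed rules by the source view they scan: if several failures point at the same view, that view is the first thing to inspect. The rule and view names in this sketch are hypothetical examples, not actual scan configuration.

```python
# Hedged sketch: group simultaneously failing Dataplex rules by the
# source view they depend on, to spot a single broken upstream object.
# Rule and view names below are made-up examples.
from collections import defaultdict

def group_by_source(failed_rules: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each source view to the failed rules that depend on it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for rule, source_view in failed_rules:
        groups[source_view].append(rule)
    return dict(groups)

failures = [
    ("null_check_kpi_id", "DM.VW_kpi"),
    ("range_check_kpi_value", "DM.VW_kpi"),
    ("uniqueness_domain_id", "DM.VW_domain"),
]
# Two of the three failures share DM.VW_kpi, so that view would be
# the first candidate for a broken shared dependency.
```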

3. Routines Execution (BigQuery Scheduled Queries) - Data Model Dimension Tables

What to check:

  • Review scheduled query logs in BigQuery for error messages.
  • Validate that the input tables contain data for the current cycle.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Rerun the failed query manually.
  • Re-execute the Talend Report Job and QlikSense Refresh to align downstream outputs.
  • Fix any reference issues or update logic if the schema has changed.
  • Escalate to the Data Engineer if the issue persists.
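For the manual rerun, a BigQuery stored procedure is invoked with a CALL statement, which can be pasted into the BigQuery console or submitted through a client library. The sketch below only builds the statement string, assuming the DM routines are stored procedures; actual execution requires a BigQuery client and the appropriate permissions.

```python
# Sketch of a manual rerun: build the CALL statement for one DM
# routine in one environment. Execution itself (via the BigQuery
# console or a client library) is out of scope here.

def build_call_statement(env: str, routine: str) -> str:
    """Build the CALL statement for one DM routine in one environment."""
    project = f"prj-data-dq-selfservice-{env}"
    return f"CALL `{project}.DM.{routine}`();"
```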

4. Routines Execution (BigQuery Scheduled Queries) - Data Model Fact Tables

What to check:

  • Review scheduled query logs in BigQuery for error messages.
  • Validate that the input tables contain data for the current cycle.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Rerun the failed query manually.
  • Re-execute the Talend Report Job and QlikSense Refresh to align downstream outputs.
  • Fix any reference issues or update logic if the schema has changed.
  • Escalate to the Data Engineer if the issue persists.

5. Talend Report Job (PL_DQ_BQ_to_Gshet_Selfservice)

What to check:

  • Review Talend logs to determine if the issue was during query execution, file generation, or upload to Google Drive.
  • Confirm the existence and access permissions of the target Google Drive folder.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Manually generate and upload the failed records file if needed.
  • Ensure the DM.FACT_failed_records table is updated with the correct file URL; update it manually if the automation fails.
  • Manually trigger the QlikSense Refresh afterward.
  • Coordinate with the Talend support team.
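The manual fallback of writing the file URL into DM.FACT_failed_records could look like the statement built below. This is purely illustrative: the column names (file_url, run_date) are hypothetical and the actual table schema should be checked before running anything like this.

```python
# Purely illustrative sketch of the manual fallback: an UPDATE that
# writes the report file URL into DM.FACT_failed_records. The column
# names file_url and run_date are assumptions, not the real schema.

def build_url_update(env: str, file_url: str, run_date: str) -> str:
    """Build an UPDATE statement setting the file URL for one run."""
    project = f"prj-data-dq-selfservice-{env}"
    return (
        f"UPDATE `{project}.DM.FACT_failed_records` "
        f"SET file_url = '{file_url}' "
        f"WHERE run_date = '{run_date}';"
    )
```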

6. QlikSense Refresh

What to check:

  • Check QlikSense dashboard status and refresh logs.
  • Ensure the data sources (BigQuery tables) are accessible.
  • If a prior process failed, ensure all upstream steps have been rerun.

Next steps:

  • Notify the Visualization Engineer to manually trigger the refresh.
  • In case of missing data, trace the issue upstream (Talend, BigQuery, or Dataplex).

...

As of now, four deployments have been completed. Detailed documentation related to these deployments is available in the following Google Drive folders:

Google Drive Live Link:
https://drive.google.com/drive/folders/1uAQrdNMqfcu7uRSmkayrck0yAOJOesdO?usp=drive_link
https://drive.google.com/drive/folders/1IVOSue_RIYZkk6oKsBS8xHIoHP-i_oRQ

Known Bugs

Currently, no bugs have been identified in the system.

Roadmap

FSD

TSD