...
To maintain the reliability of the data quality pipeline, a structured error handling procedure is in place for each scheduled process. The following section outlines the recommended actions in case of failure for each of the weekly scheduled steps.
In the event of a failure, it is crucial not only to resolve and rerun the failed step, but also to re-execute all subsequent steps in the pipeline, as they may have run on incomplete or outdated data.
For a full overview of how the data and processes flow together, please refer to the Architecture & Data Flow Diagram.
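The rerun rule above can be sketched as a small helper. This is a minimal Python sketch, not part of the actual pipeline tooling. It assumes the steps execute strictly in the order they are numbered in this section; the step labels and the function name steps_to_rerun are illustrative.

```python
# Pipeline steps in the order they are listed in this section.
# Assumption: the numbered order is also the execution order.
PIPELINE_STEPS = [
    "Talend Ingestion Jobs",
    "Data Quality Scans",
    "Routines - Dimension Tables",
    "Routines - Fact Tables",
    "Talend Report Job",
    "QlikSense Refresh",
]


def steps_to_rerun(failed_step: str) -> list[str]:
    """Return the failed step followed by every step that comes after it."""
    if failed_step not in PIPELINE_STEPS:
        raise ValueError(f"Unknown pipeline step: {failed_step!r}")
    start = PIPELINE_STEPS.index(failed_step)
    return PIPELINE_STEPS[start:]


if __name__ == "__main__":
    # Example: a failure in the fact table routines.
    for step in steps_to_rerun("Routines - Fact Tables"):
        print(step)
```

Keeping the order in a single list means the rerun plan always matches the documented sequence, whichever step failed.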
1. Talend Ingestion Jobs
What to check:
- Verify the Talend execution logs to identify the root cause.
- Confirm SAP source system connectivity.
- Check for schema changes in source systems that may have caused mapping errors.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Rerun the failed Talend job manually after resolving the issue.
- Re-execute all downstream processes: Data Quality Scans, Routines, Report Job, and QlikSense Refresh.
- Inform the Data Engineer in case further support is needed.
2. Data Quality Scans (Dataplex)
What to check:
- Access Dataplex logs to locate the failed rule or asset.
- Ensure that the source views in BigQuery are available and not empty.
- Confirm rule syntax and metadata configurations.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Re-execute the failed scan via the Dataplex UI or using a scheduled query.
- Re-run subsequent routines and the Talend Report Job to reflect updated quality results.
- If multiple rules fail, check if a shared dependency is broken.
- Contact the Data Architect for review if the failure is rule-related.
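The check that source views are available and not empty can be sketched as follows. This is a hedged Python sketch under stated assumptions: count_rows is a caller-supplied function (hypothetical here) that runs a row count, such as SELECT COUNT(*), against one view and returns the result; the view names in the example are made up.

```python
from typing import Callable, Iterable


def find_problem_views(
    view_names: Iterable[str],
    count_rows: Callable[[str], int],
) -> list[str]:
    """Return the views that are empty or cannot be queried.

    count_rows is a hypothetical, caller-supplied function that returns
    the number of rows in a single view.
    """
    problems = []
    for view in view_names:
        try:
            if count_rows(view) == 0:
                problems.append(view)  # view exists but has no rows
        except Exception:
            # A view that cannot be queried is treated as unavailable.
            problems.append(view)
    return problems


if __name__ == "__main__":
    # Stand-in counts for illustration only; these view names are invented.
    fake_counts = {"dq.v_customers": 120, "dq.v_orders": 0}
    print(find_problem_views(fake_counts, fake_counts.__getitem__))
```

Passing the row counter in as a function keeps the check independent of any particular BigQuery client setup, so it can be tried out with stand-in data first.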
3. Routines Execution (BigQuery Scheduled Queries) - Data Model Dimension Tables
What to check:
- Review scheduled query logs in BigQuery for error messages.
- Validate that the input tables contain data for the current cycle.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Rerun the failed query manually.
- Re-execute the Talend Report Job and QlikSense Refresh to align downstream outputs.
- Fix any reference issues or update logic if the schema has changed.
- Escalate to the Data Engineer if the issue persists.
4. Routines Execution (BigQuery Scheduled Queries) - Data Model Fact Tables
What to check:
- Review scheduled query logs in BigQuery for error messages.
- Validate that the input tables contain data for the current cycle.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Rerun the failed query manually.
- Re-execute the Talend Report Job and QlikSense Refresh to align downstream outputs.
- Fix any reference issues or update logic if the schema has changed.
- Escalate to the Data Engineer if the issue persists.
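Validating that input tables contain data for the current cycle could look like the sketch below. It rests on assumptions not stated in this document: the weekly cycle is taken to start on Monday at 00:00, and the latest load timestamp of each input table is obtained separately (for example from a load date column). The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def current_cycle_start(now: datetime) -> datetime:
    """Return Monday 00:00 of the week containing now (assumed cycle start)."""
    monday = now - timedelta(days=now.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


def has_current_cycle_data(latest_load: datetime, now: datetime) -> bool:
    """True if the most recent load falls inside the current weekly cycle."""
    return latest_load >= current_cycle_start(now)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical latest load time, two days before now.
    latest = now - timedelta(days=2)
    print(has_current_cycle_data(latest, now))
```

Both timestamps should carry the same time zone information, since Python does not compare naive and time zone aware datetimes.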
5. Talend Report Job (PL_DQ_BQ_to_Gshet_Selfservice)
What to check:
- Review Talend logs to determine if the issue was during query execution, file generation, or upload to Google Drive.
- Confirm the existence and access permissions of the target Google Drive folder.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Manually generate and upload the failed records file if needed.
- If the automated update fails, manually update the DM.FACT_failed_records table with the correct file URL.
- Manually trigger the QlikSense Refresh afterward.
- Coordinate with the Talend support team.
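The manual update of DM.FACT_failed_records can be prepared with a parameterized statement rather than by pasting the URL into the SQL text. The sketch below only builds the statement and its parameters; it does not run anything. The column names file_url and run_date are hypothetical placeholders, since the real schema is not described here, and the accepted host names are an assumption about what a valid file URL looks like.

```python
from datetime import date
from urllib.parse import urlparse

# Assumption: file URLs point to Google Drive or Google Docs/Sheets.
ALLOWED_HOSTS = {"drive.google.com", "docs.google.com"}


def build_file_url_update(file_url: str, run_date: date) -> tuple[str, dict]:
    """Build a parameterized UPDATE for DM.FACT_failed_records.

    file_url and run_date are hypothetical column names; replace them
    with the real schema before use.
    """
    parsed = urlparse(file_url)
    if parsed.scheme != "https" or parsed.netloc not in ALLOWED_HOSTS:
        raise ValueError(f"Not an expected Google file URL: {file_url!r}")
    # Named parameters (@name) keep the URL out of the SQL text itself.
    sql = (
        "UPDATE `DM.FACT_failed_records` "
        "SET file_url = @file_url "
        "WHERE run_date = @run_date"
    )
    params = {"file_url": file_url, "run_date": run_date.isoformat()}
    return sql, params


if __name__ == "__main__":
    # Illustrative values only.
    sql, params = build_file_url_update(
        "https://drive.google.com/file/d/EXAMPLE_ID/view", date(2024, 5, 13)
    )
    print(sql)
    print(params)
```

Validating the URL before building the statement helps avoid writing a mistyped or unrelated link into the table during a manual fix.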
6. QlikSense Refresh
What to check:
- Check QlikSense dashboard status and refresh logs.
- Ensure the data sources (BigQuery tables) are accessible.
- If a prior process failed, ensure all upstream steps have been rerun.
Next steps:
- Notify the Visualization Engineer to manually trigger the refresh.
- In case of missing data, trace the issue upstream (Talend, BigQuery, or Dataplex).
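Tracing an issue upstream amounts to walking the earlier steps in order and stopping at the first one that did not succeed. The following is a minimal Python sketch of that idea, assuming the steps run in the order they are numbered in this section and that a status per step has been collected by hand or from logs. The status value "success" and the function name are illustrative; a step with no recorded status is treated as needing a check.

```python
from typing import Optional

# Steps that run before the QlikSense Refresh, in the order listed in
# this section (assumed to be the execution order).
UPSTREAM_ORDER = [
    "Talend Ingestion Jobs",
    "Data Quality Scans",
    "Routines - Dimension Tables",
    "Routines - Fact Tables",
    "Talend Report Job",
]


def first_unhealthy_step(statuses: dict[str, str]) -> Optional[str]:
    """Return the earliest upstream step whose status is not "success".

    Steps missing from statuses count as unhealthy, because an unknown
    state still needs to be checked. Returns None if all steps succeeded.
    """
    for step in UPSTREAM_ORDER:
        if statuses.get(step) != "success":
            return step
    return None


if __name__ == "__main__":
    # Illustrative statuses only.
    observed = {step: "success" for step in UPSTREAM_ORDER}
    observed["Data Quality Scans"] = "failed"
    print(first_unhealthy_step(observed))
```

Starting the investigation at the earliest unhealthy step fits the rerun rule in this section, since every later step may have run on incomplete data.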
...
As of now, four deployments have been completed. Detailed documentation related to these deployments is available in the following Google Drive folder:
https://drive.google.com/drive/folders/1IVOSue_RIYZkk6oKsBS8xHIoHP-i_oRQ
Known Bugs
Currently, no bugs have been identified in the system.