...
The data source getting from prj-data-dm-industrial, which get from prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio
...
- F001_GSheet_site_to_BQ_DM_DIO
(DIO)
- Define variable and Generate meta_run_id
- Call reference job to load from EDC.V_EDC_monthly_inventory_performance_data_finance_dio` and save the output file to bucket cs-ew1-prj-data-dm-industrial-test-staging/dio
- Load from Bucket to STG and ODS
- Update log
From ODS to DM
- Load context from the RDS to be used in talend
- Load Move data from ODS to DM
- Update log
From DM to Operational Dashboard
...
Access rights
It is required to access prj-data-industrial-dash-test project and prj-data-dm-industrial-test project.
...
1.1 Full load
- Dio Source to ODS : TASKPL_DM_IND_DAILY_Dio
1.2. Reloading data
Dio
- Just run the TASKPL_DM_IND_DAILY_Dio.
1.3 Plan to schedule
It is scheduled by plans below on WS_DATA_OCEAN_DOMAIN_INDUSTRIAL
...
- Case 5 minutes from source to ODS (full)
1.5 Pre Requisites
3 tables are created from a Gsheet.
A Scheduled Query is then executed in BigQuery, which basically pivots various information from the tables and transforms them into a tabular view. From there, the input source is created to be consumed in Talend.
For more details, access the documentation below:
Scheduled Query Explanation
Overview
This SQL script automates the validation and materialization of multiple datasets in BigQuery . It follows a structured approach to:
- Validate the source data – Ensuring the dataset is accessible and contains valid data.
- Drop and recreate materialized tables – Removing outdated tables and creating new ones with additional metadata.
- Enhance traceability – Adding metadata fields such as row identifiers, timestamps, and Google Sheets references.
The script processes three datasets separately:
- Wave Project Star (wave_project_star)
- Wave Activity Star (wave_activity_star)
- Monthly Inventory Performance - Data Finance (monthly_inventory_performance_data_finance)
Each dataset follows the same process, ensuring reliability and error handling.
Criticality
- Low?
Logging
Check the loading records
select job.job_name , job.meta_start_date , job.meta_execution_id , logs.meta_run_id , logs.meta_source_system , logs.meta_step , logs.meta_status , logs.meta_num_lines , logs.meta_error_lines from STG.log_tables logs join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_idin(SELECTmeta_run_idFROMSTG.run_jobsorderbymeta_start_datedesclimit1000)
and job_namein('F001_GSheet_site_to_BQ_DM_DIO') like '%DIO%'
order by job.meta_start_date desc
...


