The data source is the prj-data-dm-industrial project, which reads the view prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio to obtain the DIO information.
Talend project: DATA_OCEAN_DOMAIN_INDUSTRIAL
Job: F001_GSheet_site_to_BQ_DM_DIO (DIO)


Dataflow
prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio
→ prj-data-dm-industrial-test.STG.STG_FIL_0000_0000_F001_F_M_dio
→ prj-data-dm-industrial-test.ODS.ODS_FIL_0000_F001_F_M_dio
→ prj-data-industrial-dash-test.ODS_DataOcean.V_ODS_dio
→ prj-data-industrial-dash-test.DM.FACT_dio
→ prj-data-industrial-dash-test.DPL.V_FACT_dio
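One hop in this flow, the STG-to-ODS load, might look like the sketch below. This is an assumption for illustration, not the actual Talend-generated SQL; the column names (beyond meta_run_id, which appears in the log tables) are hypothetical.

```sql
-- Hypothetical sketch of the STG -> ODS hop; column list is assumed.
INSERT INTO `prj-data-dm-industrial-test.ODS.ODS_FIL_0000_F001_F_M_dio`
SELECT
  stg.*,                                 -- business columns carried over unchanged
  CURRENT_TIMESTAMP() AS meta_load_date  -- assumed audit column
FROM `prj-data-dm-industrial-test.STG.STG_FIL_0000_0000_F001_F_M_dio` AS stg
-- keep only runs not yet present in ODS, so re-runs stay idempotent
WHERE stg.meta_run_id NOT IN (
  SELECT DISTINCT meta_run_id
  FROM `prj-data-dm-industrial-test.ODS.ODS_FIL_0000_F001_F_M_dio`
);
```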
Access to both the prj-data-industrial-dash-test and prj-data-dm-industrial-test projects is required.
BigQuery source:
Project = prj-data-dm-industrial-test
BQ Dataset = EDC
BQ View = V_EDC_monthly_inventory_performance_data_finance_dio
The job is scheduled by the plans below on WS_DATA_OCEAN_DOMAIN_INDUSTRIAL.
Three tables are created from a Google Sheet (GSheet).
A Scheduled Query is then executed in BigQuery; it pivots the information from those tables into a tabular view, which becomes the input source consumed by Talend.
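The pivot step could be approximated as follows. The source table (gsheet_dio_raw) and the column and metric names are illustrative assumptions; only the target view name comes from this document.

```sql
-- Hypothetical pivot: turn one row per (site, month, metric) into a tabular view.
-- Source table, columns, and metric names are assumptions for illustration.
CREATE OR REPLACE VIEW
  `prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio` AS
SELECT *
FROM (
  SELECT site, report_month, metric_name, metric_value
  FROM `prj-data-dm-industrial-test.EDC.gsheet_dio_raw`
)
PIVOT (
  MAX(metric_value)
  FOR metric_name IN ('inventory_value', 'cogs', 'dio')  -- assumed metrics
);
```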
For more details, see the documentation below:
Scheduled Query Explanation
Overview
This SQL script automates the validation and materialization of multiple datasets in BigQuery. The script processes three datasets separately; each dataset follows the same validation and materialization steps, ensuring reliability and consistent error handling.
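Since the script itself is not shown here, the per-dataset pattern is sketched below under assumptions: validate that the source has rows, then materialize it. All table names in this sketch are hypothetical.

```sql
-- Hypothetical per-dataset step using BigQuery scripting:
-- fail fast if the source is empty, then materialize the validated data.
DECLARE row_count INT64;

SET row_count = (
  SELECT COUNT(*)
  FROM `prj-data-dm-industrial-test.EDC.gsheet_dio_raw`  -- assumed source
);

IF row_count = 0 THEN
  RAISE USING MESSAGE = 'Validation failed: source table is empty';
END IF;

CREATE OR REPLACE TABLE
  `prj-data-dm-industrial-test.EDC.dio_validated` AS     -- assumed target
SELECT *
FROM `prj-data-dm-industrial-test.EDC.gsheet_dio_raw`;
```

The same block would be repeated (or parameterized) once per dataset, which matches the document's note that each dataset follows the same process.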
Check the loading records
SELECT
  job.job_name,
  job.meta_start_date,
  job.meta_execution_id,
  logs.meta_run_id,
  logs.meta_source_system,
  logs.meta_step,
  logs.meta_status,
  logs.meta_num_lines,
  logs.meta_error_lines
FROM STG.log_tables AS logs
JOIN STG.run_jobs AS job
  ON logs.meta_run_id = job.meta_run_id
WHERE job.job_name LIKE '%DIO%'
ORDER BY job.meta_start_date DESC;