Description

The data is sourced from the prj-data-dm-industrial project, which reads it from the view prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio in order to provide the DIO information.

Tools: Talend

Talend project DATA_OCEAN_DOMAIN_INDUSTRIAL

  • F001_GSheet_site_to_BQ_DM_DIO

From source to ODS 

Job:

  • F001_GSheet_site_to_BQ_DM_DIO

(DIO)

  1. Define variables and generate meta_run_id
  2. Call the reference job to load from EDC.V_EDC_monthly_inventory_performance_data_finance_dio and save the output file to the bucket cs-ew1-prj-data-dm-industrial-test-staging/dio
  3. Load from Bucket to STG and ODS
  4. Move data from ODS to DM
  5. Update log

From DM to Operational Dashboard

Dataflow

prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio
→ prj-data-dm-industrial-test.STG.STG_FIL_0000_0000_F001_F_M_dio
→ prj-data-dm-industrial-test.ODS.ODS_FIL_0000_F001_F_M_dio
→ prj-data-industrial-dash-test.ODS_DataOcean.V_ODS_dio
→ prj-data-industrial-dash-test.DM.FACT_dio
→ prj-data-industrial-dash-test.DPL.V_FACT_dio
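
As a quick cross-check of the dataflow above against the access rights it implies, the following is a minimal Python sketch that splits each fully qualified name into project, dataset and object, and lists the distinct projects involved. The names `TableRef`, `DATAFLOW`, `parse_table_ref` and `projects_needed` are illustrative only and are not part of the Talend job.

```python
from typing import List, NamedTuple


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


# Stages copied from the dataflow above, in order.
DATAFLOW = [
    "prj-data-dm-industrial-test.EDC.V_EDC_monthly_inventory_performance_data_finance_dio",
    "prj-data-dm-industrial-test.STG.STG_FIL_0000_0000_F001_F_M_dio",
    "prj-data-dm-industrial-test.ODS.ODS_FIL_0000_F001_F_M_dio",
    "prj-data-industrial-dash-test.ODS_DataOcean.V_ODS_dio",
    "prj-data-industrial-dash-test.DM.FACT_dio",
    "prj-data-industrial-dash-test.DPL.V_FACT_dio",
]


def parse_table_ref(name: str) -> TableRef:
    """Split 'project.dataset.table' into its three parts."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"Expected project.dataset.table, got: {name}")
    return TableRef(*parts)


def projects_needed(flow: List[str]) -> List[str]:
    """Return the distinct projects referenced by the flow, sorted by name."""
    return sorted({parse_table_ref(name).project for name in flow})


if __name__ == "__main__":
    for project in projects_needed(DATAFLOW):
        print(project)
```

Run as a script, this prints the two projects that appear in the flow, which are the same two projects named under Access rights.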

Access rights

Access to both the prj-data-industrial-dash-test project and the prj-data-dm-industrial-test project is required.

Source

BigQuery

Project = prj-data-dm-industrial-test

BQ Dataset = EDC

BQ View = V_EDC_monthly_inventory_performance_data_finance_dio

Destination

DataOcean

  • Bucket = cs-ew1-prj-data-dm-industrial-test-staging/
    • dio
    • FIL_IND_0000_0000_F001_20250314122333_0000_F_M_dio.csv
  • STG Table names = STG_FIL_0000_0000_F001_F_M_dio
  • ODS Table names = ODS_FIL_0000_F001_F_M_dio
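
The sample file name in the bucket listing above contains a 14-digit field that looks like a load timestamp. The following is a minimal Python sketch for extracting it, assuming (this is not confirmed by the page) that the field is formatted as YYYYMMDDhhmmss and that the remaining parts of the name are fixed. `FILE_PATTERN` and `extract_file_timestamp` are hypothetical names used for illustration.

```python
import re
from datetime import datetime

# Assumption: the 14-digit field is a YYYYMMDDhhmmss timestamp and every
# other part of the file name is constant for this flow.
FILE_PATTERN = re.compile(
    r"^FIL_IND_0000_0000_F001_(?P<ts>\d{14})_0000_F_M_dio\.csv$"
)


def extract_file_timestamp(file_name: str) -> datetime:
    """Return the timestamp embedded in a DIO staging file name."""
    match = FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Unexpected file name: {file_name}")
    return datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")


if __name__ == "__main__":
    sample = "FIL_IND_0000_0000_F001_20250314122333_0000_F_M_dio.csv"
    print(extract_file_timestamp(sample).isoformat())
```

A helper like this could be used to pick the most recent file in the dio folder when checking a reload.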

Product

  • GCP = prj-data-industrial-dash-[env]
  • DataOcean
    • V_FACT_dio

Format

columnar format

Sizing

  • STG_FIL_0000_0000_F001_F_M_fc: around 17,392 records

Loading

1.1 Full load

  • DIO source to ODS: TASK_IND_DAILY_Dio

1.2 Reloading data

Dio

  • To reload the data, run TASK_IND_DAILY_Dio again.

1.3 Plan to schedule

Loading is scheduled by the following plans on WS_DATA_OCEAN_DOMAIN_INDUSTRIAL:

  • TASK_IND_DAILY_Dio - every 30 minutes, from 06:00 AM until 06:00 PM UTC
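
To make the schedule above concrete, the following is a minimal Python sketch that lists the expected trigger times for one day. Whether the 06:00 PM run itself fires is not stated on this page, so it is exposed as the `include_end` parameter (an assumption). `daily_run_times` is a hypothetical helper and not part of the Talend plan.

```python
from datetime import date, datetime, timedelta, timezone
from typing import List


def daily_run_times(day: date, include_end: bool = True) -> List[datetime]:
    """List the 30-minute trigger times between 06:00 and 18:00 UTC."""
    start = datetime(day.year, day.month, day.day, 6, 0, tzinfo=timezone.utc)
    end = start.replace(hour=18)
    step = timedelta(minutes=30)

    runs = []
    current = start
    # Assumption: the final 18:00 trigger is included only if include_end is True.
    while current < end or (include_end and current == end):
        runs.append(current)
        current += step
    return runs


if __name__ == "__main__":
    for run in daily_run_times(date(2025, 3, 14)):
        print(run.strftime("%H:%M UTC"))
```

With the end time included this gives 25 runs per day, and 24 if the window closes before 06:00 PM.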

1.4 Timing

  • About 5 minutes from source to ODS (full load)

Criticality

  • Low?

Logging

Check the loading records with the following query:

SELECT job.job_name, job.meta_start_date, job.meta_execution_id,
       logs.meta_run_id, logs.meta_source_system, logs.meta_step,
       logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
FROM STG.log_tables AS logs
JOIN STG.run_jobs AS job ON logs.meta_run_id = job.meta_run_id
WHERE logs.meta_run_id IN (SELECT meta_run_id FROM STG.run_jobs ORDER BY meta_start_date DESC LIMIT 1000)
  AND job.job_name IN ('F001_GSheet_site_to_BQ_DM_DIO')
ORDER BY job.meta_start_date DESC
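
If the same check is needed for a different job or a different number of recent runs, the query above can be generated from a template. The following is a minimal Python sketch that only builds the SQL text and does not execute it. `LOG_QUERY_TEMPLATE` and `build_log_query` are hypothetical names, and the restriction of job names to letters, digits and underscores is an assumption based on the job name shown on this page.

```python
import re

LOG_QUERY_TEMPLATE = """\
SELECT job.job_name, job.meta_start_date, job.meta_execution_id,
       logs.meta_run_id, logs.meta_source_system, logs.meta_step,
       logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
FROM STG.log_tables AS logs
JOIN STG.run_jobs AS job ON logs.meta_run_id = job.meta_run_id
WHERE logs.meta_run_id IN (
    SELECT meta_run_id FROM STG.run_jobs
    ORDER BY meta_start_date DESC LIMIT {last_runs}
)
  AND job.job_name IN ('{job_name}')
ORDER BY job.meta_start_date DESC
"""


def build_log_query(job_name: str, last_runs: int = 1000) -> str:
    """Return the log-check query text for one job name."""
    # Assumption: job names contain only letters, digits and underscores.
    # Rejecting anything else keeps quotes out of the generated SQL.
    if not re.fullmatch(r"[A-Za-z0-9_]+", job_name):
        raise ValueError(f"Unexpected job name: {job_name}")
    if last_runs <= 0:
        raise ValueError("last_runs must be a positive integer")
    return LOG_QUERY_TEMPLATE.format(job_name=job_name, last_runs=last_runs)


if __name__ == "__main__":
    print(build_log_query("F001_GSheet_site_to_BQ_DM_DIO"))
```

The generated text can be pasted into the BigQuery console in the same way as the query above.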