Description
The source data is the gsheet from user, we load to big query (BQ) on project prj-data-dm-industrial-[env].
Talend
Source to ODS
J006_oee_gsheet_to_GCS
F006_Gsheet_OEE_TO_BQ
- Read data from
- gsheet l_VAR_GOOGLE_SHEET_oee_id
- tabname l_VAR_GOOGLE_SHEET_oee_tab
- start row l_VAR_GOOGLE_SHEET_oee_row_start = 2
- end row l_VAR_GOOGLE_SHEET_oee_row_end = 0 means all rows
- Save data to bucket cs-ew1-prj-data-dm-industrial-dev-staging/Gsheet_OEE
- Load data to prj-data-dm-industrial-dev
- STG_FIL_0000_0000_F001_F_H_oee
- ODS_FIL_0000_F001_F_H_oee
ODS to DM
F016_FACT_oee_main (J016_FACT_oee_main)
- Assigned extraction time and set the loading getting from last load from ODS_FIL_0000_F001_F_H_oee max meta_business_date
- Select data from ODS only the main fields regarding requirement
- Select delete data on DM.FACT_oee_main where the key same as the selected on ODS
- Load the data from point 2 to DM.FACT_oee_main
- Write the log
F016_FACT_oee_losses (J016_FACT_oee_losses)
- Assigned extraction time and set the loading getting from last load from ODS_FIL_0000_F001_F_H_oee max meta_business_date
- Select data from ODS only the losses fields on percentage and unpivot the data regarding requirement
- Select data from ODS only the losses fields on volume and unpivot the data
- Select delete data on DM.FACT_oee_main where the key same as the selected on ODS
- Load the data from point 2 join with point 3 to DM.FACT_oee_losses
- Write the log
From
To
Access rights
Access to gsheet
Source
Format
- columnar format (Gsheet)
Destination
Location
- GCP = prj-data-dm-industrial-[env] → prj-data-industrial-dash-[env]
- STG table
- STG_FIL_0000_0000_F001_F_H_oee
- ODS table
- ODS_FIL_0000_F001_F_H_oee
- DM Table names
- FACT_oee_main
- FACT_oee_losses
- DPL View names
- V_FACT_oee_wip
- V_FACT_oee_main
- V_FACT_oee_losses
Format
- columnar format
Sizing
1,405 rows
Assessment
How to validate that the generated output is valid:
Data same as gsheet
Loading
1.1 Incremental Load
N/a
1.2 Full load
Just trigger the job GSheet-Refresher
1.3. Reloading data
Just trigger the job GSheet-Refresher
1.4 Plan to schedule
Every 30 minutes
1.5 Timing
The average time expected for loading: 1.5 minutes
Criticality
Low?
Logging
- Check on the GCP BQ project at "Scheduled queries" the name of the schedule is "GSheet-Refresher"







