...
The source data is the gsheet from user, we load to big query (BQ) on project prj-data-dm-industrial-[env].
Talend
Source to ODS (DATA_OCEAN_DOMAIN_INDUSTRIAL)
J006_oee_gsheet_to_GCS
F006_Gsheet_OEE_TO_BQ
...
- Assigned extraction time and set the loading getting from last load from ODS_FIL_0000_F001_F_H_oee max meta_business_date
- Select data from ODS only the main fields regarding requirement (the rest of the measure except the column starting wit number 1 - 6 are kept in losses table)
- Delete Select delete data on DM.FACT_oee_main where the key same as the selected on ODSbefore load data from ODS because there is a case that user enter date 1/1/2024 and change to 01/01/2024, which cause duplicate records
- Load the data from point 2 to DM.FACT_oee_main
- Write the log
...
- Assigned extraction time and set the loading getting from last load from ODS_FIL_0000_F001_F_H_oee max meta_business_date
- Select data from ODS only the losses fields on percentage and unpivot the data regarding requirement
- Select data from ODS only the losses fields on volume and unpivot the data
- Select delete Delete data on DM.FACT_oee_main where the key same as the selected on ODSlosses before load data from ODS because there is a case that user enter date 1/1/2024 and change to 01/01/2024, which cause duplicate records
- Load the data from point 2 join with point 3 to DM.FACT_oee_losses
- Write the log
...
- DPL View names
- V_FACT_oee_mainlosses
- V_FACT_oee_lossesmain
Format
- columnar format
Sizing
...
1.2 Full load
PL_INDUS_OEE
Source to ODS : Always full load with column B to Z from line 4ODS
Variable to control:
l_VAR_GOOGLE_SHEET_oee_id = refer to Source session above
l_VAR_GOOGLE_SHEET_oee_tab = OEE (WIP)
l_VAR_GOOGLE_SHEET_oee_row_start = 4
l_VAR_GOOGLE_SHEET_oee_row_end = 0 means all the rows
ODS to DM : l_VAR_oee_main_dm_reload_condition = incremental or blank will load max meta_business_date from ODS
1.3. Reloading data
Source to ODS : just run the job again
ODS to DM : control by parameter l_VAR_oee_main_dm_reload_condition. This will impact both main and losses tables. To reload all ODS with latest data l_VAR_oee_main_dm_reload_condition =
QUALIFY ROW_NUMBER () OVER ( PARTITION BYconcat( date , gbu , site , unit , cluster) ORDER BY meta_ods_insert_date DESC ) = 1
...
