Description

The data source is the Pure server (ftp.credit360.com), from which Talend loads the file via FTP (for more detail on Pure, see the linked page). The Talend project EHS_PURE loads this data first and stores it in the GCP project solvay-ind-conso-[env], in the dataset ehs_pure_[env]_mig, which already exists and is not part of the Operational Dashboard project.

Source FTP server = "ftp.credit360.com" / User = "solvay". Authentication uses a private key file, which is kept on the GCP remote engine in the folder \DATA\DEV\EHS\Pure\InOut\pure_sftp_ssh_key (controlled by the context variable l_CNX_EHS_PURE_SFTP_private_key).

...

Then the Operational Dashboard project exposes this data in the prj-data-industrial-dash-[dev] project by creating views.

GCP dataset = solvay-ind-conso-dev.DS_prj_data_industrial_dash

After that, a Talend job in project IND_DASHBOARD generates the FACT tables for TRII and PSE, with separate perspectives by site and by GBU.

OS = Occupational Safety incidents

PS = Process Safety

Tools: Talend

Detail job

  • J060_Helix_Case_to_GCS

  1. Connect to the source system API, reading the context from the flow job
  2. Set up the loop that fetches the data
    1. tSetGlobalVar: sets the maximum number of records to read per iteration ("limit") and initializes the variable "nb", which is checked to decide when to exit the loop (it starts at 0)
    2. tLoop: defines the exit condition; the loop ends when "nb" < 0
    3. tJava: sets the record offset so that each iteration fetches new records
  3. Get the data from the source, using "nb" as the start row number and "limit" as the maximum number of rows. The schema is read from the source (metadata)
  4. Generate the output file and save it to DATA\DEV\DATA_OCEAN_DOMAIN_DT\Tmp
  5. Update the offset: "nb" = "nb" + "limit"
  6. Set "nb" = -1 when ((Integer)globalMap.get("tReplace_1_NB_LINE")) <= 0, in order to exit the loop
  7. Upload all the files in the folder to the bucket (cs-ew1-prj-data-dm-dt-[dev]-staging)
  8. Delete all the files in the folder (the folder from point number 4)
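The loop in steps 2 to 6 can be sketched in plain Java as follows. This is only an illustration of the offset logic, not the actual Talend job: the class name, the `readAll` method and the `fetchPage` callback are hypothetical stand-ins for the source call, and the in-memory list in `main` replaces the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

final class OffsetPagingSketch {

    /**
     * Reads all records page by page.
     * fetchPage is a hypothetical stand-in for the source call: (offset, limit) -> records.
     */
    static <T> List<T> readAll(int limit, BiFunction<Integer, Integer, List<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        int nb = 0;                          // start offset, as set in tSetGlobalVar
        while (nb >= 0) {                    // tLoop condition: stop once nb < 0
            List<T> page = fetchPage.apply(nb, limit);
            if (page.isEmpty()) {
                nb = -1;                     // no rows returned: signal the exit
            } else {
                all.addAll(page);
                nb += limit;                 // move the offset to the next page
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // In-memory list used in place of the real source (assumption for the demo).
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            source.add(i);
        }
        List<Integer> result = readAll(10, (offset, pageSize) ->
                source.subList(Math.min(offset, source.size()),
                               Math.min(offset + pageSize, source.size())));
        System.out.println(result.size());
    }
}
```

The sketch folds steps 5 and 6 into one if/else; in the job itself the offset is incremented first and then overwritten with -1 when the row count is 0, which leads to the same exit.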

Flow job

  • F060_Helix_Case

    • Set up meta_run_id and the filename of the output file
    • Get the last load time from the table STG.incremental_load, controlled by the variable I_VAR_BQ_TABLE_INC_LOAD, and configure the incremental-load logic in tJava so that the date from incremental_load is applied to the create or change date field in SAP
    • Call the detail job, passing parameters such as the user/password and the query from point number 2, to run the incremental load and save the file to GCS
    • Call the standard job to upload the files from GCS to ODS
    • If the load is OK and the parameter l_VAR_heliux_[table_name]_reload = incremental, update the time in the table incremental_load. Any other value means the run is a reload
    • If everything is OK, update the log.
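The incremental-load decisions in the flow above can be sketched as two small Java helpers. This is a minimal sketch under assumptions: the class and method names are hypothetical, and the field names `created_at` and `changed_at` are placeholders for the source's real create and change date fields, which this page does not name.

```java
import java.time.Instant;

final class IncrementalLoadSketch {

    /**
     * Builds the filter for an incremental run from the last load time.
     * created_at / changed_at are hypothetical field names.
     */
    static String buildFilter(Instant lastLoad) {
        return "created_at >= '" + lastLoad + "' OR changed_at >= '" + lastLoad + "'";
    }

    /**
     * The timestamp in incremental_load is advanced only when the load
     * succeeded and the reload parameter is exactly "incremental".
     */
    static boolean shouldUpdateLastLoad(boolean loadOk, String reloadMode) {
        return loadOk && "incremental".equals(reloadMode);
    }

    public static void main(String[] args) {
        Instant lastLoad = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(buildFilter(lastLoad));
        System.out.println(shouldUpdateLastLoad(true, "incremental"));
        System.out.println(shouldUpdateLastLoad(true, "full"));
    }
}
```

Keeping the update rule in one place makes it harder for a reload run to move the incremental timestamp forward by accident.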

Access rights

Source

Format

  • JSON

Destination

Location

  • Bucket = cs-ew1-prj-data-dm-dt-[dev]-staging/xxx
  • DataOcean GCP = prj-data-dm-dt-[env]
  • STG Table name = prj-data-dm-dt-[env].STG.STG_HLX_0000_0000_F001_I_H_Cases
  • ODS Table name = prj-data-dm-dt-[env].ODS.ODS_HLX_0000_F001_I_H_Cases
  • DPL View name  = prj-data-dm-dt-[env].DPL.V_FACT_hlx_case

Format

  • columnar format

Sizing

Assessment

How to validate that the generated output is valid: 

Loading

1.1 Incremental Load

1.2 Full load

1.3. Reloading data

1.4 Plan to schedule

1.5 Timing

The average time expected for loading:

Criticality

High/Medium/Low

...

V_core_hd_monthly  → prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_core_hd_monthly → DPL.V_core_hd_monthly

v_core_hd_quarterly →  prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_core_hd_quarterly → DPL.V_core_hd_quarterly

V_os_data → prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_os_data → DPL.V_os_data

V_ts_data →  prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_ts_data → DPL.V_ts_data

V_ps_data →  prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_ps_data → DPL.V_ps_data

NOTE:

  • OS = Occupational Safety
  • PS = Process Safety
  • TS = Transport Safety
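The view names above contain an [env] placeholder. As a small illustration of how one such name resolves for a given environment, the Java sketch below substitutes the placeholder; the class and method names are hypothetical, and "dev" is used only because it is the environment shown elsewhere on this page.

```java
final class EnvNameSketch {

    /** Replaces the literal "[env]" placeholder with the environment name. */
    static String resolve(String template, String env) {
        return template.replace("[env]", env);
    }

    public static void main(String[] args) {
        String view = "prj-data-industrial-dash-[env].DataOcean_solvay_conso.V_os_data";
        // prints prj-data-industrial-dash-dev.DataOcean_solvay_conso.V_os_data
        System.out.println(resolve(view, "dev"));
    }
}
```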