
...

Architecture - High Level Design (HLD)

Link.


Architecture - Low Level Design (LLD)

Link.


Architecture Data Flow


Google Drive Live Links

https://docs.google.com/spreadsheets/d/1gXHCRA03T5pH3wC--erjuT8DjyxIlhEZthyeCCNZXgI/edit#gid=1684414561

https://drive.google.com/file/d/1274-2h5sL4Cyjsot-UgMYo03i2B8Q_hHzOv-Z3ixglo/view

DataPrep Flow

Schema showing the different STEPS of the application flow - with the data involved at each step

Step descriptions on BigQuery

Describes the data and process involved at each step

SAP PF1

Description

This project loads SAP data from the QM, material document and vendor master data tables into the Industrial Data Ocean, for all PF1 data, based on the create and modification dates. A view on top then filters the data down to Soda Ash only.

Note: Soda Ash data exists only in PF1, so this project loads from PF1 only (WP1 is not included).

Tools

Talend: 

1. Loading

1.1 Incremental Load

The main job is F100_SPF_IND_QM_Main, which runs the entire incremental load. It loads the tables MCH1, MCHA, QALS, AFVC, QAMR, QAMV, QAVE, MSEG and PLPO in sequence in order to limit the number of background jobs in SAP.

...

MCH1

...

STG_SPF_0000_0000_F001_I_H_mch1

...

ERSDA - Create date

...

LAEDA - Modify date

...

MCHA

...

STG_SPF_0000_0000_F001_I_H_mcha

...

ERSDA - Create date

...

LAEDA - Modify date

...

MSEG

...

STG_SPF_0000_0000_F001_I_H_mseg

...

CPUDT_MKPF - Create date

...

/BEV2/ED_AEDAT - Modify date

...

QALS

...

STG_SPF_0000_0000_F001_I_H_qals

...

ERSTELDAT - Create date

...

AENDERDAT - Modify date

...

QAMR

...

STG_SPF_0000_0000_F001_I_H_qamr

...

ERSTELLDAT - Create date

...

AENDERDAT - Modify date

...

QAMV

...

STG_SPF_0000_0000_F001_I_H_qamv

...

ERSTELLDAT - Create date

...

AENDERDAT - Modify date

...

STG_SPF_0000_0000_F001_I_H_qasr

...

ERSTELLDAT - Create date

...

AENDERDAT - Modify date

...

QAVE

...

STG_SPF_0000_0000_F001_I_H_qave

...

VDATUM - Create date

...

VAEDATUM - Modify date

...

AFVC

...

STG_SPF_0000_0000_F001_I_H_afvc

...

Get list of AUFPL from last load of QALS

The AFVC job must run after QALS because it selects the list of AUFPL (routing numbers of operations in the order) from the last load of QALS (max meta_business_date). In case of a reload, it relies on the QALS table as well.
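The QALS-to-AFVC dependency can be sketched as a small helper that turns the AUFPL values found in the latest QALS load into a range filter for the AFVC extraction. This is a hypothetical illustration (function name and range strategy are assumptions); the real logic lives inside the Talend job:

```python
def build_afvc_filter(aufpl_list):
    """Build a SAP selection filter for AFVC from the AUFPL values
    found in the most recent QALS load (max meta_business_date).

    aufpl_list: list of routing numbers (strings) taken from QALS.
    Returns a WHERE-style fragment, or None when there is nothing to load.
    """
    if not aufpl_list:
        return None
    lo, hi = min(aufpl_list), max(aufpl_list)
    # A single range keeps the SAP selection small, matching the style
    # of the reload example shown later on this page.
    return f"AUFPL >= '{lo}' and AUFPL <= '{hi}'"
```

This mirrors the shape of the l_VAR_eBatch_PF1_AFVC_additional_filter reload example given in section 2.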

1.2 Full Load

Job F101_SPF_IND_QM_Main_Full manages the full load.

...

LFA1

...

STG_SPF_0000_0000_F001_F_H_lfa1

...

QPAC

...

STG_SPF_0000_0000_F001_F_H_qpac

...

QPAM

...

STG_SPF_0000_0000_F001_F_H_qpam

...

QPCD

...

STG_SPF_0000_0000_F001_F_H_qpcd

...

QPCT

...

STG_SPF_0000_0000_F001_F_H_qpct

...

2. Reloading data

Every job has a context parameter l_VAR_eBatch_PF1_[TableName]_additional_filter that changes the selection when extracting from SAP. For an incremental load, this context MUST BE "incremental". If it is blank, the job extracts all data from 2023 onwards.

Note: the incremental load compares only the date, not the time.

Example of context reload

l_VAR_eBatch_PF1_AFVC_additional_filter = AUFPL >= '1007995974' and AUFPL <= '1009027526'

l_VAR_eBatch_PF1_MCH1_additional_filter = ERSDA  > '20230101'
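The way this context parameter drives the SAP selection could be modelled as follows. This is a sketch under assumptions (helper name and the exact clause shapes are illustrative); the real resolution happens inside the Talend jobs, and note the date-only comparison for "incremental":

```python
from datetime import date

def build_selection_filter(context, date_field, last_load):
    """Translate an l_VAR_eBatch_PF1_*_additional_filter context
    into a SAP selection clause.

    context:    "incremental", "" (blank), or an explicit filter string
    date_field: create/modify date column, e.g. "ERSDA"
    last_load:  datetime.date of the last successful load
    """
    if context == "incremental":
        # Date-only comparison: the time part of the last load is ignored.
        return f"{date_field} >= '{last_load.strftime('%Y%m%d')}'"
    if context == "":
        # A blank context falls back to all data from 2023 onwards.
        return f"{date_field} > '20230101'"
    # Any other value is passed through as an explicit reload filter,
    # like the AFVC and MCH1 examples above.
    return context
```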

Access rights

To access SAP, the Talend user RFC_TAL_PF1 is required.

Source

           pf1nonha.eua.solvay.com

Format

The format of the source data

Destination

Location

The data is kept in:

Bucket = cs-ew1-prj-data-dm-industrial-[dev]-staging, with one folder per table

DataOcean GCP = prj-data-dm-industrial-[env]

Product GCP = prj-data-sad-ebatch-[env]

To save the data into the GCP Industrial Data Ocean, a service account is required.

Format

Same as the source

Sizing

Expected data volume for a full load (as of May 2024):

Table | Table Name                                 | Size (records)
----- | ------------------------------------------ | --------------
LFA1  | Vendor Master (General Section)            | 229,553
QPAC  | Inspection catalog codes for selected sets | 2,314
QPAM  | Inspection catalog selected sets           | 741
QPCD  | Inspection catalog codes                   | 26,828
QPCT  | Code texts                                 | 228,149

The remaining tables are loaded incrementally every hour.

Assessment

How to validate the generated output: compare with the corresponding table in PF1.

Scheduling

PL_INDUS_EBATCH_SPF_QM_INC_LOAD (incremental load) runs every hour on weekdays.

PL_INDUS_EBATCH_SPF_QM_FULL_LOAD (full load) runs every workday at 08:00 AM CET (daily).

Timing

The average expected time:

  • full process: 7 - 15 min
  • incremental process: 5 - 10 min

Criticality

High

Logging

Industrial

1. Table log: this SQL retrieves the status log entries whose job name comes from source system SPF (PF1).

...

2. Incremental table: every time a load completes, the max timestamp from the staging table is written to this table. This makes it easy to see which tables have not been updated.

Table: STG.incremental_loading


SAP BW HR

Description: the BW HR query is used to determine user authorization based on the user's profile on site. The source is query DO_BW_QRY_CPHRPANHR_0001, managed by the Xtract job TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001 on Xtract server ACEW1DXTRAXUS01. There is only one parameter, of the form YYYYMM_Start=202403&YYYYMM_End=202403; both a start and an end month are required.

Talend: 

1. Loading

1.1 Full Load

...

2. Reloading data

The job has a context parameter l_VAR_XTRACT_PARA_CPHRPANHR that changes the selection of the BW query. Normally the value should be "currentmonth", so that only the current month is loaded. To reload data from other months, set the context to an explicit range such as "&YYYYMM_Start=202403&YYYYMM_End=202403"; in this example, only March 2024 is loaded. If the context is blank, the job extracts all the data in BW, which may cause it to fail with a memory error.
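A minimal sketch of how the l_VAR_XTRACT_PARA_CPHRPANHR context could be resolved into the Xtract URL parameter. The helper name is hypothetical; the real resolution happens inside the Talend job:

```python
from datetime import date

def resolve_xtract_param(context, today):
    """Resolve the CPHRPANHR context into the Xtract query-string suffix.

    context: "currentmonth", or an explicit
             "&YYYYMM_Start=...&YYYYMM_End=..." string (blank would
             load all of BW and risks running out of memory).
    today:   datetime.date used to derive the current month.
    """
    if context == "currentmonth":
        yyyymm = today.strftime("%Y%m")
        return f"&YYYYMM_Start={yyyymm}&YYYYMM_End={yyyymm}"
    # Explicit ranges are passed through unchanged.
    return context
```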

Access rights

To access BW, Talend connects through Xtract, and Xtract requires authorization for user RFC_TAL_WBP.

For most HR queries, the authorization team must be contacted to grant access. In this case, role ZH_PA_ROBOT was modified to add the new query.

Source

          BW query DO_BW_QRY_CPHRPANHR_0001

Format

Data Dict

Destination

Location

The data is kept in:

Bucket = cs-ew1-prj-data-dm-hr-[dev]-staging 

DataOcean GCP = prj-data-dm-hr-dev

Product GCP = prj-data-sad-ebatch-ppd

To save the data into the GCP Industrial Data Ocean, a service account is required.

Format

Same as the source

Sizing

The current month should contain around 30,000 records.

Assessment

How to validate the generated output: compare with BW query DO_BW_QRY_CPHRPANHR_0001 and drill down all characteristics as in the Data Dict.

Scheduling

PL_HR_EBATCH_BWH_CPHRPANHR_0001 runs every Monday at 08:00 AM CET (weekly).

Timing

          Loading takes around 2 - 5 minutes.

Criticality

High / Medium / Low ??

Logging

...

PI Startrek

Description: this is an API connection to get manufacturing data. It requires authentication with a Windows user, which Talend cannot do natively, so Python is required to run the extraction.

...

There are 3 main APIs that we use in the eBatch project.

1. Event Frame with calculation

1.1 URL: https://pivision.eua.solvay.com/piwebapi/assetdatabases/F1RD0WI4NHGJw0uG4fwriJz93welqtozWz30OpJS7GtHMFHgQUNFVzFQU1RFS1BBRjAzXEdCVSBTT0RBIC0gQUxMIFBMQU5UUw/eventframes?templateName=Batch_Record_Pharma

Option:

&starttime=*-24h    (last 24 hours)

&starttime=%272024-05-19%2010:40:00%27  (specific start time)

This lists all event frames in the selected time window.
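The event-frame query above can be assembled like this. A sketch only: the helper name is an assumption, and the database WebId would be the one from the URL on this page; URL-encoding of the starttime matches the two option examples above:

```python
from urllib.parse import quote

BASE = "https://pivision.eua.solvay.com/piwebapi"

def event_frames_url(db_webid, template, starttime="*-24h"):
    """Build the PI Web API URL listing event frames for a template.

    starttime accepts a PI relative time ("*-24h" for the last 24 hours)
    or a quoted absolute time such as "'2024-05-19 10:40:00'".
    """
    # Keep ":" literal so absolute times encode as in the examples above.
    return (f"{BASE}/assetdatabases/{db_webid}/eventframes"
            f"?templateName={template}"
            f"&starttime={quote(starttime, safe=':')}")
```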


1.2 For each event frame, the list of attributes must be fetched via its attribute link. We can restrict the response to only the field values we need using the "selectedFields" parameter:

https://pivision.eua.solvay.com/piwebapi/eventframes/F1Fm0WI4NHGJw0uG4fwriJz93w7OER3RAm7xGECgKK4iH3qQQUNFVzFQU1RFS1BBRjAzXEdCVSBTT0RBIC0gQUxMIFBMQU5UU1xFVkVOVEZSQU1FU1tFQkFUQ0ggRVZFTlQgRlJBTUUgR0VORVJBVElPTjIgMjAyNC0wNi0wOSAwMzozMTozOC40MDBd/attributes?selectedFields=Items.Name;Items.Links.Attributes;Items.Links.Value


We get the values for Batch_ID, Weight and Silo directly from their Value links (point 1.4).

Sometimes we get an erroneous batch ID, but we keep it as-is, the same as the source.


1.3 The remaining attributes, i.e. those not starting with Granulom and not equal to Batch_ID, Weight or Silo, require a call to the Attributes link to get the sub-attributes (Avg, Low_limit, Upper_limit, Max, Mean, Min, Std).


1.4 Get the value from the Value link


2. Event Frame with value in time series - Granulom

The same steps as Event Frame with calculation (points 1.1 and 1.2), but only attributes whose name starts with "Granulom" are taken. Then, instead of fetching the sub-attributes, the values recorded during the event frame are fetched from the "RecordedData" link.
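Putting points 1.2-1.4 and the Granulom case together, the per-attribute routing can be sketched as follows (hypothetical helper; the link names follow the PI Web API responses described above):

```python
def attribute_route(name):
    """Decide which link to follow for an event-frame attribute.

    - Batch_ID, Weight, Silo     -> read the Value link directly
    - names starting "Granulom"  -> read the RecordedData time series
    - everything else            -> descend into the Attributes link
                                    for the sub-attributes (Avg, Low_limit,
                                    Upper_limit, Max, Mean, Min, Std)
    """
    if name in ("Batch_ID", "Weight", "Silo"):
        return "Value"
    if name.startswith("Granulom"):
        return "RecordedData"
    return "Attributes"
```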


3. Tags data

3.1 Find the "Web ID" by using this URL: https://pivision.eua.solvay.com/piwebapi/points/search?dataServerWebId=F1DSUZOarN2bd0yKBpfAHpVUXAQUNFVzFQU1RFS1BEQTAx&query=TOR.BIR.SILO.SILO.SILO.CB_M_BIR_Calcio_BIR_2&selectedFields=Items.WebId

The red text is the tag name from the List of Tag file (column N). For example, in this project for Torrelavega, we get only 5 tags, controlled by the variable l_VAR_eBatch_tag_TOR_tag:

  • TOR.BIR.SILO.SILO.SILO.SY_L2_PPM_CA_BIR           = F1DPUZOarN2bd0yKBpfAHpVUXAraMBAAQUNFVzFQU1RFS1BEQTAxXFRPUi5CSVIuU0lMTy5TSUxPLlNJTE8uU1lfTDJfUFBNX0NBX0JJUg
  • TOR.BIR.SILO.SILO.SILO.SY_L2_COL_L                     = F1DPUZOarN2bd0yKBpfAHpVUXArqMBAAQUNFVzFQU1RFS1BEQTAxXFRPUi5CSVIuU0lMTy5TSUxPLlNJTE8uU1lfTDJfQ09MX0w
  • TOR.BIR.SILO.SILO.SILO.CB_M_BIR_Colorimetría_a = F1DPUZOarN2bd0yKBpfAHpVUXAyKMBAAQUNFVzFQU1RFS1BEQTAxXFRPUi5CSVIuU0lMTy5TSUxPLlNJTE8uQ0JfTV9CSVJfQ09MT1JJTUVUUsONQV9B
  • TOR.BIR.SILO.SILO.SILO.CB_M_BIR_Colorimetría_b = F1DPUZOarN2bd0yKBpfAHpVUXAyaMBAAQUNFVzFQU1RFS1BEQTAxXFRPUi5CSVIuU0lMTy5TSUxPLlNJTE8uQ0JfTV9CSVJfQ09MT1JJTUVUUsONQV9C
  • TOR.BIR.SILO.SILO.SILO.CB_M_BIR_Colorimetría_L = F1DPUZOarN2bd0yKBpfAHpVUXAyqMBAAQUNFVzFQU1RFS1BEQTAxXFRPUi5CSVIuU0lMTy5TSUxPLlNJTE8uQ0JfTV9CSVJfQ09MT1JJTUVUUsONQV9M

Example output of the link:


3.2 Get the value using this URL: https://pivision.eua.solvay.com/piwebapi/streams/[WebId]/recorded?starttime=2024-06-07%2000:26:52
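The recorded-values URL from 3.2 can be built from a WebId (obtained in 3.1) and a start time; the encoding below reproduces the example URL above. Helper name is illustrative:

```python
from urllib.parse import quote

def recorded_url(web_id, starttime):
    """Build the streams/recorded URL for one tag.

    web_id:    the tag WebId found via the points/search call in 3.1
    starttime: e.g. "2024-06-07 00:26:52" (space becomes %20, ":" kept)
    """
    return (f"https://pivision.eua.solvay.com/piwebapi/streams/"
            f"{web_id}/recorded?starttime={quote(starttime, safe=':')}")
```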


Python

Example Python file:


  1. The main URL called first, in order to obtain the follow-up URLs.
  2. Output of call 1 in the browser (authorization is required to see the result).
  3. The selected link that will be called to get the result.
  4. Call the selected link URL + the new parameter.
  5. Example of the full URL.
  6. The result of the second call. The fields can be selected with the parameter "selectedFields=Items.Timestamp;Items.Value".
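The two-call pattern in the steps above (first call returns items with links, selected links are then called again) can be sketched on a sample response. Pure parsing only, on an assumed response shape; the real script also authenticates with the Windows account and performs the HTTP calls:

```python
def selected_links(api_response, link_name):
    """From a PI Web API list response, collect the chosen link
    (e.g. "Value" or "Attributes") for every item, ready for the
    second round of calls.
    """
    return [item["Links"][link_name]
            for item in api_response.get("Items", [])
            if link_name in item.get("Links", {})]

# Shortened sample of a first-call response (structure assumed):
sample = {"Items": [
    {"Name": "Batch_ID",
     "Links": {"Value": "https://.../value",
               "Attributes": "https://.../attributes"}},
]}
```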

Talend

1. Job Detail


  1. tFileDelete: deletes the Python log file DATA\[env]\IND\IND_EBATCH\py_log.txt. All jobs use the same file name, so it is deleted beforehand.
  2. tJava: generates the .py file DATA\[env]\IND\IND_EBATCH\pyCodeyyyyMMddHHmmss.py by reading the input parameters from the Flow job and the RSD parameter.
  3. tSystem: runs the .py file.
    1. The screen output is kept in DATA\[env]\IND\IND_EBATCH\py_log.txt.
    2. The extraction is generated in DATA\[env]\IND\IND_EBATCH\Output\[API interface]\file.csv.
  4. tGSPut: uploads the csv file to GCS prj-data-sad-ebatch-[env]/cs-ew1-prj-data-sad-ebatch-[env]-staging/[API interface].
  5. tFileDelete: deletes the csv files.
  6. tFileDelete: deletes the py file.
2. Incremental Load (Flow job)


2.1. Get the last load from table STG.incremental_loading


2.2. Run the job detail, which generates the py file and runs Python to produce the output

2.3. Standard job to load to STG and ODS

2.4. Update the last load (event frame start date and tag timestamp) in STG.incremental_loading

Remark: PI_TOR_TAG has many tags; therefore, it takes the minimum of the per-tag maximum loading dates.
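The watermark rule in the remark above (restart from the minimum across the per-tag maxima, so no tag misses records) could look like this hypothetical helper:

```python
def next_watermark(max_load_per_tag):
    """Given {tag_name: max loaded timestamp}, return the safe restart
    point: the minimum of the per-tag maxima. Restarting from this point
    guarantees no tag skips records, at the cost of re-reading a few
    rows for the tags that had loaded further ahead.
    """
    if not max_load_per_tag:
        return None
    # ISO-style "YYYY-MM-DD HH:MM" strings compare chronologically.
    return min(max_load_per_tag.values())
```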

3. Job summary
4. Reloading data

Reloading is flexible: set the desired starttime in table STG.incremental_loading.

Access rights

To access PI, david.genet@solvay.com must first grant the Windows account (SVC_TALEND_ADMIN_DEV) authorization to access PI, so that the Python script can run (it authenticates with the Windows account). When promoting to another environment, authorization must be requested again.

Source

https://pivision-dev.eua.solvay.com/piwebapi/

Format

Json web API

Destination

Location

Bucket = cs-ew1-prj-data-sad-ebatch-[env]-staging, with one folder per table

DataOcean GCP = N/A

Product GCP = prj-data-sad-ebatch-ppd

The plan job is PL_EBATCH_PI_TO_ODS.

The main job is F100_PI_TOR_TO_BQ.

...

Format

Event Frame with calculation
Event Frame with value in time series - Granulom
Tags data

Sizing

Less than 100 records / hour

Assessment

    How to validate the generated output: compare with the output from the web API.

Scheduling

    PL_EBATCH_PI_TO_ODS runs every hour, Monday - Friday.

Timing

    One event frame takes around 2 - 4 min.

    For tags, it is very quick.

Criticality

High 

Logging

    1. Monitor the number of records within a day.

...

     We should not have 0 records. A count of 0 usually means the Python script failed (note: when Python fails, the Talend log still shows OK).

    2. Monitor the last update date. The result should be the current date.

...
