Description:
BW HR query is used to determine the user authorization basing on the user profile on site. The source is query DO_BW_QRY_CPHRPANHR_0001
Tool: Xtract
Xtract server = ACEW1DXTRAXUS01
Xtract job TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001. There is only 1 parameter, which is YYYYMM_Start=202403&YYYYMM_End=202403. It is required to enter start and end
Selected all the available fields on the query and generate the output to \\ACEW1DXTRAXUS01\extractions\TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001.csv. It will trigger to generate file by Talend job
Tool: Talend
1. Detail job
- Call Xtract URL (tHttpRequest) in order to trigger the Xtract to generate file and pass parameter l_VAR_XTRACT_JOB from the flow job, which select only the current month
- Get the file from Xtract server
- Replace the column in case there is invalid character
- Generate the output file to \DATA_OCEAN_DOMAIN_HR\InOut
- Upload to GCS to cs-ew1-prj-data-dm-hr-[dev]-staging
- Delete the output file from local
2. Flow job
- Setup output file name
- Setup execute id, system id, start time
- Calculate the incremental load to get the current month by parameter l_VAR_XTRACT_PARA_CPHRPANHR. If it has value "currentmonth", it will get the current month of today. If not, it will use the value of this parameter to Xtract. The example of the parameter is "&YYYYMM_Start=202402&YYYYMM_End=202402". It will reload only for Feb 2024
- Generate meta_run_id
- Call detail job
- Load the data from GCS to ODS
- Update the log when the job is done.
3. Access rights
To access to BW, the Talend will connect to Xtract and Xtract require authorization of user RFC_TAL_WBP.
Most of the case for HR query, it is required to contact authorization team to add the authorization. In this case, role ZH_PA_ROBOT is modified in order to add the new query.
4. Source
BW query DO_BW_QRY_CPHRPANHR_0001
5. Format
Destination
Location
The data will keep in
Bucket = cs-ew1-prj-data-dm-hr-[env]-staging
DataOean GCP = prj-data-dm-hr-[env]
Product GCP = prj-data-sad-ebatch-[env]
To save the data into GCP Industrial Data Ocean, service account is required
Format
Same as the source
Sizing
Current month should have around 30,000 records
Assessment
How to validate that the generated output is valid: Compare with BW query DO_BW_QRY_CPHRPANHR_0001 and drill down all characteristics same as Data Dict
Loading
1.1 Full Load
Job | STG | ODS | Comment |
|---|---|---|---|
| F028_BW_QRY_CPHRPANHR_0001_to_ODS | STG_BWH_0000_0000_F001_F_D_qry_cphrpanhr_0001 | ODS_BWH_0000_F001_F_D_qry_cphrpanhr_0001 |
1.2. Reloading data
The job will have context parameter l_VAR_XTRACT_PARA_CPHRPANHR to change the selection of the BW query. Normally the value should be "currentmonth" in order to load only the current month. However, if the data from other months need to be reload, we can enter the context to be this context MUST BE "&YYYYMM_Start=202403&YYYYMM_End=202403" In this example, it will load only Mar 2024. If it is blank, it will get all the data in BW, which may cause the job error with memory.
1.3 Scheduling
PL_HR_EBATCH_BWH_CPHRPANHR_0001 run every Monday at 08:00 AM CET (weekly)
1.4 Timing
take time around 2 - 5 minutes loading.
Criticality
High
Logging
select job.job_name , job.meta_start_date , logs.meta_run_id , logs.meta_source_system , logs.meta_step , logs.meta_status , logs.meta_num_lines , logs.meta_error_lines from STG.log_tables logs join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_id in ( SELECT meta_run_id FROM STG.run_jobs order by meta_start_date desc limit 100 )
and meta_source_system like '%BWH%CPHRPANHR%'
and meta_start_date > DATE_SUB ( CURRENT_TIMESTAMP () , INTERVAL 1 DAY )
order by job.meta_start_date desc


