Description:

This BW HR query is used to determine user authorization based on the user profile on site. The source is BW query DO_BW_QRY_CPHRPANHR_0001.

Tool: Xtract

Xtract server = ACEW1DXTRAXUS01

Xtract job: TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001. There is only one parameter, e.g. YYYYMM_Start=202403&YYYYMM_End=202403; both the start and end values are required.

All available fields of the query are selected, and the output is written to \\ACEW1DXTRAXUS01\extractions\TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001.csv. File generation is triggered by the Talend job.

Tool: Talend 

1. Detail job


  1. Call the Xtract URL (tHttpRequest) to trigger Xtract to generate the file, passing the parameter l_VAR_XTRACT_JOB from the flow job, which selects only the current month
  2. Get the file from the Xtract server
  3. Replace column names that contain invalid characters
  4. Generate the output file to \DATA_OCEAN_DOMAIN_HR\InOut
  5. Upload to the GCS bucket cs-ew1-prj-data-dm-hr-[dev]-staging
  6. Delete the local output file
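Steps 1 and 3 of the detail job can be sketched as below. This is a minimal sketch, not the actual Talend implementation: the `?name=<extraction>` URL shape and port 8065 are assumptions based on Xtract Universal defaults, and the invalid-character rule in `sanitize_header` is a guess at the real replacement logic.

```python
import re
import urllib.parse

def build_xtract_url(server: str, job: str, param: str, port: int = 8065) -> str:
    # Assumption: Xtract Universal exposes extractions as
    # http://<server>:8065/?name=<extraction>&<parameters>.
    # The real tHttpRequest URL may differ; server/job names come from the doc.
    return f"http://{server}:{port}/?name={urllib.parse.quote(job)}{param}"

def sanitize_header(column: str) -> str:
    # Step 3: replace characters that are invalid in downstream column names.
    # The exact character set is an assumption; adjust to the real rule.
    return re.sub(r"[^0-9A-Za-z_]", "_", column)
```

For the March 2024 run, `build_xtract_url("ACEW1DXTRAXUS01", "TALEND_PROD_DO_BW_QRY_CPHRPANHR_0001", "&YYYYMM_Start=202403&YYYYMM_End=202403")` yields the trigger URL with both month bounds appended.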

2. Flow job

  1. Set up the output file name
  2. Set up the execution id, system id, and start time
  3. Calculate the incremental load to determine the current month from parameter l_VAR_XTRACT_PARA_CPHRPANHR. If its value is "currentmonth", the current month of today's date is used. Otherwise, the parameter value is passed to Xtract as-is. For example, "&YYYYMM_Start=202402&YYYYMM_End=202402" reloads only February 2024
  4. Generate meta_run_id
  5. Call the detail job
  6. Load the data from GCS to ODS
  7. Update the log when the job is done
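The month-resolution rule in step 3 can be sketched as a small helper (a sketch of the documented behavior, not the actual Talend routine; the function name is hypothetical):

```python
from datetime import date

def xtract_month_param(value: str, today: date = None) -> str:
    """Resolve l_VAR_XTRACT_PARA_CPHRPANHR into the Xtract selection string.

    "currentmonth" selects the month of today's date; any other value
    (e.g. "&YYYYMM_Start=202402&YYYYMM_End=202402") is passed through
    to Xtract unchanged.
    """
    if value == "currentmonth":
        yyyymm = (today or date.today()).strftime("%Y%m")
        return f"&YYYYMM_Start={yyyymm}&YYYYMM_End={yyyymm}"
    return value
```

With today's date in March 2024, "currentmonth" resolves to "&YYYYMM_Start=202403&YYYYMM_End=202403".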

3. Access rights

To access BW, Talend connects to Xtract, and Xtract requires authorization for user RFC_TAL_WBP.

In most cases for HR queries, the authorization team must be contacted to add the authorization. In this case, role ZH_PA_ROBOT was modified to add the new query.

4. Source

          BW query DO_BW_QRY_CPHRPANHR_0001

5. Format

Data Dict

Destination

Location

The data is kept in:

Bucket = cs-ew1-prj-data-dm-hr-[env]-staging 

DataOcean GCP = prj-data-dm-hr-[env]

Product GCP = prj-data-sad-ebatch-[env]

To save the data into the GCP Industrial Data Ocean, a service account is required.

Format

Same as the source

Sizing

        Current month should have around 30,000 records 

Assessment

To validate the generated output, compare it with BW query DO_BW_QRY_CPHRPANHR_0001 and drill down on all characteristics, the same as in the Data Dict.

Loading

1.1 Full Load

Job:     F028_BW_QRY_CPHRPANHR_0001_to_ODS
STG:     STG_BWH_0000_0000_F001_F_D_qry_cphrpanhr_0001
ODS:     ODS_BWH_0000_F001_F_D_qry_cphrpanhr_0001
Comment: (none)

1.2. Reloading data

The job has a context parameter, l_VAR_XTRACT_PARA_CPHRPANHR, to change the selection of the BW query. Normally its value should be "currentmonth" so that only the current month is loaded. However, if data from other months needs to be reloaded, the context MUST be of the form "&YYYYMM_Start=202403&YYYYMM_End=202403"; in this example, only March 2024 is loaded. If the value is blank, all data in BW is extracted, which may cause the job to fail with a memory error.
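Because a blank value triggers a full BW extract and a possible out-of-memory failure, it may be worth validating the context value before the run. A minimal guard (a hypothetical helper, not part of the existing job):

```python
import re

# Accepted values: "currentmonth" or an explicit month-range selection
# such as "&YYYYMM_Start=202403&YYYYMM_End=202403".
_RELOAD_RE = re.compile(r"^&YYYYMM_Start=\d{6}&YYYYMM_End=\d{6}$")

def is_safe_reload_param(value: str) -> bool:
    # Rejecting blank or malformed values guards against extracting every
    # month in BW, which can fail the job with a memory error.
    return value == "currentmonth" or bool(_RELOAD_RE.match(value))
```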

1.3 Scheduling

PL_HR_EBATCH_BWH_CPHRPANHR_0001 runs every Monday at 08:00 CET (weekly).
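If the scheduler accepts cron syntax (an assumption; the actual trigger for PL_HR_EBATCH_BWH_CPHRPANHR_0001 may be defined differently in the Talend scheduler), the weekly Monday 08:00 slot would look like:

```
# minute hour day-of-month month day-of-week
0 8 * * 1    # every Monday at 08:00, assuming the scheduler runs in CET
```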

1.4 Timing

Loading takes around 2 to 5 minutes.

Criticality

High 

Logging

select job.job_name,
       job.meta_start_date,
       logs.meta_run_id,
       logs.meta_source_system,
       logs.meta_step,
       logs.meta_status,
       logs.meta_num_lines,
       logs.meta_error_lines
from STG.log_tables logs
join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_id in (
        select meta_run_id from STG.run_jobs order by meta_start_date desc limit 100
      )
  and meta_source_system like '%BWH%CPHRPANHR%'
  and meta_start_date > DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
order by job.meta_start_date desc