Page tree


You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 32 Next »

Job Details

As part of HR DQ project, we have developed the below Talend jobs which gets the data from SuccessFactor tables and ingest data into the respective STG/ODS tables. We are following the DataOcean standard jobs. 


Talend Job nameSuccessFactor Table nameTalend projectGCP ProjectSTG TableODS table
F010_SAP_SF_EmpJob_SAP_SF_to_ODSEmpJobDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_EmpJobODS_SFC_0000_F001_F_D_EmpJob
F009_SAP_SF_EmpEmployment_SAP_SF_to_ODSEmpEmploymentDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_EmpEmploymentODS_SFC_0000_F001_F_D_EmpEmployment
F005_SAP_SF_EmpEmploymentTermination_SAP_SF_to_ODSEmpEmploymentTerminationDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_EmpEmploymentTerminationODS_SFC_0000_F001_F_D_EmpEmploymentTermination
F006_SAP_SF_EmpPayCompRecurring_SAP_SF_to_ODSEmpPayCompRecurringDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_EmpPayCompRecurringODS_SFC_0000_F001_F_D_EmpPayCompRecurring
F007_SAP_SF_EmpCompensation_SAP_SF_to_ODSEmpCompensationDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_EmpCompensationODS_SFC_0000_F001_F_W_EmpCompensation
F008_SAP_SF_FOPayComponent_SAP_SF_to_ODSFOPayComponentDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_FOPayComponentODS_SFC_0000_F001_F_D_FOPayComponent
F004_SAP_SF_EmpJobRelationships_SAP_SF_to_ODSEmpJobRelationshipsDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_EmpJobRelationshipsODS_SFC_0000_F001_F_D_EmpJobRelationships
F001_SAP_SF_PerEmail_SAP_SF_to_ODSPerEmailDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_PerEmailODS_SFC_0000_F001_F_D_PerEmail
F002_SAP_SF_PerPersonal_SAP_SF_to_ODSPerPersonDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_PerPersonODS_SFC_0000_F001_F_D_PerPerson
F003_SAP_SF_PerPerson_SAP_SF_to_ODSPerPersonalDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_D_PerPersonalODS_SFC_0000_F001_F_D_PerPersonal
F011_SAP_SF_FOEventReason_csv_to_GCSFOEventReasonDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOEventReasonODS_SFC_0000_F001_F_W_FOEventReason
F012_SAP_SF_UserAccount_csv_to_GCSUserAccountDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_UserAccountODS_SFC_0000_F001_F_W_UserAccount
F014_SAP_FOLocationGroup_SAP_SF_to_ODSFOLocationGroupDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOLocationGroupODS_SFC_0000_F001_F_W_FOLocationGroup
F015_SAP_FOCompany_SAP_SF_to_ODSFOCompanyDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOCompanyODS_SFC_0000_F001_F_W_FOCompany
F016_SAP_FOCostCenter_SAP_SF_to_ODSFOCostCenterDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOCostCenterODS_SFC_0000_F001_F_W_FOCostCenter
F017_SAP_FOBusinessUnit_SAP_SF_to_ODSFOBusinessUnitDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOBusinessUnitODS_SFC_0000_F001_F_W_FOBusinessUnit
F018_SAP_PickListValueV2_SAP_SF_to_ODSPickListValueV2DATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_PickListValueV2ODS_SFC_0000_F001_F_W_PickListValueV2
F019_SAP_Position_SAP_SF_to_ODSPositionDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_PositionODS_SFC_0000_F001_F_W_Position
F020_SAP_EmpWorkPermit_SAP_SF_to_ODSEmpWorkPermitDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_EmpWorkPermitODS_SFC_0000_F001_F_W_EmpWorkPermit
F021_SAP_FOLocation_SAP_SF_to_ODSFOLocationDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_FOLocationODS_SFC_0000_F001_F_W_FOLocation
F012_SAP_User_SAP_SF_to_ODSUserDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_UserODS_SFC_0000_F001_F_W_User
F022_SAP_empCompensationGroupSumCalculatedNav_SAP_SF_to_ODSempCompensationGroupSumCalculatedNavDATA_OCEAN_DOMAIN_HRprj-data-dm-hr-prodSTG_SFC_0000_0000_F001_F_W_empCompensationGroupSumCalculatedNavODS_SFC_0000_F001_F_W_empCompensationGroupSumCalculatedNav


As part of Procurement DQ project, we have developed the below Talend jobs which gets the data from SAP (WP1 and PF1) tables and ingest data into the respective STG/ODS tables. We are following the DataOcean standard jobs. 

Talend Job nameSource Table nameTalend projectGCP ProjectSTG TableODS table
F001_SPW_F001_F_W_A017_TO_BQA017DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_A017ODS_SPW_0000_F001_F_W_A017
F001_SPW_F001_F_W_A018_TO_BQA018DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_A018ODS_SPW_0000_F001_F_W_A018
F001_SPW_F001_F_W_WYT3_TO_BQWYT3DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_WYT3ODS_SPW_0000_F001_F_W_WYT3
F001_SPW_F001_F_W_NACH_TO_BQNACHDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_NACHODS_SPW_0000_F001_F_W_NACH
F001_SPW_F001_F_W_LFB1_TO_BQLFB1DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_LFB1ODS_SPW_0000_F001_F_W_LFB1
F001_SPW_F001_F_W_LFM1_TO_BQLFM1DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_LFM1ODS_SPW_0000_F001_F_W_LFM1
F001_SPW_F001_F_W_LFZA_TO_BQLFZADATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_LFZAODS_SPW_0000_F001_F_W_LFZA
F001_SPW_F001_F_W_T001_TO_BQT001DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_T001ODS_SPW_0000_F001_F_W_T001
F001_SPW_F001_F_W_T024E_TO_BQT024EDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_T024EODS_SPW_0000_F001_F_W_T024E
F001_SPW_F001_F_W_T042L_TO_BQT042LDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_T042LODS_SPW_0000_F001_F_W_T042L
F001_SPW_F001_F_W_BSIK_TO_BQBSIKDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_BSIKODS_SPW_0000_F001_F_W_BSIK
F001_SPW_F001_I_W_EKPA_TO_BQEKPADATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_I_W_EKPAODS_SPW_0000_F001_I_W_EKPA
F001_SPW_F001_I_W_VTTK_TO_BQVTTKDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_I_W_VTTKODS_SPW_0000_F001_I_W_VTTK
F001_SPW_F001_F_W_A018_TO_BQA018DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPW_0000_0000_F001_F_W_A018ODS_SPW_0000_F001_F_W_A018
F001_SPF_F001_F_W_EKPA_TO_BQEKPADATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_EKPAODS_SPF_0000_F001_F_W_EKPA
F001_SPF_F001_F_W_WYT3_TO_BQWYT3DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_WYT3ODS_SPF_0000_F001_F_W_WYT3
F001_SPF_F001_F_W_T024E_TO_BQT024EDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_T024EODS_SPF_0000_F001_F_W_T024E
F001_SPF_F001_F_W_LFBK_TO_BQLFBKDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_LFBKODS_SPF_0000_F001_F_W_LFBK
F001_SPF_F001_F_W_ZZRSEGT_T_TO_BQZZRSEGT_TDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_ZZRSEGT_TODS_SPF_0000_F001_F_W_ZZRSEGT_T
F001_SPF_F001_F_W_WYT3_TO_BQWYT3DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_WYT3ODS_SPF_0000_F001_F_W_WYT3
F001_SPF_F001_F_W_EKPA_TO_BQEKPADATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_EKPAODS_SPF_0000_F001_F_W_EKPA
F001_SPF_F001_F_W_NACH_TO_BQNACHDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_NACHODS_SPF_0000_F001_F_W_NACH
F001_SPF_F001_F_W_LFB1_TO_BQLFB1DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_LFB1ODS_SPF_0000_F001_F_W_LFB1
F001_SPF_F001_F_W_LFM1_TO_BQLFM1DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_LFM1ODS_SPF_0000_F001_F_W_LFM1
F001_SPF_F001_F_W_LFZA_TO_BQLFZADATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_LFZAODS_SPF_0000_F001_F_W_LFZA
F001_SPF_F001_F_W_T001_TO_BQT001DATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_T001ODS_SPF_0000_F001_F_W_T001
F001_SPF_F001_F_W_T024E_TO_BQT024EDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_T024EODS_SPF_0000_F001_F_W_T024E
F001_SPF_F001_F_W_T042L_TO_BQT042LDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_T042LODS_SPF_0000_F001_F_W_T042L
F001_SPF_F001_F_W_ZZRSEGT_T_TO_BQZZRSEGT_TDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_ZZRSEGT_TODS_SPF_0000_F001_F_W_ZZRSEGT_T
F001_SPF_F001_F_W_BSIK_TO_BQBSIKDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_F_W_BSIKODS_SPF_0000_F001_F_W_BSIK
F001_SPF_F001_I_W_VTTK_TO_BQVTTKDATA_OCEAN_DOMAIN_PROCUREMENTprj-data-dm-procurement-prodSTG_SPF_0000_0000_F001_I_W_VTTKODS_SPF_0000_F001_I_W_VTTK

Talend flow explanation


Main Flow



    • Setup meta_run_id and filename of the output file
    • Call the Rest API Job (J001*) which will invoke the SuccessFactor Rest API to extract the data (full) and load and save the file to GCS
    • Call the standard job to upload the files from GCS to ODS
    • If everything is OK, update the log. 

We are following the DataOcean standard Jobs except the J001* (SuccessFactor Rest API) job which is explained below

The above job invokes the SuccessFactor Rest API (by passing the username / password) to extract the data. Below are the main SuccessFactor parameters which are getting used while extraction, these parameters are common to all the SuccessFactor jobs.

l_VAR_SAP_SF_ENV : This parameter will contain DEV / PROD as the value, this is to mask the confidential information. If its DEV then the column will get populated with "" in the BQ tables, if PROD then it will have the source value in the target BQ. 
l_VAR_SAP_SF_LIMIT : This parameter is used to manage the number of records per iteration

l_VAR_SAP_SF_URL : This parameter is used to hold the SuccessFactore REST API URL

l_VAR_SAP_SF_PASSWORD : This parameter is used to store the password required to invoke the API
l_VAR_SAP_SF_USERNAME : This parameter is used to store the username required to invoke the API


Possibilities for the Talend job failure

As this is a full extract job and we are not doing any transformation (its 1:1 mapping), so the possibilities of failure could be one of the below 

  • STG / ODS table not present
  • Credentials issue (i.e. incorrect Username / Password to access the SuccessFactor RestAPI, in that case you can reach out to tania.pereira@solvay.com, she can help with the credentials)
  • RestAPI error : This could be if the API is not accessible
  • Talend RemoteServer Issue

Once the issue is fixed, we can rerun the job from start and it will create the file again and load the data and there will be no loss of data. 

******************************************************************************************************************************************************************************************

Data Validation/Troubleshoot queries

Data Validation:

<Provide the SQL queries to validate the data or the record count in the BQ target tables>

Logging:

<Details about the log tables in Big query>

Troubleshoot steps:

<Provide the steps to debug the ETL flow in case of failure. For example, if an ETL job fails, how should we trigger it? Can we directly rerun the job, or are there steps to delete the data from the previous execution before rerunning the job? or any change should be made in the context table for date executions>


Failed Records

Description

As part of HR DQ project, we have developed the below Talend jobs about running some data quality validation queries and writing the results in Google Drive files.


Talend Job NameTalend ProjectGCP Project
F001_BQ_to_Gsheet_Drive_Selfservice 0.1DATA_OCEAN_DOMAIN_HR

prj-data-dq-selfservice-dev


J001_BQ_to_Gsheet_Drive_Selfservice 1.1DATA_OCEAN_DOMAIN_HR

prj-data-dq-selfservice-dev



Source

This job uses data quality queries stored in prj-data-dq-selfservice-dev.DM.Dataplex_quality table and 

`prj-data-dq-selfservice-dev.DataOcean_dataquality_kpi` schema.


Destination

The data quality quality executed queries are materialized in Google Drive Failed Records folder.

Drive Id:

Main Flow

F001_BQ_to_Gsheet_Drive_Selfservice


This job points to J001_BQ_to_Gsheet_Drive_SelfService job, which has all the detailed business rules.

J001_BQ_to_Gsheet_Drive_Selfservice

P1


P2

  1. Set meta_run_id
  2. Get Data Quality SQL queries to run / Run query and store results in local file
  3. Set internal variables
  4. Get the Google Drive domain code folder Id
  5. For every local file generated in step 2, the process uploads the file to Google Drive Rule Name folder. This flow has a sub-process to create new rule name folders
  6. Write Talend Job Log

Plan to Schedule

From Monday to Friday at 08am

Logs

select *
from `STG.run_jobs`
where job_name = 'J001_BQ_to_Gsheet_Drive_Selfservice'
order by meta_end_date desc