
Job Details

As part of the HR DQ project, we have developed the Talend jobs below, which extract data from SuccessFactors tables and ingest it into the respective STG/ODS tables. We follow the DataOcean standard jobs.


| Talend Job Name | SuccessFactors Table Name | Talend Project | GCP Project | STG Table | ODS Table |
| --- | --- | --- | --- | --- | --- |
| F010_SAP_SF_EmpJob_SAP_SF_to_ODS | EmpJob | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpJob | ODS_SFC_0000_F001_F_D_EmpJob |
| F009_SAP_SF_EmpEmployment_SAP_SF_to_ODS | EmpEmployment | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpEmployment | ODS_SFC_0000_F001_F_D_EmpEmployment |
| F005_SAP_SF_EmpEmploymentTermination_SAP_SF_to_ODS | EmpEmploymentTermination | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpEmploymentTermination | ODS_SFC_0000_F001_F_D_EmpEmploymentTermination |
| F006_SAP_SF_EmpPayCompRecurring_SAP_SF_to_ODS | EmpPayCompRecurring | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpPayCompRecurring | ODS_SFC_0000_F001_F_D_EmpPayCompRecurring |
| F007_SAP_SF_EmpCompensation_SAP_SF_to_ODS | EmpCompensation | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_EmpCompensation | ODS_SFC_0000_F001_F_W_EmpCompensation |
| F008_SAP_SF_FOPayComponent_SAP_SF_to_ODS | FOPayComponent | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_FOPayComponent | ODS_SFC_0000_F001_F_D_FOPayComponent |
| F004_SAP_SF_EmpJobRelationships_SAP_SF_to_ODS | EmpJobRelationships | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_EmpJobRelationships | ODS_SFC_0000_F001_F_D_EmpJobRelationships |
| F001_SAP_SF_PerEmail_SAP_SF_to_ODS | PerEmail | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerEmail | ODS_SFC_0000_F001_F_D_PerEmail |
| F002_SAP_SF_PerPersonal_SAP_SF_to_ODS | PerPerson | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerPerson | ODS_SFC_0000_F001_F_D_PerPerson |
| F003_SAP_SF_PerPerson_SAP_SF_to_ODS | PerPersonal | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_D_PerPersonal | ODS_SFC_0000_F001_F_D_PerPersonal |
| F011_SAP_SF_FOEventReason_csv_to_GCS | FOEventReason | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOEventReason | ODS_SFC_0000_F001_F_W_FOEventReason |
| F012_SAP_SF_UserAccount_csv_to_GCS | UserAccount | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_UserAccount | ODS_SFC_0000_F001_F_W_UserAccount |
| F014_SAP_FOLocationGroup_SAP_SF_to_ODS | FOLocationGroup | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOLocationGroup | ODS_SFC_0000_F001_F_W_FOLocationGroup |
| F015_SAP_FOCompany_SAP_SF_to_ODS | FOCompany | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOCompany | ODS_SFC_0000_F001_F_W_FOCompany |
| F016_SAP_FOCostCenter_SAP_SF_to_ODS | FOCostCenter | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOCostCenter | ODS_SFC_0000_F001_F_W_FOCostCenter |
| F017_SAP_FOBusinessUnit_SAP_SF_to_ODS | FOBusinessUnit | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOBusinessUnit | ODS_SFC_0000_F001_F_W_FOBusinessUnit |
| F018_SAP_PickListValueV2_SAP_SF_to_ODS | PickListValueV2 | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_PickListValueV2 | ODS_SFC_0000_F001_F_W_PickListValueV2 |
| F019_SAP_Position_SAP_SF_to_ODS | Position | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_Position | ODS_SFC_0000_F001_F_W_Position |
| F020_SAP_EmpWorkPermit_SAP_SF_to_ODS | EmpWorkPermit | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_EmpWorkPermit | ODS_SFC_0000_F001_F_W_EmpWorkPermit |
| F021_SAP_FOLocation_SAP_SF_to_ODS | FOLocation | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_FOLocation | ODS_SFC_0000_F001_F_W_FOLocation |
| F012_SAP_User_SAP_SF_to_ODS | User | DATA_OCEAN_DOMAIN_HR | prj-data-dm-hr-prod | STG_SFC_0000_0000_F001_F_W_User | ODS_SFC_0000_F001_F_W_User |


Talend flow explanation


Main Flow



    • Set up the meta_run_id and the filename of the output file
    • Call the REST API job (J001*), which invokes the SuccessFactors REST API to extract the data (full extract) and saves the file to GCS
    • Call the standard job to load the files from GCS into the ODS tables
    • If everything is OK, update the log.
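The steps above can be sketched as the following orchestration. This is a minimal illustration only: the function names are hypothetical stand-ins for the real DataOcean standard jobs, and the actual extract/load logic lives in Talend.

```python
import uuid

def extract_to_gcs(entity: str, filename: str) -> None:
    """J001* stand-in: would invoke the SuccessFactors REST API and write to GCS."""
    pass

def load_gcs_to_ods(filename: str) -> None:
    """Standard-job stand-in: would load the GCS file into the ODS table."""
    pass

def run_main_flow(entity: str) -> dict:
    # 1. Set up meta_run_id and the filename of the output file
    meta_run_id = str(uuid.uuid4())
    filename = f"{entity}_{meta_run_id}.csv"
    # 2. Full extract from SuccessFactors, saved to GCS
    extract_to_gcs(entity, filename)
    # 3. Load the file from GCS into the ODS table
    load_gcs_to_ods(filename)
    # 4. Everything OK: return the entry to record in the log
    return {"meta_run_id": meta_run_id, "file": filename, "status": "OK"}
```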

We follow the DataOcean standard jobs, except for the J001* (SuccessFactors REST API) job, which is explained below.

The J001* job invokes the SuccessFactors REST API (passing the username/password) to extract the data. Below are the main SuccessFactors parameters used during extraction; they are common to all the SuccessFactors jobs.

l_VAR_SAP_SF_ENV : Contains DEV or PROD; used to mask confidential information. If DEV, the confidential columns are populated with "" in the BQ tables; if PROD, the target BQ tables hold the source values.
l_VAR_SAP_SF_LIMIT : The number of records fetched per iteration.
l_VAR_SAP_SF_URL : The SuccessFactors REST API URL.
l_VAR_SAP_SF_PASSWORD : The password required to invoke the API.
l_VAR_SAP_SF_USERNAME : The username required to invoke the API.
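As an illustration, here is how these parameters could drive the extraction loop. This is a sketch, not the actual Talend code: the use of OData-style $top/$skip paging is an assumption about how l_VAR_SAP_SF_LIMIT is applied, and the helper names are made up.

```python
def build_page_url(base_url: str, entity: str, limit: int, skip: int) -> str:
    # l_VAR_SAP_SF_URL supplies base_url; l_VAR_SAP_SF_LIMIT supplies limit.
    # $top/$skip paging is an assumption for illustration.
    return f"{base_url}/{entity}?$format=json&$top={limit}&$skip={skip}"

def mask_confidential(env: str, value: str) -> str:
    # l_VAR_SAP_SF_ENV: in DEV, confidential columns are written as "".
    return "" if env == "DEV" else value
```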


Possible causes of Talend job failure

As this is a full-extract job with no transformations (a 1:1 mapping), a failure is most likely caused by one of the following:

  • STG/ODS table not present
  • Credentials issue (i.e. an incorrect username/password for the SuccessFactors REST API; reach out to tania.pereira@solvay.com, who can help with the credentials)
  • REST API error: the API is not accessible
  • Talend Remote Server issue

Once the issue is fixed, the job can be rerun from the start: it recreates the file and reloads the data, with no loss of data.
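The reason a rerun is safe can be shown with a small sketch of the truncate-and-load pattern the full extract implies (the function is illustrative, not the actual load job): each run replaces the target entirely, so no partial state from a failed run survives.

```python
def full_reload(target_table: list, extracted_rows: list) -> list:
    # Discard whatever a previous (possibly failed) run left behind,
    # then load the freshly extracted file in full.
    target_table.clear()
    target_table.extend(extracted_rows)
    return target_table
```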

******************************************************************************************************************************************************************************************

Data Validation/Troubleshoot queries

Data Validation:

<Provide the SQL queries to validate the data or the record count in the BQ target tables>

Logging:

<Details about the log tables in Big query>

Troubleshoot steps:

<Provide the steps to debug the ETL flow in case of failure. For example, if an ETL job fails, how should it be retriggered? Can the job be rerun directly, or must data from the previous execution be deleted before rerunning? Should any change be made in the context table for date executions?>


Failed Records

As part of the HR DQ project, we have developed the Talend jobs below, which run data quality validation queries and write the results to Google Drive files.

| Talend Job Name | Talend Project | GCP Project |
| --- | --- | --- |
| F001_BQ_to_Gsheet_Drive_Selfservice 0.1 | DATA_OCEAN_DOMAIN_HR | prj-data-dq-selfservice-dev |
| J001_BQ_to_Gsheet_Drive_Selfservice 1.1 | DATA_OCEAN_DOMAIN_HR | prj-data-dq-selfservice-dev |



Source

This job uses the data quality queries stored in the prj-data-dq-selfservice-dev.DM.Dataplex_quality table.
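As a sketch of how the job could consume that table, the snippet below filters the enabled rules and collects their SQL text. The column names ("query", "active") and the sample rows are assumptions for illustration; the real schema lives in the Dataplex_quality table.

```python
def select_active_rules(rules: list) -> list:
    # "query" and "active" are assumed column names for illustration.
    return [r["query"] for r in rules if r.get("active", True)]

sample_rules = [
    {"rule_id": "R1", "query": "SELECT COUNT(*) FROM EmpJob WHERE userId IS NULL", "active": True},
    {"rule_id": "R2", "query": "SELECT 1", "active": False},
]
```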