Description
Wave is an external tool that calculates the cost of savings on each project. Talend retrieves the data through this server's API; the API output is an Excel file.
A connection must be created and an access window configured so that Talend can read the data. If Talend connects outside the allowed window, it will not get the file.
The password must also be reset every 6 months, since it expires after that period.
Wave Connection
How to create connection
1. Log in at https://solvay.mckinseywave.com/login/auth
2. Go to Data Connection
3. Click on ADD DATA CONNECTION
4. Enter the name of the connection and the date/time window during which access is allowed.
If no schedule is set, you must log in on the web in order to allow the access.
Times are in the GMT+1 time zone.
5. It will generate URL endpoints, a username and a password. Click on generate password and save it.
Please note that each password expires every 6 months, and the interface must be accessed at least once every 17 days; otherwise, the connection will be deactivated.
How to change the access time
1. Log in at https://solvay.mckinseywave.com/login/auth
2. Go to Data Connections, as when creating a connection
3. Right-click the connection you want to change and select Update
4. Change the date/time and click UPDATE
5. If the connector has been deactivated, mark it active here and keep logging in to Wave to avoid further deactivation.
How to reset password
1. Log in at https://solvay.mckinseywave.com/login/auth
2. Go to Data Connections, as when creating a connection
3. Right-click the connection you want to change and select Connect
4. Click on reset password
Talend
The Talend project is in DATA_OCEAN_DOMAIN_INDUSTRIAL.
The project requires 4 sheets from the file output by the API. To improve loading performance and reduce data size, the impactStar and sustainabilityImpactStar sheets are each normalized into 2 tables.
Detail job
- J001_WAVE_extraiction_to_GCS
- tJava to check l_VAR_WAVE_reload: if true, use the existing file wave.xlsx; if false, get the file from the Wave API
- tHttpRequest to call the Wave API when l_VAR_WAVE_reload = false
- if the return code = 200, the API call generates the file wave.xlsx
- if the return code != 200, an email is sent to inform DataOps
- tSetGlobalVar to set the current date as the snapshot value of this load
- tFileInputExcel4 to read the projectStar sheet of wave.xlsx, then transform it with tMap and save the file as "WAV_IND_0000_0000_F001_[yyyyMMddHHmmss]_0000_F_D_project_star.csv"
- tFileInputExcel1 to read the activityStar sheet of wave.xlsx, then transform it with tMap and save the file as "WAV_IND_0000_0000_F002_[yyyyMMddHHmmss]_0000_F_D_activity_star.csv"
- tFileInputExcel2 to read the sustainabilityImpactStar sheet of wave.xlsx, then split it into 2 files
- month: unpivot by turning the period columns into rows, remove the lines that have no data (blank or 0), and save to "WAV_IND_0000_0000_F003_[yyyyMMddHHmmss]_0000_F_D_sust_impact_star_month.csv"
- annual: keep all columns except Purpose and period, aggregate the rows to remove duplicates, and save the file as "WAV_IND_0000_0000_F003_[yyyyMMddHHmmss]_0000_F_D_sust_impact_star_annaul.csv"
- tFileInputExcel3 to read the impactStar sheet of wave.xlsx, then split it into 2 files
- month: unpivot by turning the period columns into rows, remove the lines that have no data (blank or 0), and save to "WAV_IND_0000_0000_F004_[yyyyMMddHHmmss]_0000_F_D_impact_star_month.csv"
- annual: keep all columns except Purpose and period, aggregate the rows to remove duplicates, and save the file as "WAV_IND_0000_0000_F004_[yyyyMMddHHmmss]_0000_F_D_impact_star_annaul.csv"
- Upload all 6 files to GCS
- Delete the 6 files from the Talend server
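The month-file unpivot and the output file naming described above can be sketched in plain Python. The column names (impactStar, the period headers) and the sample rows are illustrative assumptions; the real layout comes from the Wave Excel export.

```python
# Sketch of the "month" unpivot step: period columns become rows, and rows
# with no data (blank or 0) are removed, as in the Talend job.
from datetime import datetime

def unpivot_month(rows, id_cols):
    """Turn period columns into rows; drop values that are blank or 0."""
    out = []
    for row in rows:
        for col, value in row.items():
            if col in id_cols:
                continue
            if value in (None, "", 0):  # line has no data: skip it
                continue
            record = {c: row[c] for c in id_cols}
            record["period"] = col
            record["value"] = value
            out.append(record)
    return out

def output_filename(feed, suffix, ts):
    """File name pattern used when saving the CSV before the GCS upload."""
    return f"WAV_IND_0000_0000_{feed}_{ts:%Y%m%d%H%M%S}_0000_{suffix}.csv"

rows = [
    {"impactStar": "I1", "2024-01": 10, "2024-02": None},
    {"impactStar": "I2", "2024-01": 0, "2024-02": 5},
]
monthly = unpivot_month(rows, id_cols={"impactStar"})
name = output_filename("F004", "F_D_impact_star_month",
                       datetime(2024, 1, 2, 3, 4, 5))
```

In the real job this logic is implemented with tMap components; the sketch only mirrors the row-level behaviour.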
Flow job
- F001_WAVE_TO_BQ
- Set up meta_run_id and the filename of the output file
- Call the detail job and pass parameters
- Call the standard job to upload the projectStar files from GCS to ODS
- Call the standard job to upload the activityStar files from GCS to ODS
- Call the standard job to upload the sustainabilityImpactStar annual files from GCS to ODS
- Call the standard job to upload the sustainabilityImpactStar month files from GCS to ODS
- Call the standard job to upload the impactStar annual files from GCS to ODS
- Call the standard job to upload the impactStar month files from GCS to ODS
- Update the log.
- F011_DIM_wave_project_star_SCD1
Select only the last load in ODS and update DIM_wave_project_star
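This "last load only" selection can be sketched with a small helper, assuming each ODS row carries the snapshot field described in this document (field names here are illustrative):

```python
# Sketch of the F011 step: keep only the rows from the most recent load
# (snapshot) before updating DIM_wave_project_star.
def latest_snapshot_rows(ods_rows):
    """Return only the rows belonging to the newest snapshot."""
    if not ods_rows:
        return []
    newest = max(r["snapshot"] for r in ods_rows)
    return [r for r in ods_rows if r["snapshot"] == newest]

ods = [
    {"snapshot": "2024-01-01", "projectStar": "P1"},
    {"snapshot": "2024-01-02", "projectStar": "P1"},
    {"snapshot": "2024-01-02", "projectStar": "P2"},
]
latest = latest_snapshot_rows(ods)
```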
Access rights
Access must be requested from the WAVE admin: ednamaria.coan@solvay.com
Source
https://solvay.mckinseywave.com/login/auth → prj-data-dm-industrial-[env]
- The API URL is configured on the Wave connection, and the URL is kept in the parameter l_VAR_WAVE_url_api
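The API pull and the 200 / non-200 branching used by the detail job can be sketched as follows; the URL is taken from l_VAR_WAVE_url_api, and the notification side is only a placeholder (the real job sends an email to DataOps):

```python
# Sketch of the tHttpRequest step: download the Wave export and branch on the
# HTTP status code. Names and the alerting action are placeholders.
import urllib.error
import urllib.request

def next_action(status_code):
    """Mirror the job's branching: 200 saves wave.xlsx, else alert DataOps."""
    return "save_wave_xlsx" if status_code == 200 else "email_dataops"

def fetch_wave_export(url, out_path="wave.xlsx"):
    """Download the export; return the action that was taken."""
    try:
        with urllib.request.urlopen(url) as resp:
            action = next_action(resp.status)
            if action == "save_wave_xlsx":
                with open(out_path, "wb") as f:
                    f.write(resp.read())
            return action
    except urllib.error.URLError:
        return next_action(None)
```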
Format
- Tabular format (Excel, .xlsx)
Destination
Data Ocean
- Bucket = cs-ew1-prj-data-dm-industrial-[env]-staging
- DataOean GCP = prj-data-dm-industrial-[env]
- STG Table name = prj-data-dm-industrial-[env].STG.
- STG_WAV_0000_0000_F001_F_D_project_star
- STG_WAV_0000_0000_F002_F_D_activity_star
- STG_WAV_0000_0000_F003_F_D_sust_impact_star_annual
- STG_WAV_0000_0000_F003_F_D_sust_impact_star_month
- STG_WAV_0000_0000_F004_F_D_impact_star_annual
- STG_WAV_0000_0000_F004_F_D_impact_star_month
- ODS Table name = prj-data-dm-industrial-[env].ODS
- ODS_WAV_0000_F001_F_D_project_star
- ODS_WAV_0000_F002_F_D_activity_star
- ODS_WAV_0000_F003_F_D_sust_impact_star_annual
- ODS_WAV_0000_F003_F_D_sust_impact_star_month
- ODS_WAV_0000_F004_F_D_impact_star_annual
- ODS_WAV_0000_F004_F_D_impact_star_month
- DM Tables name = prj-data-dm-industrial-[env].DM
- DIM_wave_project_star
- DIM_wave_activity_star
Product
- Bucket = cs-ew1-prj-data-industrial-dash-[env]-staging
- Product GCP = prj-data-industrial-dash-[env]
- Dataset = prj-data-industrial-dash-dev.DataOcean → DPL
- V_ODS_wave_activity_star_fl
- V_ODS_wave_impact_star_annual_fl
- V_ODS_wave_impact_star_month_fl
- V_ODS_wave_project_star_fl
- V_wave_activity_star
- V_wave_impact_star_annual
- V_wave_impact_star_month
- V_wave_project_star
- V_wave_sust_impact_star_annual
- V_wave_sust_impact_star_month
Format
- columnar format
Sizing
ProjectStar: around 7,000 records
Sust ImpactStar annual: around 9,000 records
Sust ImpactStar month: around 54,000 records
ImpactStar annual: around 85,000 records
ImpactStar month: around 1,226,000 records
Assessment
The data should be the same as the output file, and the query below should show the current date as the most recent snapshot:
SELECT DISTINCT snapshot FROM `prj-data-dm-industrial-dev.ODS.ODS_WAV_0000_F004_F_D_impact_star_annual`
ORDER BY snapshot DESC
The current snapshot date should be available.
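This check can be automated with a small helper; here `snapshots` stands in for the distinct snapshot values returned by the query above (the BigQuery call itself is omitted):

```python
# Sketch of the assessment check: the newest snapshot returned by the
# query should match the load date.
from datetime import date

def snapshot_is_current(snapshots, today):
    """True only when the most recent snapshot equals the given date."""
    return bool(snapshots) and max(snapshots) == today

snaps = [date(2024, 1, 1), date(2024, 1, 2)]
ok = snapshot_is_current(snaps, date(2024, 1, 2))
```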
Loading
1.1 Incremental Load
Source to ODS
N/A: always a full load from the file
ODS to DM
l_VAR_dim_wave_project_star_reload = incremental
1.2 Full load
Set l_VAR_WAVE_reload = false
PL_INDUS_WAVE will load the output Excel file from the Wave API and save the current time in the snapshot field
1.3. Reloading data
Source to ODS
Set l_VAR_WAVE_reload = true (used to load historical data from a manual file, i.e. not fetched from the Wave API)
Key users will provide the historical file; load it manually by renaming it to wave.xlsx in the folder \\Acew1dtlndeng02\data\DEV\DATA_OCEAN_DOMAIN_INDUSTRIAL\InOut\WAVE. The file must have the same column sequence as the normal extraction.
ODS to DM
l_VAR_dim_wave_project_star_reload = QUALIFY ROW_NUMBER() OVER (PARTITION BY snapshot, projectStar ORDER BY meta_ods_insert_date DESC) = 1
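Since the manual file must keep the same column sequence as the normal extraction, a quick header check before renaming it to wave.xlsx can avoid a bad reload. The EXPECTED_COLUMNS list below is a placeholder, not the real Wave column list:

```python
# Sanity check for the manual reload path: the header of the provided file
# must match the expected column sequence exactly, in the same order.
EXPECTED_COLUMNS = ["projectStar", "period", "value"]  # placeholder only

def same_column_sequence(header, expected=EXPECTED_COLUMNS):
    """True only if the header matches the expected columns in order."""
    return list(header) == list(expected)

ok = same_column_sequence(["projectStar", "period", "value"])
bad = same_column_sequence(["period", "projectStar", "value"])
```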
1.4 Plan to schedule
Weekdays at 04:01 AM CET
1.5 Timing
The expected average loading time is around 20 minutes
Criticality
High/Medium/Low
Logging
On project prj-data-dm-industrial-[environment]
SELECT job.job_name, job.meta_start_date, job.meta_execution_id, logs.meta_run_id, logs.meta_source_system, logs.meta_step, logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
FROM STG.log_tables logs
JOIN STG.run_jobs job ON logs.meta_run_id = job.meta_run_id
WHERE logs.meta_run_id IN (SELECT meta_run_id FROM STG.run_jobs ORDER BY meta_start_date DESC LIMIT 100)
AND job_name LIKE '%WAVE%'
AND meta_start_date > DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY job.meta_start_date DESC










