- Created by Menghetti, Mattéo, last modified on Jan 21, 2024
You are viewing an old version of this page. View the current version.
Compare with Current View Page History
« Previous Version 11 Next »
4.1 - Talend Integration
Source data integration with Talend ETL tool
Xtract Server (WBP):
- Talend connects to the Xtract server and trigger extraction job using HTTP calls.
- Xtract generates results as CSV files stored in memory.
- Talend retrieves files, apply some transformation and then loads them into Google Cloud Storage.
Google Sheets (Hedges, wap solid fuels, CO2):
- Talend integrates with Google Sheets, where solid fuel wap, CO2 emissions, and hedging information are stored.
- It retrieves this data from Google Sheets.
- Similar to the other sources, this data is also loaded into Google Cloud Storage as files.
Data Transformation and Loading to Google BigQuery:
- Once data from all four sources (FTP server, PostgreSQL database, Oracle database, Google Sheets) is available in Google Cloud Storage as files,
Talend proceeds with data transformation and loading. - Talend performs data transformations as needed, including cleansing, mapping, and structuring the data for consistency.
- The transformed data is loaded into various stages, operational data stores (ODS), and data mart tables within Google BigQuery.
- These tables are organized to facilitate efficient querying and reporting for energy optimization purposes.
By utilizing Talend for data extraction, transformation, and loading (ETL), the web app ensures that data from diverse sources is collected, processed, and structured for analysis and reporting within Google BigQuery, enabling users to make informed decisions based on up-to-date and accurate data.
4.2 - Source Data Extraction
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
We will analyze in detail on job but the logic in the other is simimar |
| |
|
|
| Main jobs for loading Staging |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
|
|
| Main jobs for loading the ODS |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
| ![]() |
| Main jobs for building DM |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
|
|
| Prepare DM file | F400_PURCHASING_STEP | --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
This job builds the first pieces of data necessary to build the FUNCTION cost target |
|
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
The job reads in sequence a list of ODS tables and apply the requested mappings
This Talend job is the second step of building Dimension and Fact table. All data in FACT tables are also assigned to ECO or SCO. |
|
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
The job reads in sequence a list of ODS tables and apply the requested mappings
|
|
4.4 - Load to DM (calculations and transformations)
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
This jobs reads the files stored in remote engine and computed in the previous and use them to populate dimensions and fact tables. |
|
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
Metadata Calculation:
Slowly Changing Dimension (SCD2) Data Loading:
Logging and Cleanup:
|
|
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
Metadata Calculation:
Data Enrichment and Loading:
Logging and Cleanup:
|
|
| Main jobs for source extraction |
| --to the top ↑-- |
|---|---|---|
| Job description by steps | Job design | |
Metadata Calculation:
Data Enrichment and Loading:
Logging and Cleanup:
|
|
4.5 - Scheduling and Automation
TMC - PL_QV_TO_TABLEAU_PURCHASING - Daily at 5 AM.
4.6 - Data Validation
Responsible & contact points:
- Alessandro Mainardi - Project Owner
- Simon Bourguignon - Delivery Manager
- Alba Carrero/ Gaetan Frenoy - Product Owner
- Rui Ferraz - Project Manager
- No labels











