5.1 - Talend Integration

Source data integration with the Talend ETL tool

  • FTP Server (Meteologica):

    • Talend connects to the FTP server where files containing current and future electricity prices for Spain, Italy, France, and Germany are located.
    • Talend retrieves these files from the FTP server.
    • The retrieved files are then loaded into Google Cloud Storage.
  • Postgres Database (Vendohm):

    • Talend establishes a direct connection with the Vendohm PostgreSQL database that stores sensor values identified by specific curve IDs.
    • It extracts relevant data from the PostgreSQL database.
    • This extracted data is loaded into Google Cloud Storage as files.
  • Google Sheets (Hedges, Marginal cost, CO2):

    • Talend integrates with Google Sheets, where marginal costs, CO2 costs, and hedging information are stored.
    • It retrieves this data from Google Sheets.
    • As with the other sources, this data is loaded into Google Cloud Storage as files.
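
All three sources follow the same pattern: extract, serialize to a file, land the file in Google Cloud Storage. The actual jobs are built in Talend, so the following Python is only an illustrative sketch of that pattern, not the project's implementation. The object naming scheme (`source/YYYY/MM/DD/filename`), the function names, and the `uploader` callable are all assumptions made for the example.

```python
import csv
import io
from datetime import date


def object_name(source: str, filename: str, run_date: date) -> str:
    """Build a per-source, per-day object name (hypothetical naming scheme)."""
    return f"{source}/{run_date:%Y/%m/%d}/{filename}"


def rows_to_csv(rows, fieldnames) -> bytes:
    """Serialize extracted rows (e.g. from PostgreSQL or Google Sheets) to CSV bytes."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def land(uploader, source: str, filename: str, payload: bytes, run_date: date) -> str:
    """Hand the payload to `uploader(name, payload)` and return the object name.

    `uploader` is a stand-in for the real storage client. With the
    google-cloud-storage library it could wrap
    bucket.blob(name).upload_from_string(payload); that wiring is assumed here.
    """
    name = object_name(source, filename, run_date)
    uploader(name, payload)
    return name
```

Passing the uploader in as a callable keeps the sketch runnable without cloud credentials. For a dry run, a plain dictionary works as the target: `land(store.__setitem__, "vendohm", "curves.csv", payload, date.today())`.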

Data Transformation and Loading to Google BigQuery:

  • Once data from all three sources (FTP server, PostgreSQL database, Google Sheets) is available in Google Cloud Storage as files, Talend proceeds with data transformation and loading.
  • Talend performs data transformations as needed, including cleansing, mapping, and structuring the data for consistency.
  • The transformed data is loaded into staging, operational data store (ODS), and data mart tables in Google BigQuery.
  • These tables are organized to facilitate efficient querying and reporting for energy optimization purposes.
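
The cleansing, mapping, and structuring steps can be pictured with a small sketch. This page does not specify the file layouts or target schemas, so the column names, the decimal-comma handling, the timestamp format, and the UTC assumption below are all hypothetical. The real transformations are defined in the Talend jobs.

```python
from datetime import datetime, timezone

# Hypothetical mapping from raw file columns to target table columns.
COLUMN_MAP = {
    "Country": "country_code",
    "Timestamp": "delivery_ts",
    "Price": "price_eur_mwh",
}


def transform(raw_rows):
    """Cleanse, map, and structure raw price rows (illustrative only)."""
    clean = []
    for raw in raw_rows:
        # Mapping: rename known columns and trim surrounding whitespace.
        row = {COLUMN_MAP[k]: v.strip() for k, v in raw.items() if k in COLUMN_MAP}
        # Cleansing: skip rows that carry no price.
        if not row.get("price_eur_mwh"):
            continue
        # Structuring: normalize case, number format, and timestamp format.
        row["country_code"] = row["country_code"].upper()
        row["price_eur_mwh"] = float(row["price_eur_mwh"].replace(",", "."))
        parsed = datetime.strptime(row["delivery_ts"], "%Y-%m-%d %H:%M")
        row["delivery_ts"] = parsed.replace(tzinfo=timezone.utc).isoformat()
        clean.append(row)
    return clean
```

Rows shaped like this could then be loaded into a staging table and promoted to the ODS and data mart layers. The promotion logic is not described on this page and is left out of the sketch.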

By using Talend for extraction, transformation, and loading (ETL), the web app collects data from all three sources, processes it, and structures it in Google BigQuery for analysis and reporting. Users can then base their decisions on current, consistent data.

5.2 - Source Data Extraction


5.3 - Data Transformation


5.4 - Data Loading to Google BigQuery


5.5 - Error Handling


5.6 - Scheduling and Automation


5.7 - Performance Optimization


5.8 - Data Validation


5.9 - Dependencies and Order of Operations


5.10 - Versioning and Change Management


Responsible & contact points:

  • Alessandro Mainardi - Project Owner
  • Simon Bourguignon - Delivery Manager
  • Alba Carrero / Gaetan Frenoy - Product Owner
  • Rui Ferraz - Project Manager

