Technical documentation for the Bollate raw data load and integration.

Summary

Sum-up


Macro Schema


All schemas can be found on the following GDrive link:

#133 Battery Data management platform >> 06 - Documentation >> Architecture >> Battery Data Scheme and Diagrams

Data sources

Raw data files have different structures and are extracted from different servers:


Cyclers | Description | Data Status | Flow Status
Maccor | CSV Raw files coming from Maccor machine. The source server is located in Italy | |
Biologic BCS | CSV Raw files coming from Biologic BCS machine. The source server is located in Italy | |
Biologic VMP | CSV Raw files coming from Biologic VMP machine. The source server is located in Italy | |
Arbin | Data coming from SQL Server database? for Arbin machine | |


Load and Integration

Schema

N | Description
A | A Talend job fetches all files from the sources, brings them to the Talend server, and then loads the raw data into BigQuery.
a1 | All new raw data is loaded into a "delta" table. A SQL script then consolidates the new data with the existing data. Each cycler has its own tables.
B | Once the data is consolidated, it is moved to another dataset, where it is transformed, enriched and integrated into the same table as the data from the other Bollate cyclers.
b1 | Data is transformed, enriched and integrated by passing through several tables. One or more SQL scripts perform this task.
C | Data coming from ELN is integrated at this point. Check the ELN page for more details.
D | Data is pushed to the enriched_data dataset, where users can access it.
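
The delta consolidation in step a1 could be expressed as a BigQuery MERGE statement. The sketch below is only an illustration under stated assumptions: the actual consolidation scripts are not reproduced on this page and may work differently. The table names, key columns and value columns are hypothetical placeholders, and the helper `build_merge_sql` is not part of the project.

```python
from typing import Sequence


def build_merge_sql(
    target: str,
    delta: str,
    key_cols: Sequence[str],
    value_cols: Sequence[str],
) -> str:
    """Build a BigQuery MERGE statement that upserts `delta` rows into `target`.

    Rows are matched on `key_cols`: matching rows get their `value_cols`
    overwritten, and rows that exist only in the delta table are inserted.
    """
    on_clause = " AND ".join(f"T.{c} = D.{c}" for c in key_cols)
    set_clause = ", ".join(f"{c} = D.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"D.{c}" for c in all_cols)
    return (
        f"MERGE `{target}` T\n"
        f"USING `{delta}` D\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN\n"
        f"  UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN\n"
        f"  INSERT ({insert_cols})\n"
        f"  VALUES ({insert_vals})"
    )


if __name__ == "__main__":
    # Hypothetical dataset, table and column names, for illustration only.
    print(
        build_merge_sql(
            target="raw_maccor.cycler_data",
            delta="raw_maccor.cycler_data_delta",
            key_cols=["file_name", "record_id"],
            value_cols=["voltage_v", "current_a"],
        )
    )
```

Because every cycler has its own tables, one such statement per cycler would be needed, each with its own key and value columns.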


All SQL scripts can be found in:

  • Project Saved Queries
  • Git?
  • GCP Cloud Storage

Integration and Computing

Data mapping that gives the files from all cyclers the same table structure:

Raw Data Pairing
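
As a rough illustration of what such a mapping does, the sketch below renames per-cycler columns to one common structure and converts units where needed. Everything in it is an assumption made for the example: the raw column names, the common column names and the unit factors are hypothetical, and the real mapping is the one defined in Raw Data Pairing.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical mapping: raw column name -> (common column name, unit factor).
# The real column names and units are defined in the Raw Data Pairing sheet.
COLUMN_MAPS: Dict[str, Dict[str, Tuple[str, float]]] = {
    "maccor": {
        "Cyc#": ("cycle_index", 1.0),
        "Test (Sec)": ("test_time_s", 1.0),
        "Amps": ("current_a", 1.0),
        "Volts": ("voltage_v", 1.0),
    },
    "biologic": {
        "cycle number": ("cycle_index", 1.0),
        "time/s": ("test_time_s", 1.0),
        "I/mA": ("current_a", 0.001),  # milliamps to amps
        "Ecell/V": ("voltage_v", 1.0),
    },
}


def to_common_row(cycler: str, raw_row: Mapping[str, str]) -> Dict[str, object]:
    """Convert one raw CSV row from `cycler` into the common structure."""
    mapping = COLUMN_MAPS[cycler]
    row: Dict[str, object] = {"cycler": cycler}
    for raw_name, (common_name, factor) in mapping.items():
        # CSV values arrive as text; convert to float and apply the unit factor.
        row[common_name] = float(raw_row[raw_name]) * factor
    return row


if __name__ == "__main__":
    sample = {"cycle number": "3", "time/s": "12.5", "I/mA": "500", "Ecell/V": "3.7"}
    print(to_common_row("biologic", sample))
```

Keeping the mapping as data rather than code means a new cycler type only needs a new dictionary entry, which mirrors how a pairing sheet is usually maintained.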

Talend Project

RnI_Battery_Bollate

This project holds all the jobs needed to load the Battery Bollate raw data.

Talend jobs