This page presents the data development documentation for Battery Conductivity for all sites.

This page was copied from Battery and needs to be updated.


The standard ETL areas and processes are described on LB Data Dev Architecture - General.

Summary

Data flow diagram

All the schemas are available on Google Drive for further editing:

Data Ingestion

Data Sources

ELN

Simplified Flow

ELN API → <Extraction> →  Local JSON File

Spreadsheets

Spreadsheet | Description | Data Status*
ELN Conductivity Measurement | This is a standard ELN spreadsheet for all sites | HOT
ELN Battery Experiment Properties | This is a standard ELN spreadsheet for all sites | HOT

*Data Status: HOT means the data is still being updated at the source and should be loaded regularly. COLD means the data no longer changes at the source and should be loaded just once.
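
The HOT/COLD rule can be sketched as a small load-planning check. This is a minimal illustration only, not the actual Talend logic: the `Source` record and its `already_loaded` flag are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str             # spreadsheet or file source name
    status: str           # "HOT" or "COLD"
    already_loaded: bool  # hypothetical flag: was this source loaded before?


def should_load(source: Source) -> bool:
    """Return True if the source should be included in the next load."""
    status = source.status.upper()
    if status == "HOT":
        # HOT: data still changes at the source, so reload on every run.
        return True
    if status == "COLD":
        # COLD: data no longer changes, so load it only once.
        return not source.already_loaded
    raise ValueError(f"Unknown data status: {source.status!r}")
```

Under this sketch, a HOT spreadsheet such as ELN Conductivity Measurement is always picked up, while a COLD source is skipped once it has been loaded.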

The list of the spreadsheets/tables extracted from JSON files coming from ELN can be found in the Data Mapping of the next section.

Related documents

Document Name | Link
Battery - ELN Data Model |

Battery - ELN Template Documentation |

Instruments

Simplified Flow

Lab server → Google Shared Drive → <Copy> → Local XML/TXT file

Files

Most of the instrument files are added manually to a Google Drive shared folder.

Site | Description | Status*
Seoul | | HOT
Brussels | | HOT
Bollate | | HOT
Aubervilliers | | HOT

*Data Status: HOT means the data is still being updated at the source and should be loaded regularly. COLD means the data no longer changes at the source and should be loaded just once.
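
The <Copy> step of the instrument flow could look roughly like the sketch below. It assumes the shared folder is reachable as an ordinary directory (for example through a sync client) and that only files not yet present locally need copying. Both are assumptions made for illustration, and `copy_new_files` is a hypothetical helper, not part of the existing jobs.

```python
import shutil
from pathlib import Path


def copy_new_files(shared_dir, local_dir, suffixes=(".xml", ".txt")):
    """Copy XML/TXT files that are not yet present in local_dir.

    Returns the names of the files copied during this run.
    """
    shared, local = Path(shared_dir), Path(local_dir)
    local.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(shared.iterdir()):
        # Skip sub-folders and files of other types.
        if not source.is_file() or source.suffix.lower() not in suffixes:
            continue
        target = local / source.name
        if not target.exists():
            shutil.copy2(source, target)  # copy2 also keeps file timestamps
            copied.append(source.name)
    return copied
```

Files that are modified in place on the shared folder would not be re-copied by this sketch; a real job would need a rule for that case.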

Data Preparation or Parsing

Simplified Flow

Local XML/TXT file → <Load> → Cloud Storage/Staging BigQuery

Data Mapping

The spreadsheet below presents all data transformations between the raw (extracted) files and a BigQuery delta table. Some files are unstructured or semi-structured. This step aims to structure the files in the target table format and to check that the column types (schema) conform to what is expected.
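
As an illustration of the schema check described here, the sketch below validates one parsed row against an expected column-to-type mapping. The column names in `EXPECTED_SCHEMA` are hypothetical placeholders; the real target schema is the one defined in the data mapping spreadsheet.

```python
from datetime import datetime

# Hypothetical target schema: column name -> expected Python type.
EXPECTED_SCHEMA = {
    "experiment_id": str,
    "temperature_c": float,
    "conductivity_s_per_cm": float,
    "measured_at": datetime,
}


def check_row(row: dict, schema: dict) -> list:
    """Return a list of human-readable schema problems for one parsed row."""
    problems = []
    for column, expected_type in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif row[column] is not None and not isinstance(row[column], expected_type):
            actual = type(row[column]).__name__
            problems.append(f"{column}: expected {expected_type.__name__}, got {actual}")
    # Columns that the target table does not know about.
    for column in row:
        if column not in schema:
            problems.append(f"unexpected column: {column}")
    return problems
```

An empty list means the row conforms. Note that the check is strict: an integer such as `25` in a `float` column is reported, so values would need to be cast while parsing.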

Data Integration or Computing

The data integration phase for batteries follows the standard approach described on LB Data Dev Architecture - General.

Simplified Flow

Staging BigQuery → <Transform> → ODS BigQuery 

Data Mapping

The spreadsheet below presents all data transformations between the tables on Staging and ODS. This step aims to add some calculations and intelligence to the data. Not all tables need to pass through this step.

Raw Data Pairing (all sites)

This spreadsheet shows how to align all cyclers to a single common table schema.
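
The alignment idea can be sketched as a per-cycler column mapping. The cycler names and raw column names below are invented for illustration; the real mappings are the ones in the spreadsheet.

```python
# Hypothetical raw column names per cycler, mapped to the common schema.
COLUMN_MAPS = {
    "cycler_a": {"Time(s)": "time_s", "Voltage(V)": "voltage_v", "Current(A)": "current_a"},
    "cycler_b": {"t": "time_s", "U": "voltage_v", "I": "current_a"},
}

# Hypothetical common schema shared by all cyclers.
COMMON_COLUMNS = ["source", "time_s", "voltage_v", "current_a"]


def to_common_schema(cycler: str, row: dict) -> dict:
    """Rename one raw row to the common schema.

    Columns missing from the raw row are set to None; raw columns without
    a mapping are dropped.
    """
    mapping = COLUMN_MAPS[cycler]
    aligned = {column: None for column in COMMON_COLUMNS}
    aligned["source"] = cycler
    for raw_name, value in row.items():
        common_name = mapping.get(raw_name)
        if common_name is not None:
            aligned[common_name] = value
    return aligned
```

Keeping the mapping as data rather than code means a new cycler only needs a new entry in `COLUMN_MAPS`.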

Data Presentation (DW/DM)

Simplified Flow

ODS BigQuery → <Transform> → DW BigQuery → <Expose> → DM_Battery_Electrochemistry BigQuery  

Data Mapping

Data Model

The following data model shows the tables in the DW dataset and the relations between them:


The DM dataset (DM_Conductivity) consists of views on top of the DW tables above. As defined in the convention, the abbreviation "conduct" is not needed. No Talend jobs are needed either.

Orchestrating Jobs

All the jobs are run in sequence under the following project and job names on TAC/Talend Cloud:

Project | Job/Flow
RnI_ACN_Battery | F010_RnI_ACN_Battery_ELN_IDBS_Orch_Flow
RnI_ACN_Battery | F020_RnI_ACN_Battery_ELN_Integration_Orch_Flow
RnI_ACN_Battery | F011_RnI_ACN_Battery_Instruments_Orch_Flow
RnI_ACN_Battery | F021_RnI_ACN_Battery_Instr_Integration_Orch_Flow

For scheduling details, check the Operational documentation.
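
Running the flows in sequence can be pictured with the sketch below. It assumes the execution order is the order in which the flows are listed on this page and that a failed flow stops the chain; neither is confirmed here. `run_job` is a hypothetical callable standing in for whatever actually triggers a flow on TAC/Talend Cloud.

```python
from typing import Callable, List

# Job/flow names as listed on this page, in the listed order (assumed order).
JOBS = [
    "F010_RnI_ACN_Battery_ELN_IDBS_Orch_Flow",
    "F020_RnI_ACN_Battery_ELN_Integration_Orch_Flow",
    "F011_RnI_ACN_Battery_Instruments_Orch_Flow",
    "F021_RnI_ACN_Battery_Instr_Integration_Orch_Flow",
]


def run_in_sequence(jobs: List[str], run_job: Callable[[str], bool]) -> List[str]:
    """Run jobs one after another and stop at the first failure.

    run_job returns True on success. The result lists the jobs that completed.
    """
    completed = []
    for job in jobs:
        if not run_job(job):
            break  # assumed behavior: later jobs depend on earlier ones
        completed.append(job)
    return completed
```

For example, `run_in_sequence(JOBS, trigger)` with some `trigger` function returns the names of the flows that finished before any failure.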

Talend

ELN

Instruments

BigQuery

Tables (Staging)

Views (ODS)

Data Visualization


Conductivity (ongoing):

