...
| Info |
|---|
The standard ETL areas and processes are described in LB ALB Data Dev Architecture - General. |
Summary
Data flow diagram

*No Labware for Batteries Conductivity
| Info |
|---|
All the schemas are available on Google Drive for further editing: | Google Drive Live Link |
|---|
| url | https://drive.google.com/file/d/1E5QOeBlfR7uzWXH7rx_UoQFZmXE_4jsq/view?usp=sharing |
|---|
|
|
Data Ingestion
Data Sources
ELN
| Info |
|---|
|
ELN API → <Extraction> → Local JSON File |
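The extraction step above can be pictured with a minimal Python sketch. It is only an illustration, not the actual extraction job: the `fetch` callable stands in for the real ELN API call, and the `eln_conductivity` file prefix and timestamped file name are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def extract_to_local_json(
    fetch: Callable[[], dict],
    out_dir: Path,
    prefix: str = "eln_conductivity",  # hypothetical file prefix
) -> Path:
    """Call the ELN API through `fetch` and store the raw response as a local JSON file."""
    payload = fetch()  # `fetch` is a placeholder for the real API call
    out_dir.mkdir(parents=True, exist_ok=True)
    # A UTC timestamp in the file name keeps successive extractions apart.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = out_dir / f"{prefix}_{stamp}.json"
    target.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return target
```

Passing the API call in as a function keeps the sketch runnable without credentials and makes the file-writing part easy to test on its own.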
Spreadsheets
| Spreadsheet | Description | Data Status* |
|---|---|---|
| ELN Conductivity Measurement | This is a standard ELN spreadsheet for all sites | HOT |
| ELN Battery Experiment Properties | This is a standard ELN spreadsheet for all sites | HOT |
...
The spreadsheets/tables extracted from the ELN JSON files are listed in the Data Mapping of the next section.
| Document Name | Link |
|---|---|
| Battery - ELN Data Model | https://drive.google.com/file/d/1sD6OqKnBzSR_SrGvzlhEl7s5F5vUGQqD/view?usp=sharing |
| Battery - ELN Template Documentation | https://docs.google.com/document/d/1myJ7zU4cTW1LG6eAj8z5rcKpMqhPOejl/edit#heading=h.gjdgxs |
Instruments
| Info |
|---|
|
Lab server → Google Share Drive → <Copy> → Local XML/TXT file |
Files
Most of the instrument files are added manually to a Google Drive shared folder.
...
*Data Status: HOT: the data is still being updated at the source and should be loaded regularly. COLD: the data no longer changes at the source and should be loaded just once.
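The HOT/COLD rule can be expressed as a small decision function. This is a sketch under the assumption that the pipeline keeps track of whether a source has already been loaded; the names `DataStatus` and `should_load` are hypothetical.

```python
from enum import Enum


class DataStatus(Enum):
    HOT = "HOT"    # source still changing: reload on every scheduled run
    COLD = "COLD"  # source frozen: load once, then skip


def should_load(status: DataStatus, already_loaded: bool) -> bool:
    """Return True when a source must be (re)loaded in the current run."""
    if status is DataStatus.HOT:
        return True
    # COLD sources are loaded only if they have never been loaded before.
    return not already_loaded
```

For example, `should_load(DataStatus.COLD, already_loaded=True)` is `False`, so a COLD source is skipped on every run after the first.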
Data Preparation or Parsing
| Info |
|---|
|
Local XML/TXT file → <Load> → Cloud Storage/Staging BigQuery |
Data Mapping Source => Staging (Talend)
The spreadsheet below presents all data transformations between the raw (extracted) files and the BigQuery Staging tables.
| Google Drive Live Link |
|---|
| url | https://docs.google.com/spreadsheets/d/1xjxpQ6nfeRDM060MzAeB_ZzhZBvp1rHRcYk7tYK0rGk/edit?usp=sharing |
|---|
|
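In the real pipeline this mapping is implemented in Talend. Purely as an illustration of what a raw-file-to-staging step does, the Python sketch below turns a delimited TXT file into newline-delimited JSON, a format BigQuery accepts for loads. The tab delimiter, the column names in the usage example, and the added `source_file` column are all assumptions, not the documented mapping.

```python
import csv
import io
import json


def txt_to_ndjson(text: str, source_file: str, delimiter: str = "\t") -> str:
    """Convert a delimited text file (header row first) to newline-delimited JSON.

    Every value is kept as a trimmed string; typing is left to a later step.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    lines = []
    for row in reader:
        record = {
            name.strip(): (value or "").strip()
            for name, value in row.items()
            if name is not None  # ignore surplus fields beyond the header
        }
        record["source_file"] = source_file  # hypothetical lineage column
        lines.append(json.dumps(record))
    return "\n".join(lines)
```

A call such as `txt_to_ndjson("sample_id\tconductivity\nS1\t1.2e-3\n", "run_001.txt")` yields one JSON object per data row, each carrying the name of the file it came from.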
Data Integration or Computing
The data integration phase for batteries follows the standard approach described in LB ALB Data Dev Architecture - General.
| Info |
|---|
|
Staging BigQuery → <Transform> → ODS BigQuery |
Data Mapping Staging => ODS (BigQuery SQL views)
The spreadsheet below presents all data transformations from Staging tables to ODS tables. This step structures the files in the target table format and checks that the column types (schema) conform to it.
| Google Drive Live Link |
|---|
| url | https://docs.google.com/spreadsheets/d/1xcwLXJKykko5w7WXLmqjFyygG-yC6t6-t7Z6MkbIfvE/edit?usp=sharing |
|---|
|
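As a rough illustration of a Staging to ODS view that enforces column types, the Python sketch below generates BigQuery DDL using `CREATE OR REPLACE VIEW` and `SAFE_CAST` (which returns NULL instead of failing when a value cannot be converted). The table, view, and column names in the usage example are hypothetical; the actual mappings are the ones in the spreadsheet above.

```python
def build_ods_view_sql(
    staging_table: str,
    ods_view: str,
    columns: dict[str, str],
) -> str:
    """Build the DDL for an ODS view that casts each staging column to its target type.

    `columns` maps a column name to a BigQuery type such as STRING or FLOAT64.
    """
    select_list = ",\n".join(
        f"  SAFE_CAST({name} AS {bq_type}) AS {name}"
        for name, bq_type in columns.items()
    )
    return (
        f"CREATE OR REPLACE VIEW `{ods_view}` AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM `{staging_table}`"
    )


# Usage with made-up names:
print(
    build_ods_view_sql(
        "my-project.STG.conductivity_measurement",
        "my-project.ODS.conductivity_measurement",
        {"sample_id": "STRING", "conductivity": "FLOAT64"},
    )
)
```

Generating the statement from a column-to-type mapping keeps the view definition in step with the mapping spreadsheet, at the cost of hiding cast failures as NULLs, which would then need a separate quality check.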
Data Model
The following data model shows the tables in the ODS dataset and the relationships between them:
| Google Drive Live Link |
|---|
| url | https://drive.google.com/file/d/1sD6OqKnBzSR_SrGvzlhEl7s5F5vUGQqD/view?usp=sharing |
|---|
|
Data Presentation (DW/DM)
| Info |
|---|
|
ODS BigQuery → <Transform> → DW BigQuery → <Expose> → DM_Conductivity BigQuery |
Data Mapping ODS => DW/DM (BigQuery SQL views)
The spreadsheet below presents all data transformations between the ODS tables and DM_Conductivity. This step creates the views used for data visualization.
| Google Drive Live Link |
|---|
| url | https://docs.google.com/spreadsheets/d/1XL49eWOdv9tZMcrnU5r7SQGcidEQ4YLj-wBMpzT6Xxo/edit?usp=sharing |
|---|
|
Data Model
The following data model shows the tables in the DW/DM dataset and the relationships between them:
| Google Drive Live Link |
|---|
| url | https://drive.google.com/file/d/1X5QpTXYvIU5o6PM5UcXIolT51M319mZP/view?usp=sharing |
|---|
|
...
For the DM dataset (DM_Conductivity), the views are built on top of the tables described above. As defined in the naming convention, the abbreviation "conduct" is not needed, and no Talend jobs are required either.
Orchestrating Jobs
All jobs run in sequence under the following job and project name on Talend Cloud:
...
The following jobs should not be orchestrated; they run only once, during deployment:
| Project | Job/Flow | Associated TMC Plan |
|---|
| RnI_ACN_Battery | F010_RnI_ACN_Battery_Create_BQ_Views | PL_RNI_ACN_BATTERY_CONDUCTIVITY_CREATE_VIEWS |
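The split between scheduled jobs and deployment-only jobs can be sketched as follows. This is not how Talend Cloud plans are configured; it only illustrates the rule in Python. The job name in `DEPLOY_ONLY_JOBS` comes from the table above; the runner function and the stop-on-first-failure behaviour are assumptions.

```python
from typing import Callable, Iterable

# Jobs that run once at deployment and must stay out of the scheduled sequence.
DEPLOY_ONLY_JOBS = {"F010_RnI_ACN_Battery_Create_BQ_Views"}


def run_scheduled(jobs: Iterable[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run jobs in order, skipping deployment-only jobs.

    Each job is a (name, callable) pair whose callable returns True on success.
    The sequence stops at the first failure (an assumed policy). Returns the
    names of the jobs that completed successfully.
    """
    completed = []
    for name, run in jobs:
        if name in DEPLOY_ONLY_JOBS:
            continue  # never part of the recurring orchestration
        if not run():
            break  # later jobs depend on earlier ones, so stop here
        completed.append(name)
    return completed
```

With a job list that contains the view-creation job followed by two hypothetical load jobs, `run_scheduled` would execute only the two load jobs, in order.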
Talend
ELN


Instruments


BigQuery
Tables (Staging)

Views (ODS)

Views (DM_Conductivity)
Data Visualization
Conductivity (GCP and Tableau documentation):
| Google Drive Live Link |
|---|
| url | https://docs.google.com/spreadsheets/d/1294LMp-xHVA590kVm7rbA-8W-Crvnho2mJ8BfU_zqOQ/edit#gid=1728102916 |
|---|
|
...