...
Technical documentation for the data produced by tests on the drying equipment.
...
Drying Tests
Summary
Sum-up
| Equipment / Scale | Tesla | Gunsan | Colognes 2500L |
|---|---|---|---|
| Data Sources | ELN, raw data on Google Drive | ELN, raw data on file share | ELN, raw data on Google Drive |
| Raw Data File Type | CSV | xlsx | xls |
| Scale Name on ELN | FR-170L-TESLA | KR-170L | FR-2500L |
| Data Collection | Talend: R011_Download_Synthesis_gDrive_Drying | Talend: J010_Download_Synthesis_LabServers; Python: download_drying_gunsan.py (to be finished) | Talend: J011_Download_Synthesis_gDrive_Drying |
| Parse | Python: parse_drying_tesla.py | Python: parse_drying_gunsan.py | Python: parse_drying_2500L.py |
| Compute | Python: compute_drying_tesla.py | Python: compute_drying_gunsan.py | Python: compute_drying_2500L.py |
| BigQuery | Target tables: | | |
| Mapping spreadsheet | | | |
Data Sources
- ELN
- Raw Data on Google Drive
Data Collection
The Talend jobs J010_Download_Synthesis_LabServers and J011_Download_Synthesis_gDrive_Drying extract the raw data files listed in the ELN table drying_raw_data_link for which the field “drying_equipment_name” matches the scale name, e.g. “FR-170L-TESLA”. For details on how these jobs work, see the following page:
Talend - Jobs - Synthesis - Download - Drying (needs to be created)
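The selection the download jobs apply can be sketched as follows. This is a minimal illustration, not the Talend implementation: the ELN table is represented as a list of dicts, and the "raw_data_link" field name is a hypothetical placeholder for the actual file-link column of drying_raw_data_link.

```python
def select_raw_data_links(eln_rows, scale_name):
    """Return the file links to download for one scale.

    Keeps only the rows of the drying_raw_data_link table whose
    "drying_equipment_name" matches the scale being processed.
    """
    return [
        row["raw_data_link"]
        for row in eln_rows
        if row["drying_equipment_name"] == scale_name
    ]

# Illustrative rows; link values are placeholders.
eln_rows = [
    {"drying_equipment_name": "FR-170L-TESLA", "raw_data_link": "gdrive://run_01.csv"},
    {"drying_equipment_name": "KR-170L", "raw_data_link": "fileshare://run_02.xlsx"},
    {"drying_equipment_name": "FR-170L-TESLA", "raw_data_link": "gdrive://run_03.csv"},
]

print(select_raw_data_links(eln_rows, "FR-170L-TESLA"))
# → ['gdrive://run_01.csv', 'gdrive://run_03.csv']
```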
Schema using Google Drive
(Included page: schema diagram)
Schema with file share
(Included page: schema diagram)
...
| Info |
|---|
| Please refer to the DFS: TD - Synthesis - Norms and Conventions for the output filename convention used in the Data Collection section |
Data Preparation
Parse
The parsing Python scripts extract the needed columns from the raw data files.
Include Page: DFS - TD - Data Preparation - Parsing - Schema 01
Columns List
For each sample, the script extracts the required fields from the raw data files and outputs a .csv file. For the mapping details, please refer to the sheet "Parse Mapping" on the Drying Mapping spreadsheet (link to the spreadsheet in the Sum-up section).
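The shape of a parse script (e.g. parse_drying_tesla.py) can be sketched as below. This is a hedged illustration only: the column names are placeholders, not the actual mapping, which lives in the "Parse Mapping" sheet.

```python
import csv
import io

# Placeholder column list; the real list comes from the mapping spreadsheet.
NEEDED_COLUMNS = ["timestamp", "temperature", "pressure"]

def parse_raw_file(raw_file, parsed_file):
    """Copy only NEEDED_COLUMNS from a raw CSV into a parsed CSV."""
    reader = csv.DictReader(raw_file)
    writer = csv.DictWriter(parsed_file, fieldnames=NEEDED_COLUMNS)
    writer.writeheader()
    for row in reader:
        writer.writerow({col: row[col] for col in NEEDED_COLUMNS})

# In-memory stand-ins for the raw and parsed files.
raw = io.StringIO(
    "timestamp,temperature,pressure,operator\n"
    "2023-01-01T00:00,45.2,0.8,jdoe\n"
)
parsed = io.StringIO()
parse_raw_file(raw, parsed)
print(parsed.getvalue())
```

The unneeded column (here, `operator`) is dropped; only the mapped columns reach the parsed .csv.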
Compute
The compute Python script takes as input the parsed .csv files created in the previous step. It computes new columns and values from the raw data and generates new files.
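The compute step can be illustrated as below. The derived columns shown (elapsed_min, temp_delta) are hypothetical examples for the sketch; the actual computed columns are listed in the mapping spreadsheet.

```python
from datetime import datetime

def compute_columns(parsed_rows):
    """Add illustrative derived columns to each parsed row.

    elapsed_min: minutes since the first sample timestamp.
    temp_delta:  temperature difference versus the first sample.
    """
    t0 = datetime.fromisoformat(parsed_rows[0]["timestamp"])
    first_temp = float(parsed_rows[0]["temperature"])
    out = []
    for row in parsed_rows:
        t = datetime.fromisoformat(row["timestamp"])
        enriched = dict(row)
        enriched["elapsed_min"] = (t - t0).total_seconds() / 60.0
        enriched["temp_delta"] = float(row["temperature"]) - first_temp
        out.append(enriched)
    return out

rows = [
    {"timestamp": "2023-01-01T00:00:00", "temperature": "45.0"},
    {"timestamp": "2023-01-01T00:30:00", "temperature": "47.5"},
]
result = compute_columns(rows)
print(result[1]["elapsed_min"], result[1]["temp_delta"])
# → 30.0 2.5
```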
...
For each sample, it creates two different files that will be used to create new tables on BigQuery:
DryingDetails
The first table is composed of the columns previously extracted from the raw data files plus the new columns calculated during the compute step.
Dataset: raw_data_synthesis_mig
For the column details, please refer to the sheet "Details Mapping" on the Drying Mapping spreadsheet (link to the spreadsheet in the Sum-up section).
...
Dataset: raw_data_synthesis_mig
For the column details, please refer to the sheet "Summary Mapping" on the Drying Mapping spreadsheet (link to the spreadsheet in the Sum-up section).
Presentation
The details and summary files are created as tables on BigQuery, unifying all scales in the same tables. A Talend job is responsible for pushing all this data to a dataset called raw_data_synthesis_mig.
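The unification idea can be sketched as follows: rows produced for each scale are merged into one table and tagged with the scale name before the Talend job loads them into raw_data_synthesis_mig. This is an illustration of the principle only; the field names besides drying_equipment_name are placeholders.

```python
def unify_scales(per_scale_rows):
    """Merge a {scale_name: [rows]} mapping into one list of rows,
    tagging each row with its scale name so all scales can share one table."""
    unified = []
    for scale, rows in per_scale_rows.items():
        for row in rows:
            tagged = dict(row)
            tagged["drying_equipment_name"] = scale
            unified.append(tagged)
    return unified

# Illustrative per-scale detail rows; sample_id values are placeholders.
per_scale = {
    "FR-170L-TESLA": [{"sample_id": "S1"}],
    "KR-170L": [{"sample_id": "S2"}],
    "FR-2500L": [{"sample_id": "S3"}],
}
print(len(unify_scales(per_scale)))
# → 3
```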
Include Page: DFS - TD - Data Presentation - Upload to BQ - Schema 01
Visualization
Refer to the Tableau documentation.