Specific technical documentation for the Application Lab tests

Summary

Data sources

Raw data files for all labs are extracted from the servers listed in the following spreadsheet: Asset Tag Spreadsheet

The table below lists all the Application tests currently included in the Datalake, together with the data sources and the raw data file format for each test:

| Test | Data sources | File format |
|------|--------------|-------------|
| ODR (Oscillating Disc Rheometer) | Raw data files, ELN data | .exp, .xml |
| DSAS (Dynamic Strain Amplitude Sweep) | Raw data files, ELN data | .csv, .xml |
| DSAS_U (Dynamic Strain Amplitude Sweep Uncured; programs P05, P28 and P38) | Raw data files, ELN data | .csv, .xml |
| Tensile Properties | Raw data files, ELN data | .csv, .xml |
| Temperature Sweep | Raw data files, ELN data | .asc/.csv, .xml |
| DIN Abrasion Resistance | ELN data | .xml |
| Filler Dispersion | ELN data | .xml |
| Hardness Shore A | ELN data | .xml |
| Lambourn | ELN data | .xml |
| Mixing Process | ELN data | .xml |
| Mooney Viscosity | ELN data | .xml |
| Abrasion Loss and Density | ELN data | .xml |
| Rebound Resilience | ELN data | .xml |
| Tear Resistance | ELN data | .xml |
| DSA (Dynamic Strain Amplitude) | ELN data | .xml |

Talend jobs

Data is processed using Talend and Python. The Talend jobs for the Application Lab can be found in the Talend project RnI_Silica_Application.

The job F000_Orch_Flow_Application is the orchestrator: it calls all the other jobs successively, in a fixed order.

Python scripts

A copy of the Python scripts for downloading, parsing and computing new fields and columns for the Application Lab raw data and ELN data can be found here:

However, it is recommended to use the latest version of the scripts, available in the production path: D:\DATA\PROD\RnI\Silica\Permanent. For the moment, Git is not in place for this project.

ELN Data jobs

The Po2 project only needs to change the folders defined in the file above in order to retrieve the right ELN data.



Raw Data jobs

For the tests for which raw data files are available (i.e., DSAS, DSAS_U, ODR, Tensile Properties and Temperature Sweep), the table raw_data_mapping, previously extracted from the ELN .xml files, contains, for each test_id, the link to the raw data file to be extracted. For each test, the raw data file extraction is performed by the Python scripts embedded in this job (R001):


To give the user maximum flexibility, raw data files are extracted and processed every day. The history of raw data files is therefore not stored from one day to the next: every raw data file is re-extracted daily.

One advantage of this approach is that the user can modify or replace the raw data files and see these changes reflected in the datalake. However, this implies that the link to the raw data file written in ELN must always be up to date and must allow the file to be extracted. This condition is fulfilled if (1) the lab server and (2) the folder tree leading to the raw data file do not change.

Lab servers may change when lab machines are replaced. As mentioned above, this affects the extraction of the raw data files, because the link to the raw data file written in ELN becomes obsolete. The Python scripts that extract the raw data files are able to adapt to and handle lab-machine changes without user intervention (there is no need to modify the link to the file written in ELN). Nevertheless, this solution is viable only if the following conditions are fulfilled:

  1. The name of the root folder containing the raw data must stay the same (i.e., Data lake, or whatever name was initially established for each test).
  2. From this level of the folder tree (Data lake\...) downwards, the path to the raw data file must stay unchanged.
  3. Every lab-machine change must be reported to our team as soon as possible. The new server name must be updated in the context variables of the Talend project:
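As an illustration, a server change that respects conditions 1 and 2 can be handled by rewriting only the server part of the obsolete ELN link. The sketch below assumes UNC-style links; the function name `remap_raw_data_link` and the `server_map` parameter are hypothetical, not taken from the production scripts:

```python
def remap_raw_data_link(eln_link, server_map, root_folder="Data lake"):
    """Rewrite an obsolete ELN raw-data link after a lab-server change.

    Swaps only the server name; the path from the root folder downwards
    is kept unchanged, as required by conditions 1 and 2 above.
    `server_map` maps old server names to their replacements (hypothetical).
    """
    # UNC-style link: \\old-server\share\...\Data lake\<unchanged path>
    parts = eln_link.lstrip("\\").split("\\")
    old_server = parts[0]
    new_server = server_map.get(old_server, old_server)
    # Condition 1: the root folder must still be present in the path.
    if root_folder not in parts:
        raise ValueError(f"root folder {root_folder!r} not found in link")
    return "\\\\" + "\\".join([new_server] + parts[1:])
```

In this sketch, the mapping from old to new server names plays the role of the Talend context variables mentioned in condition 3.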


Application Lab Tests

Curing Properties – Oscillating Disc Rheometer (ODR)

The Python script download_CP_ODR.py extracts the raw data files listed in the ELN table raw_data_mapping from the lab server.

For each raw data file, the Python script parse_CP-ODR.py creates a .csv file containing the following columns extracted from the raw data file:

For each file generated in the previous step, the Python script compute_CP-ODR.py creates a .csv file containing the following values computed from raw data:

Minimum of “s_prim”

Maximum of “s_prim”

Last value of “s_prim”

delta_torque = max_torque – min_torque

Value of “time_min” corresponding to min_torque + 2

T90: value of “time_min” corresponding to delta_torque * 90% + min_torque

T98: value of “time_min” corresponding to delta_torque * 98% + min_torque

T98 * 1.5

Slope of “s_prim” computed starting from time T90

reversion_index = slope if slope < 0

reversion_index = 0 if slope ≥ 0

delta_torque / min_torque

Value of “speed” corresponding to T50
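The torque-derived values above can be sketched as follows. This is a minimal illustration, not the production compute_CP-ODR.py code; the threshold-crossing logic (first time the torque reaches the 90%/98% level) and the function name are assumptions:

```python
import numpy as np

def odr_curing_metrics(time_min, s_prim):
    """Sketch of the ODR computed values, assuming `time_min` and `s_prim`
    are the parsed torque-curve columns; names follow the list above."""
    s = np.asarray(s_prim, dtype=float)
    t = np.asarray(time_min, dtype=float)
    min_torque = s.min()
    max_torque = s.max()
    delta_torque = max_torque - min_torque
    # First time the torque reaches min_torque + 90% (resp. 98%) of delta_torque.
    t90 = t[np.argmax(s >= min_torque + 0.90 * delta_torque)]
    t98 = t[np.argmax(s >= min_torque + 0.98 * delta_torque)]
    # Slope of s_prim from T90 onwards; reversion_index keeps only negative slopes.
    mask = t >= t90
    slope = np.polyfit(t[mask], s[mask], 1)[0]
    reversion_index = slope if slope < 0 else 0.0
    return {"min_torque": min_torque, "max_torque": max_torque,
            "delta_torque": delta_torque, "t90": t90, "t98": t98,
            "reversion_index": reversion_index}
```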

Dynamical Properties – Dynamic Strain Amplitude Sweep (DSAS)

The Python script download_DP_DSA_Sweep.py extracts the raw data files listed in the ELN table raw_data_mapping from the lab server.

For each raw data file, the Python script parse_DP_DSA_Sweep.py creates a .csv file with the following columns extracted from the raw data files:

For each test_id, the Python script compute_DP_DSA_Sweep.py creates a .csv file with the following values computed from raw data:

First value of “g_prim”

Last value of “g_prim”

Value of “g_prim” corresponding to “dsa” = 0.5

g_prime_50 - g_prime_0_second

Maximum of “g_second” for the descent

Value of “dsa” corresponding to g_second_max

Maximum of “tan_delta” for the descent

Value of “tan_d” at 10% of “dsa” for the descent

Value of “tan_d” at 0.1% of “dsa” for the descent

Value of “g_star” at 12% of “dsa” for the descent
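Lookups such as “value of g_prim corresponding to dsa = 0.5” could be implemented by interpolating between the measured sweep points. The helper below is a hypothetical sketch; the production script may instead pick the nearest measured point:

```python
import numpy as np

def g_prim_at(dsa, g_prim, target=0.5):
    """Sketch: value of "g_prim" at a given strain amplitude ("dsa"),
    obtained by linear interpolation between measured points.
    The interpolation choice and the function name are assumptions."""
    dsa = np.asarray(dsa, dtype=float)
    g = np.asarray(g_prim, dtype=float)
    order = np.argsort(dsa)  # np.interp requires increasing x values
    return float(np.interp(target, dsa[order], g[order]))
```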

Dynamical Properties – Temperature Sweep

The Python script download_Temp_Sweep.py extracts the raw data files listed in the ELN table raw_data_mapping from the lab server.

For each test_id, the Python script parse_DP_Temp_Sweep.py creates a .csv file with the following columns extracted from the raw data files:


For each test_id, the Python script compute_DP_Temp_Sweep.py creates a .csv file with the following values computed from raw data:

Maximum of “tan_delta”

Value of “temperature” corresponding to max_tan_delta

Maximum of “e_second”

Value of “temperature” corresponding to max_esec

Linear regression to infer the value of “e_star” corresponding to “temperature” = -70

Linear regression to infer the value of “e_star” corresponding to “temperature” = -45

Linear regression to infer the value of “e_star” corresponding to “temperature” = -25

Linear regression to infer the value of “tan_delta” corresponding to “temperature” = -25

Linear regression to infer the value of “tan_delta” corresponding to “temperature” = 0

Linear regression to infer the value of “e_prime” corresponding to “temperature” = 0

Linear regression to infer the value of “e_second” corresponding to “temperature” = 0

Linear regression to infer the value of “tan_delta” corresponding to “temperature” = 40

Linear regression to infer the value of “tan_delta” corresponding to “temperature” = 60

Linear regression to infer the value of “tan_delta” corresponding to “temperature” = 80

Linear regression to infer the value of “e_star” corresponding to “temperature” = 60

Linear regression to infer the value of “e_prime” corresponding to “temperature” = 40

tan_delta_100 / tan_delta_80

estar_min_100 / estar_min_80
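Each “linear regression to infer the value of X at temperature = T” step can be sketched as a local fit around the target temperature. The window size and the function name below are assumptions, not taken from the production script:

```python
import numpy as np

def infer_at_temperature(temperature, values, target, window=5):
    """Sketch of the "linear regression to infer ... at temperature = T" step:
    fit a line on the `window` measured points nearest the target temperature
    and evaluate it at the target. The window size is an assumption."""
    t = np.asarray(temperature, dtype=float)
    v = np.asarray(values, dtype=float)
    nearest = np.argsort(np.abs(t - target))[:window]
    slope, intercept = np.polyfit(t[nearest], v[nearest], 1)
    return slope * target + intercept
```

On perfectly linear data the fit reproduces the underlying line exactly, which makes the sketch easy to sanity-check.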

Green Properties – Dynamic Strain Amplitude Sweep Uncured (DSAS_U)

The Python script download_GP_DSAS_U.py extracts the raw data files listed in the ELN table raw_data_mapping from the lab server.

Raw data files for DSAS_U are extracted only for the P38 program. The Python scripts were designed to also handle the P05 and P28 programs, but these parts of the code should be deleted once we are sure they are no longer needed.

For each raw data file, the Python script parse_GP_DSAS_U.py creates a .csv file with the following columns extracted from raw data files:


For each test_id, the Python script compute_GP_DSAS_U (P05, P28, P38) creates a .csv file with the following values computed from raw data:

Value of “g_prim” corresponding to the value of “set_strain_perc” that is closest to 0.91%

Value of “g_prim” corresponding to the value of “set_strain_perc” that is closest to 14%

Value of “g_prim” corresponding to the value of “set_strain_perc” that is closest to 50%

g_prim_091 - g_prim_50

Value of “tan_delta” corresponding to the value of “set_strain_perc” that is closest to 0.91%

Value of “tan_delta” corresponding to the value of “set_strain_perc” that is closest to 50%

Value of “g_sec” corresponding to the value of “set_strain_perc” that is closest to 0.91%

Value of “g_sec” corresponding to the value of “set_strain_perc” that is closest to 14%

Value of “g_sec” corresponding to the value of “set_strain_perc” that is closest to 50%

g_sec_091 - g_sec_50
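The “value ... corresponding to the value of set_strain_perc that is closest to X%” lookups amount to a nearest-point search. A minimal sketch, with a hypothetical function name:

```python
def value_at_closest_strain(set_strain_perc, values, target):
    """Sketch of the "closest to X%" lookup used for DSAS_U: return the
    value whose "set_strain_perc" entry is nearest the requested strain."""
    i = min(range(len(set_strain_perc)),
            key=lambda k: abs(set_strain_perc[k] - target))
    return values[i]
```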

Physical Properties – Tensile Properties

From the ELN data, the Python script download_TensileProperties.py extracts the links to the raw data files that need to be retrieved from the server and downloads these files.

Raw data files are in .csv format and are organized by study ID. The raw data files corresponding to one study ID should be placed in a folder named with the sample ID only.

Each folder corresponding to one study ID contains one or several sub-folders. Each sub-folder contains the raw data files corresponding to a single test_id, and its name should be composed of the test_id followed by the string “.is_tens_RawData” (e.g., the sub-folder “19cc355-01.is_tens_RawData”).

Along with the sub-folders, the folder corresponding to the study_id contains one .csv file for each sub-folder (e.g., for the sub-folder “19cc355-01.is_tens_RawData”, we will have the .csv file “19cc355.is_tens.csv”). These .csv files contain the list of specimens for which data will be extracted.
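The naming convention above can be sketched as a small helper that derives the test_id and the matching specimen-list file from a sub-folder name. The function name is hypothetical, and the rule that the specimen-list prefix is the part of the test_id before the “-” is inferred from the example given above:

```python
def tensile_names(subfolder):
    """Sketch of the Tensile Properties naming convention: derive the
    test_id from a raw-data sub-folder name and the specimen-list .csv
    that goes with it (prefix before "-", per the example above)."""
    suffix = ".is_tens_RawData"
    if not subfolder.endswith(suffix):
        raise ValueError(f"unexpected folder name: {subfolder}")
    test_id = subfolder[:-len(suffix)]
    specimen_csv = test_id.split("-")[0] + ".is_tens.csv"
    return test_id, specimen_csv
```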

For each test_id and for each specimen, the Python script parse_TensileProperties.py creates a .csv file with the following columns extracted or computed from raw data files:



(deformation_traction * constraint_traction) / 100 

For each test_id, the average specimen is also computed.

For the average specimen of each test_id, the following columns are computed:

Average of secant_modulus of all specimens / “abcissa” * 100

Average of secant_modulus of all specimens * (“abcissa” / 100 + 1)

“J” / “abcissa” * 100


The Python script compute_TensileProperties.py creates a .csv file with the following computed values for each test_id and for each specimen:

Value of “constraint_traction” corresponding to “abcissa” = 10

Value of “constraint_traction” corresponding to “abcissa” = 100

Value of “constraint_traction” corresponding to “abcissa” = 200

Value of “constraint_traction” corresponding to “abcissa” = 300

m10 / m100

m200 / m100

m300 / m100

Maximum of “constraint_traction”

Value of abcissa corresponding to ts

ts * eb
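The modulus values and ratios above can be sketched as follows. This assumes m10/m100/m200/m300 are “constraint_traction” read off at “abcissa” = 10, 100, 200 and 300; the interpolation between measured points and the function name are assumptions, not the production compute_TensileProperties.py code:

```python
import numpy as np

def tensile_moduli(abcissa, constraint_traction):
    """Sketch: moduli at fixed elongations and a reinforcement-style
    ratio, assuming linear interpolation between measured points."""
    x = np.asarray(abcissa, dtype=float)
    y = np.asarray(constraint_traction, dtype=float)
    m = {f"m{p}": float(np.interp(p, x, y)) for p in (10, 100, 200, 300)}
    m["m300_m100"] = m["m300"] / m["m100"]  # ratio of moduli, as listed above
    return m
```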