A Data Model represents the way data is structured in a dataset or a database, such as Lab Booster’s data ocean.
The data model defines how the data lake or data ocean is connected to:
- The data input, i.e. the ELN, LIMS systems, connected instruments, etc.
- The data output, i.e. DataLab, the web app in which users can access data
Context
As of mid-2023, each market in Lab Booster has its own data model, i.e. its own way to structure data.
This means that the connections between the ELN, the data ocean and DataLab must be rebuilt for each new project.
Objective
Our aim is to have a common data model for all markets so that the connections between the ELN, the data ocean and DataLab are built once and for all. This will bring:
- Accelerated delivery of new projects
- Better performance
- Less maintenance
This page is divided into two sections:
- Entity-Relationship Diagram (ERD), which served as a basis to design the data model
- Data model
Entity-Relationship Diagram (ERD)
Data Models are generally based on a diagram or schema called an Entity-Relationship Diagram, which defines:
- Entities i.e. a definable object or concept within a system
- Relationships i.e. how entities are related to one another
Building the ERD is a preliminary step to designing the actual data model to ensure that all required entities and relationships are accurately defined and represented.
This section is split into three parts:
- Entity dictionary
- Relationships definition
- ERD representation
Entity dictionary
Emulsion Polymerization experiment
Solvent screening experiment
Formulation ABC =
1,2-Dibromoethane - 89%
+ Rhodacal 60 - 5%
+ Rhodasurf L-20 - 6%
Formulation ABC Batch xxxx =
1,2-Dibromoethane Batch xxxx Sigma Aldrich - 89.1%
+ Rhodacal 60 Batch xxxx - 5.2%
+ Rhodasurf L-20 Batch xxxx - 5.7%
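The two examples above can be read as a nominal recipe (the formulation) and one concrete preparation of it (the formulation batch). A minimal sketch of that distinction follows. The class names `Ingredient`, `Formulation` and `FormulationBatch`, their fields and the `total_percent` helper are assumptions made for illustration, not the Lab Booster schema. The percentages are the ones given above, and `xxxx` is kept as the placeholder batch number.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class Ingredient:
    name: str
    percent: Decimal  # share of the total composition, in %


def total_percent(ingredients: Tuple[Ingredient, ...]) -> Decimal:
    """Sum the ingredient shares; a complete composition adds up to 100."""
    return sum((i.percent for i in ingredients), Decimal("0"))


@dataclass(frozen=True)
class Formulation:
    name: str
    ingredients: Tuple[Ingredient, ...]  # nominal composition


@dataclass(frozen=True)
class FormulationBatch:
    formulation: Formulation             # the recipe this batch was made from
    batch_id: str
    ingredients: Tuple[Ingredient, ...]  # composition recorded for this batch


formulation_abc = Formulation(
    name="Formulation ABC",
    ingredients=(
        Ingredient("1,2-Dibromoethane", Decimal("89")),
        Ingredient("Rhodacal 60", Decimal("5")),
        Ingredient("Rhodasurf L-20", Decimal("6")),
    ),
)

batch = FormulationBatch(
    formulation=formulation_abc,
    batch_id="xxxx",  # placeholder, as in the example above
    ingredients=(
        Ingredient("1,2-Dibromoethane", Decimal("89.1")),
        Ingredient("Rhodacal 60", Decimal("5.2")),
        Ingredient("Rhodasurf L-20", Decimal("5.7")),
    ),
)

nominal_total = total_percent(formulation_abc.ingredients)
batch_total = total_percent(batch.ingredients)
```

`Decimal` is used instead of `float` so that the percentages add up exactly, which keeps a "sums to 100" check simple.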
A piece of the product/component on which the test is performed
The sample can be taken from:
- A raw material: a product/component received from a supplier
- A formulation batch: a combination of raw materials and/or synthesized products/components
- A synthesized product/component: a product/component resulting from the combination and transformation of raw materials and/or other synthesized products/components
Raw material: methanol from Sigma Aldrich
Formulation sample: combination of solvent, active ingredient and other additives
Synthesized product: latex
Observation/analysis of the behavior of a sample of a component/product when a procedure is carried out under a set of conditions:
- Sample preparation
- Sample analysis
- Sample performance in application
It has an SOP (Standard Operating Procedure)
Formulation
Conductivity
Paint application
Temperature/pressure at which experiment is carried out
Solute concentration
It can be a numerical value, a set of numerical values (e.g. a curve) or a non-numerical value (e.g. observations)
Shear stress vs. Shear rate
Pass/Fail
In the context of a lab activity it can be a condition and/or a result
- Entity-Relationship Diagram design
- ERD mapping with R&I workflows
Entity-Relationship Diagram design
Entity dictionary
Entity-Relationship Diagram
ERD mapping with R&I workflows (WIP)
Three types of R&I workflows were identified:
- Formulation workflows
- Synthesis workflows
- Analysis workflows
Workflows of each type were mapped against the ERD to ensure that it accommodates all types of R&I workflows.
The mapping done for the different workflows is summarized in the table below.
| GBU/F- R&I | Workflow name | Workflow type | Mapping status | Link to mapping | Documentation - Data capture |
|---|---|---|---|---|---|
| Novecare GBU | Seed Care Formulation | Formulation | Done | Seed Care mapping | ELN template |
| Novecare GBU | Seed Care Request & Results | Formulation | Done | Seed Care mapping | ELN template |
| Battery Platform | Mecanosynthesis | Synthesis | Done | Mecanosynthesis mapping | ELN template |
| Aroma Performance GBU | Fermentation | Synthesis | Done | Fermentation mapping | ELN spreadsheet mockup |
| BioMatTech Platform | Biodegradability | Analysis | Done | Biodegradability mapping | LIMS spreadsheet mockup |
| Specialty Polymers GBU | Aging, Mechanical, Thermal | Analysis | Ongoing | | |
| Specialty Polymers GBU | | Synthesis | To do | | |
| Novecare GBU | Agro | Formulation | To do | | |
| Novecare GBU | EP Coatings | Synthesis | To do | | |
| Novecare GBU | Paint Coatings | Formulation | To do | | |
| Corporate R&I | Solvent platform - Solubilization | | To do | | |
| Corporate R&I | | Analysis | To do | | |
| Green Hydrogen Platform | Conductivity | Analysis | To do | | |
Relationships definition
An Experiment can have:
- One or more Samples
- One or more Formulations
A Formulation can have:
- One or more Ingredients
- One or more Formulation Batches
A Formulation Batch can have:
- One or more Ingredients
A Lab Activity can:
- Have one or more Processes
- Have one or more Samples
- Be associated with only one Experiment
A Sample can:
- Only be part of a Formulation Batch
- Have one or more Processes associated with it
- Be part of one or more Processes
A Process can:
- Have/use multiple Samples
- Be part of a group of Processes used within a Sample
A Formulation Batch can be associated with one or more Samples.
A Test Group can:
- Only be part of one Sample
- Have one or more Tests
A Test can:
- Only be part of one Test Group
- Have one or more Conditions
- Have one or more Result Series
A Condition can:
- Only be part of one Test
- Have one or more Measures
A Result Series can:
- Only be part of one Test
- Have one or more Results
A Result can:
- Only be part of one Result Series
- Have one or more Measures
A Measure can be used both as a Condition and as a Result.
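The lower part of this hierarchy (Test Group, Test, Condition, Result Series, Result, Measure) can be sketched with nested Python dataclasses. This is only an illustration under stated assumptions: the field names, the `LabTest`/`LabTestGroup` class names (used in place of "Test"/"Test Group"), the `all_measures` helper, the test name "Rheology" and all numerical values are made up for the example and are not taken from the actual data model. "One or more" is modelled as a list of children, and "only part of one" follows from each child living in exactly one parent's list.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Measure:
    name: str
    value: Union[float, str]  # numerical value or observation such as "Pass"
    unit: str = ""


@dataclass
class Condition:
    measures: List[Measure]   # a Condition has one or more Measures


@dataclass
class Result:
    measures: List[Measure]   # a Result has one or more Measures


@dataclass
class ResultSeries:
    results: List[Result]     # a Result Series has one or more Results


@dataclass
class LabTest:
    name: str
    conditions: List[Condition] = field(default_factory=list)
    result_series: List[ResultSeries] = field(default_factory=list)


@dataclass
class LabTestGroup:
    tests: List[LabTest]      # a Test Group has one or more Tests


def all_measures(test: LabTest) -> List[Measure]:
    """Collect every Measure of a test, whether used as condition or result."""
    found = [m for c in test.conditions for m in c.measures]
    found += [m for s in test.result_series for r in s.results for m in r.measures]
    return found


# Hypothetical example: a curve of shear stress vs. shear rate at one temperature.
rheology = LabTest(
    name="Rheology",
    conditions=[Condition(measures=[Measure("Temperature", 25.0, "°C")])],
    result_series=[
        ResultSeries(results=[
            Result(measures=[Measure("Shear rate", 10.0, "1/s"),
                             Measure("Shear stress", 2.5, "Pa")]),
            Result(measures=[Measure("Shear rate", 100.0, "1/s"),
                             Measure("Shear stress", 18.0, "Pa")]),
        ])
    ],
)
group = LabTestGroup(tests=[rheology])
measure_names = [m.name for m in all_measures(rheology)]
```

The same `Measure` class appears under both `Condition` and `Result`, which mirrors the last rule above: a Measure is not tied to one role.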
Entity-Relationship Diagram
Legend
Entity
Primary key
Foreign key
Attribute
Relationships
- The ring represents "zero"
- The dash represents "one"
- The crow's foot represents "many"
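In common crow's foot usage, each end of a relationship line carries two of these symbols: one for the minimum and one for the maximum number of related records. The short sketch below spells out the four usual combinations. The function name `describe_end` and the symbol keys are assumptions chosen for this example.

```python
# Meaning of each symbol, as listed in the legend above.
SYMBOLS = {"ring": "zero", "dash": "one", "crow's foot": "many"}


def describe_end(minimum: str, maximum: str) -> str:
    """Return the cardinality expressed by a (minimum, maximum) symbol pair."""
    if minimum not in ("ring", "dash"):
        raise ValueError("the minimum must be a ring (zero) or a dash (one)")
    if maximum not in ("dash", "crow's foot"):
        raise ValueError("the maximum must be a dash (one) or a crow's foot (many)")
    if minimum == "dash" and maximum == "dash":
        return "one and only one"
    return f"{SYMBOLS[minimum]} or {SYMBOLS[maximum]}"


combinations = {
    (lo, hi): describe_end(lo, hi)
    for lo in ("ring", "dash")
    for hi in ("dash", "crow's foot")
}
```

For instance, a relationship stated as "one or more" in the relationships definition corresponds to a dash combined with a crow's foot.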
Diagram
ERD mapping with R&I workflows
Distinguish between:
- Formulation workflows
- Synthesis workflows
- Analysis workflows
Ensure that the data model fits all three workflow types.
Formulation
Seed Care
Synthesis
Mecanosynthesis
Fermentation
Analysis
Biodegradability
Data model
