Overview
A Data Model represents the way data is structured in a set of data or a database, such as in Lab Booster’s data ocean.
The data model defines how the data ocean is connected to:
- The data input, i.e. the ELN
- The data output, i.e. DataLab, the WebApp in which users access data
Context
As of mid-2023, each market in Lab Booster has its own data model, i.e. its own way to structure data.
This means that the connections between ELN, data ocean and DataLab have to be rebuilt for each new project.
Objective
Our aim is to have a common data model for all markets so that the connections between ELN, data ocean and DataLab are built once and for all.
This will accelerate delivery of new projects and ensure better performance.
This page is divided into two sections:
- Entity-Relationship Diagram (ERD), which served as a basis to design the data model
- Data model
Entity-Relationship Diagram (ERD)
Data models are generally based on a diagram or schema called an Entity-Relationship Diagram, which defines:
- Entities, i.e. definable objects or concepts within a system
- Relationships, i.e. how entities are related to one another
Building the ERD is a preliminary step to designing the actual data model to ensure that all required entities and relationships are accurately defined and represented.
This section is split into two parts:
- Entity-Relationship Diagram design
- ERD mapping with R&I workflows
Entity-Relationship Diagram design
Entity dictionary
| Entity | Definition | Example(s) |
|---|---|---|
| Experiment | A recording of a set of Activities performed in the lab by an operator at a given date using specific chemical ingredients or samples to achieve an objective | Emulsion Polymerization experiment Solvent screening experiment |
| Solvay User | | |
| ELN entity path User Permissions | | |
| Activity | A group of Processes performed in the lab in a specific order. | Chemical synthesis is an Activity that encompasses certain Processes |
| Process | A Process is defined by: Synonym: step | Within a chemical synthesis Activity, several Processes are carried out in the following order: 1. First chemical reaction 2. Treatment by liquid-liquid extraction Purification 3. Second chemical reaction 4. Purification |
| Process Ingredient | A product, component, material, sample or formulation batch recorded in chemical inventory and used as an input in a Process. Synonym: reagent | Ferulic Acid is a Process Ingredient in the Bioconversion Process of the Fermentation Activity |
| Process formulation | The Process Ingredients (generic) and their proportion (target) composing a formula that is used in a Process | |
| Process formulation batch | The Process Ingredients (chemical inventory) and their proportion (actual) composing a formula that is prepared in the lab and used in a Process | |
| Process Sample | A Sample taken from the Process End-Product(s) or from the Process while it is ongoing. A Process Sample is constrained to one Process or Process End-Product, while several Process Samples can be taken from a Process or a Process End-Product. One or several Tests can be performed on it. | Samples taken during the Growth Process of the Fermentation Activity |
| Process End-product | The output of a Process, characterized by its name, composition, aspect etc. An End-product is constrained to one Process, while there can be several End-Products for one Process. | Vanillin is an End-Product of the Process Bioconversion of the fermentation Activity |
| Formulation | The Ingredients (generic) and their proportion (target) composing a formula | Formulation ABC = 1,2- Dibromoethane - 89% + Rhodacal 60 - 5% + Rhodasurf L-20 - 6% |
| Formulation Batch | A formula prepared in the lab with the real Ingredients (chemical inventory) and their proportion (actual) | Formulation ABC Batch xxxx = 1,2- Dibromoethane Batch xxxx Sigma Aldrich - 89.1% + Rhodacal 60 Batch xxxx - 5.2% + Rhodasurf L-20 Batch xxxx- 5.7% |
| Ingredient | A product, component, material, sample or formulation batch recorded in chemical inventory | Rhodasurf L-20 Batch SP4D26X01 100% active received 08/06/23 |
| Sample | A piece of the product/component on which a Test is performed. The Sample can be taken from: | Raw material: methanol from Sigma Aldrich. Formulation sample: combination of solvent, active ingredient and other additives. Synthesized product: latex |
| Test group | A group of Tests performed on the same sample | Characterization tests |
| Test | Observation/analysis of the behavior of a Sample of a component/product when a procedure is carried out in a set of conditions: It has an SOP (Standard Operating Procedure) | Conductivity test Viscosity test Paint application test |
| Condition | A variable/setting defined by the operator for a Test and affecting its Result | Temperature/pressure at which experiment is carried out Solute concentration |
| Result | The outcome of a Test performed on a Sample in specified Conditions. It can be a numerical value, a set of numerical values (e.g. a curve) or a non-numerical value (e.g. observations) | pH = 8.4 Shear stress vs. Shear rate Pass/Fail |
| Result Serie | A set of Results obtained for a Test performed under the same Conditions and measured with a time interval between them | Sample aging |
| Measure | A property of the Sample. In the context of a Test it can be a Condition and/or a Result | Temperature |
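The distinction between a Formulation (generic Ingredients, target proportions) and a Formulation Batch (chemical inventory Ingredients, actual proportions) can be sketched in Python with the Formulation ABC example from the table. This is only an illustration: the class and field names (`Component`, `Formulation`, `FormulationBatch`, `proportion_pct`, `batch_ref`) are assumptions and are not taken from the actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Component:
    """One Ingredient line of a formula (hypothetical structure)."""
    ingredient: str
    proportion_pct: float            # target % in a Formulation, actual % in a Batch
    batch_ref: Optional[str] = None  # chemical inventory batch, only set in a Batch


@dataclass
class Formulation:
    """Generic Ingredients with their target proportions."""
    name: str
    components: List[Component] = field(default_factory=list)


@dataclass
class FormulationBatch:
    """A formula prepared in the lab, with actual proportions."""
    formulation: Formulation
    batch_ref: str
    components: List[Component] = field(default_factory=list)

    def deviations(self) -> Dict[str, float]:
        """Actual minus target proportion for each Ingredient name."""
        target = {c.ingredient: c.proportion_pct
                  for c in self.formulation.components}
        return {c.ingredient: round(c.proportion_pct - target[c.ingredient], 2)
                for c in self.components}


# Values below come from the Formulation ABC rows of the entity dictionary;
# "xxxx" is the placeholder batch reference used there.
abc = Formulation("Formulation ABC", [
    Component("1,2-Dibromoethane", 89.0),
    Component("Rhodacal 60", 5.0),
    Component("Rhodasurf L-20", 6.0),
])
abc_batch = FormulationBatch(abc, "xxxx", [
    Component("1,2-Dibromoethane", 89.1, "xxxx"),
    Component("Rhodacal 60", 5.2, "xxxx"),
    Component("Rhodasurf L-20", 5.7, "xxxx"),
])
```

Calling `abc_batch.deviations()` gives the gap between what was planned and what was actually weighed for each Ingredient, which is the information a Formulation Batch adds on top of its Formulation.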
Relationships definition
One Experiment can have:
- One or more Samples;
- One or more Formulations;
A Formulation can have:
- One or more Ingredients;
- One or more Formulation Batches;
A Formulation Batch can have:
- One or more Ingredients;
A Lab Activity can:
- Have one or more Processes;
- Have one or more Samples;
- Be associated with only one Experiment;
A Sample can:
- Only be part of a Formulation Batch;
- Have one or more Processes associated;
- Be part of one or more Processes;
A Process can:
- Have/use multiple Samples;
- Be a part of a group of processes used within a Sample;
A Formulation Batch can be associated with one or more Samples;
A Test Group can:
- Only be part of one Sample
- Have one or more Tests
A Test can:
- Only be part of one Test Group
- Have one or more Conditions
- Have one or more Result Series
A Condition can:
- Only be part of one Test;
- Have one or more Measures;
A Result Serie can:
- Only be part of one Test;
- Have one or more Results;
A Result can:
- Only be part of one Result Serie;
- Have one or more Measures
A Measure can be used both as a Condition and as a Result;
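The Test Group / Test / Condition / Result Serie / Result / Measure rules listed above can be sketched as nested Python dataclasses. This is a minimal illustration under stated assumptions, not the implemented data model: the class names are hypothetical (`LabTest` and `LabTestGroup` stand for the Test and Test Group entities), and the 25 °C Condition value is invented for the example. Only the pH = 8.4 Result comes from the entity dictionary.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Measure:
    name: str                  # e.g. "Temperature", "pH"
    value: Union[float, str]   # numerical or non-numerical (e.g. "Pass")
    unit: str = ""


@dataclass
class Condition:
    measures: List[Measure]    # one or more Measures


@dataclass
class Result:
    measures: List[Measure]    # one or more Measures


@dataclass
class ResultSerie:
    results: List[Result]      # one or more Results


@dataclass
class LabTest:
    name: str
    conditions: List[Condition]        # one or more Conditions
    result_series: List[ResultSerie]   # one or more Result Series


@dataclass
class LabTestGroup:
    name: str
    tests: List[LabTest]               # one or more Tests


def missing_children(group: LabTestGroup) -> List[str]:
    """List every 'one or more' rule that the group violates."""
    problems = []
    if not group.tests:
        problems.append(f"Test Group '{group.name}' has no Test")
    for test in group.tests:
        if not test.conditions:
            problems.append(f"Test '{test.name}' has no Condition")
        if not test.result_series:
            problems.append(f"Test '{test.name}' has no Result Serie")
        for condition in test.conditions:
            if not condition.measures:
                problems.append(f"A Condition of '{test.name}' has no Measure")
        for serie in test.result_series:
            if not serie.results:
                problems.append(f"A Result Serie of '{test.name}' has no Result")
            for result in serie.results:
                if not result.measures:
                    problems.append(f"A Result of '{test.name}' has no Measure")
    return problems


# The same Measure type is used on the Condition side and on the Result side.
temperature = Measure("Temperature", 25.0, "°C")   # hypothetical Condition value
ph = Measure("pH", 8.4)                            # Result from the dictionary
ph_test = LabTest("pH test",
                  [Condition([temperature])],
                  [ResultSerie([Result([ph])])])
characterization = LabTestGroup("Characterization tests", [ph_test])
```

In this sketch the "only be part of one" side of each relationship is expressed by nesting: each child object lives in the list of a single parent. The "one or more" side is checked explicitly by `missing_children`, which returns an empty list when all rules hold.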
Entity-Relationship Diagram
Legend
Entity
Primary key
Foreign key
Attribute
Relationships
- The ring represents "zero"
- The dash represents "one"
- The crow's foot represents "many"
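In crow's foot notation these three symbols are usually read in pairs giving a minimum and a maximum for one end of a relationship. The Python sketch below shows the four usual combinations and how a count of related records can be checked against them. The names `CARDINALITY` and `allows` are illustrative assumptions, not part of the data model.

```python
from typing import Dict, Optional, Tuple

# (minimum symbol, maximum symbol) -> (minimum, maximum).
# A maximum of None means "many", i.e. no upper bound.
CARDINALITY: Dict[Tuple[str, str], Tuple[int, Optional[int]]] = {
    ("ring", "dash"): (0, 1),               # zero or one
    ("dash", "dash"): (1, 1),               # exactly one
    ("ring", "crow's foot"): (0, None),     # zero or many
    ("dash", "crow's foot"): (1, None),     # one or many
}


def allows(symbols: Tuple[str, str], count: int) -> bool:
    """Return True if `count` related records satisfy the cardinality."""
    low, high = CARDINALITY[symbols]
    return count >= low and (high is None or count <= high)
```

For example, "a Test can have one or more Conditions" corresponds to `("dash", "crow's foot")`, so `allows(("dash", "crow's foot"), 0)` is `False`, while "a Test can only be part of one Test Group" corresponds to `("dash", "dash")`.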
Diagram
ERD mapping with R&I workflows (WIP)
Three types of R&I workflows were distinguished
- Formulation workflows
- Synthesis workflows
- Analysis workflows
This was done to ensure that the ERD defined accommodates all types of R&I workflows.
The mapping done for different workflows is summarized in the table below.
| GBU/F- R&I | Workflow name | Workflow type | Mapping status | Link to mapping | Documentation |
|---|---|---|---|---|---|
| Novecare GBU | Seed Care Formulation | Formulation | Ongoing | Seed Care mapping | |
| Battery Platform | Mecanosynthesis | Synthesis | Ongoing | Mecanosynthesis mapping | ELN template |
| Aroma Performance GBU | Fermentation | Synthesis | Ongoing | Fermentation mapping | ELN spreadsheet mockup |
| BioMatTech Platform | Biodegradability | Analysis | Ongoing | Biodegradability mapping | LIMS spreadsheet mockup |
| Specialty Polymers GBU | | Analysis | Ongoing | | |
| Specialty Polymers GBU | | Synthesis | To do | | |
| Novecare GBU | EP Coatings | Synthesis | To do | | |
| Novecare GBU | Agro | Formulation | To do | | |
| Novecare GBU | Paint Coatings | Formulation | To do | | |
| Corporate R&I | Solvent platform - Solubilization | | To do | | |
| Corporate R&I | | Analysis | To do | | |
| Green Hydrogen Platform | Conductivity | Analysis | To do | | |
