To expedite making Seedcare data available in the SyDatalab web application, and in anticipation of the expected migration to the Azure data platform with its associated freeze on further data-related work in GCP, it was agreed to create and use custom views that do not follow the Data Ocean common data model.

As part of this approach, we created traditional views directly from the ODS data in the Seedcare datalake GCP projects, including only the attributes to be used in SyDatalab. To keep these views simple, intermediate views were created to organize the necessary data before it is combined in the final views. Finally, because of certain restrictions and permission limitations in the GCP Data Ocean production environment, we started with a “redirection” approach: the views in the production GCP Data Ocean environment that are used by the SyDatalab application point back to the GCP Test Data Ocean environment, which holds data from the production Seedcare datalake. The data flow is shown in the architecture diagrams below.

As part of the redirection, a new dataset was created in the Data Ocean projects with the name “DS_Seed”. This is where the intermediate views are created.

Code Management

Repository with all SQL statements: https://gitlab.syensqo.com/syensqo-connected-research/lab-booster/data-architecture/data-engineering


Repository sub-folder with SQL for intermediate views: https://gitlab.syensqo.com/syensqo-connected-research/lab-booster/data-architecture/data-engineering/-/tree/master/BigQuery/data_ocean_domain_rni/DS_Seed?ref_type=heads

seed_intermediate_view_DEV-TEST_LB_7027_formulations_static_20251219.sql - The SQL statement for the intermediate view in the Data Ocean DEV environment that queries the formulation static data from the Seedcare TEST datalake.

seed_intermediate_view_DEV-TEST_LB_7027_formulations_test_results_stability_20251219.sql - The SQL statement for the intermediate view in the Data Ocean DEV environment that queries the formulation stability test data from the Seedcare TEST datalake.

seed_intermediate_view_DEV-TEST_LB_7027_formulations_test_results_t0_20251219.sql - The SQL statement for the intermediate view in the Data Ocean DEV environment that queries the formulation T0 test data from the Seedcare TEST datalake.


seed_intermediate_view_TEST_LB_7027_formulations_static_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation static data from the Seedcare TEST datalake.

seed_intermediate_view_TEST_LB_7027_formulations_test_results_stability_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation stability test data from the Seedcare TEST datalake.

seed_intermediate_view_TEST_LB_7027_formulations_test_results_t0_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation T0 test data from the Seedcare TEST datalake.


seed_intermediate_view_TEST-PROD_LB_7027_formulations_static_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation static data from the Seedcare PROD datalake.

seed_intermediate_view_TEST-PROD_LB_7027_formulations_test_results_stability_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation stability test data from the Seedcare PROD datalake.

seed_intermediate_view_TEST-PROD_LB_7027_formulations_test_results_t0_20251219.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the formulation T0 test data from the Seedcare PROD datalake.


seed_intermediate_view_DEV-TEST_LB_7036_request_result_results_20251215.sql - The SQL statement for the intermediate view in the Data Ocean DEV environment that queries the request & results test data from the Seedcare TEST datalake.

seed_intermediate_view_DEV-TEST_LB_7036_request_result_static_20251215.sql - The SQL statement for the intermediate view in the Data Ocean DEV environment that queries the request & results static data from the Seedcare TEST datalake.


seed_intermediate_view_TEST_LB_7036_request_result_results_20251215.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the request & results test data from the Seedcare TEST datalake.

seed_intermediate_view_TEST_LB_7036_request_result_static_20251215.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the request & results static data from the Seedcare TEST datalake.


seed_intermediate_view_TEST-PROD_LB_7036_request_result_results_20251215.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the request & results test data from the Seedcare PROD datalake.

seed_intermediate_view_TEST-PROD_LB_7036_request_result_static_20251215.sql - The SQL statement for the intermediate view in the Data Ocean TEST environment that queries the request & results static data from the Seedcare PROD datalake.
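The intermediate-view file names listed above appear to follow a common pattern: an environment tag, an LB ticket number, a subject, and a date. The sketch below shows one way to split such a name into its parts. The pattern is inferred only from the names in this list, and the names `FILENAME_RE`, `ENV_TAGS`, `IntermediateViewFile`, and `parse_intermediate_view_filename` are hypothetical helpers, not part of the repository.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from the file names listed above:
# seed_intermediate_view_<ENV>_LB_<ticket>_<subject>_<YYYYMMDD>.sql
FILENAME_RE = re.compile(
    r"^seed_intermediate_view_"
    r"(?P<env>DEV-TEST|TEST-PROD|TEST)_"
    r"LB_(?P<ticket>\d+)_"
    r"(?P<subject>.+)_"
    r"(?P<date>\d{8})\.sql$"
)

# Environment tag -> (Data Ocean environment, Seedcare datalake),
# following the descriptions given next to each file name.
ENV_TAGS = {
    "DEV-TEST": ("DEV", "TEST"),
    "TEST": ("TEST", "TEST"),
    "TEST-PROD": ("TEST", "PROD"),
}


class IntermediateViewFile(NamedTuple):
    data_ocean_env: str
    seedcare_datalake: str
    ticket: str
    subject: str
    date: str


def parse_intermediate_view_filename(name: str) -> IntermediateViewFile:
    """Split an intermediate-view SQL file name into its components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name}")
    data_ocean_env, datalake = ENV_TAGS[match["env"]]
    return IntermediateViewFile(
        data_ocean_env=data_ocean_env,
        seedcare_datalake=datalake,
        ticket="LB_" + match["ticket"],
        subject=match["subject"],
        date=match["date"],
    )


if __name__ == "__main__":
    print(parse_intermediate_view_filename(
        "seed_intermediate_view_TEST-PROD_LB_7027_formulations_static_20251219.sql"
    ))
```

A check of this kind could be used to confirm that a new file is placed and named consistently with the existing ones before it is committed.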


Repository sub-folder with SQL statements for views used by SyDatalab: https://gitlab.syensqo.com/syensqo-connected-research/lab-booster/data-architecture/data-engineering/-/tree/master/BigQuery/data_ocean_domain_rni/DS_Datalab?ref_type=heads

LB_7027_DEV_vw_seed_formulation_results_20251205.sql - The SQL statement for the SyDatalab view in the Data Ocean DEV environment that returns formulation test data from the Seedcare TEST datalake.

LB_7027_DEV_vw_seed_formulation_static_20251205.sql - The SQL statement for the SyDatalab view in the Data Ocean DEV environment that returns formulation static data from the Seedcare TEST datalake.


LB_7027_TEST_vw_seed_formulation_results_20251204.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns formulation test data from the Seedcare TEST datalake.

LB_7027_TEST_vw_seed_formulation_static_20251204.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns formulation static data from the Seedcare TEST datalake.


LB_7027_TEST-PROD_vw_seed_formulation_results_20251205.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns formulation test data from the Seedcare PROD datalake.

LB_7027_TEST-PROD_vw_seed_formulation_static_20251204.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns formulation static data from the Seedcare PROD datalake.


LB_7027_PROD_Seed_prod_vw_seed_formulation_results_20251219.sql - The SQL statement for the SyDatalab view in the Data Ocean PROD environment that redirects to the equivalent view in the TEST environment for formulation test data.

LB_7027_PROD_Seed_prod_vw_seed_formulation_static_20251219.sql - The SQL statement for the SyDatalab view in the Data Ocean PROD environment that redirects to the equivalent view in the TEST environment for formulation static data.


LB_7036_DEV_vw_seed_request_result_results_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean DEV environment that returns request & results test data from the Seedcare TEST datalake.

LB_7036_DEV_vw_seed_request_result_static_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean DEV environment that returns request & results static data from the Seedcare TEST datalake.


LB_7036_TEST_vw_seed_request_result_results_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns request & results test data from the Seedcare TEST datalake.

LB_7036_TEST_vw_seed_request_result_static_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns request & results static data from the Seedcare TEST datalake.


LB_7036_TEST-PROD_vw_seed_request_result_results_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns request & results test data from the Seedcare PROD datalake.

LB_7036_TEST-PROD_vw_seed_request_result_static_20251215.sql - The SQL statement for the SyDatalab view in the Data Ocean TEST environment that returns request & results static data from the Seedcare PROD datalake.


LB_7036_PROD_vw_seed_request_result_results_20251218.sql - The SQL statement for the SyDatalab view in the Data Ocean PROD environment that redirects to the equivalent view in the TEST environment for request & results test data.

LB_7036_PROD_vw_seed_request_result_static_20251218.sql - The SQL statement for the SyDatalab view in the Data Ocean PROD environment that redirects to the equivalent view in the TEST environment for request & results static data.


Data sources

Seedcare Test datalake dataset: gcp-sqo-datalab-t.bq_ds_datagrow_test_ods_ag_seed

Seedcare Prod datalake dataset: gcp-sqo-datalab-p.bq_ds_datagrow_prod_ods_ag_seed


Mapping document

The mapping document specifies the Seedcare data elements that are required for visualization in the SyDatalab web application, their source location, and their corresponding database column names in the final views.


Data Model

The data model diagram shows all the tables where the transformed source data for Seedcare is stored in the legacy datalake (ODS). The custom views were created using the PK/FK relationships illustrated in the data model.

https://lucid.app/lucidchart/022c34dc-c40d-40d4-bbbe-7bf424f49914/edit?invitationId=inv_9e4224d9-6189-4102-8049-c937a085108b&page=0_0#


SyDatalab Seedcare formulations views

The SyDatalab web application typically expects two views for each workflow: a static view and a results view. For Seedcare Formulations, these views are:

vw_seed_formulation_static - includes general data about the experiments, recipes, recipe compositions, batches, batch compositions, and samples.

vw_seed_formulation_results - includes all results data from the T0 and Stability tests on the samples from the experiments.


The exact views created in the Data Ocean environments/projects are:

Seedcare Test static data in Data Ocean Dev environment: gcp-sqo-data-dm-ri-d.DS_Datalab.vw_seed_formulation_static

Seedcare Test results data in Data Ocean Dev environment: gcp-sqo-data-dm-ri-d.DS_Datalab.vw_seed_formulation_results


Seedcare Test static data in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Datalab.vw_seed_formulation_static

Seedcare Test results data in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Datalab.vw_seed_formulation_results


Seedcare Prod static data in Data Ocean Test environment: gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_static_prod

Seedcare Prod results data in Data Ocean Test environment: gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_results_prod


[ CURRENTLY REDIRECTING TO TEST ENVIRONMENT: gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_static_prod & gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_results_prod ]

Seedcare Prod static data in Data Ocean Prod environment: gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_static

Seedcare Prod results data in Data Ocean Prod environment: gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_results


Due to restrictions and permission limitations in the Data Ocean Prod environment, a "redirect" approach was used: the SyDatalab web application queries the views defined in the Data Ocean Prod environment, but those views actually read Seedcare Production data via the intermediate views in the Data Ocean Test environment. This redirection is shown in the architecture diagrams below.
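As an illustration of the redirect idea, the sketch below generates a pass-through view definition. It assumes the redirect is a plain `SELECT *` over the target view. The actual statements are in the repository files listed under Code Management and may differ. The function name `build_redirect_view_ddl` is hypothetical, and the target view in the usage lines is only an illustrative choice among the views named in this document.

```python
def build_redirect_view_ddl(prod_view: str, target_view: str) -> str:
    """Return BigQuery DDL that makes prod_view a pass-through to target_view.

    Both arguments are fully qualified names: project.dataset.view.
    Assumption: the redirect is a simple SELECT * with no filtering.
    """
    for name in (prod_view, target_view):
        if name.count(".") != 2:
            raise ValueError(f"expected project.dataset.view, got: {name}")
    # Backticks are needed because the project IDs contain hyphens.
    return (
        f"CREATE OR REPLACE VIEW `{prod_view}` AS\n"
        f"SELECT *\n"
        f"FROM `{target_view}`;"
    )


if __name__ == "__main__":
    # Illustrative pairing only; check the repository SQL for the real target.
    print(build_redirect_view_ddl(
        "gcp-sqo-data-dm-ri-p.DS_Datalab.vw_seed_formulation_static",
        "gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_static_prod",
    ))
```

With a view defined this way, the application keeps querying a stable name in the Prod project, and only the view definition needs to change once the restrictions in Prod are lifted.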


Intermediate Views

The intermediate views organize subsets of the Seedcare data before they are combined in the final views, which simplifies the SQL statements for the final views.


Seedcare Test T0 test results in Data Ocean Dev environment: gcp-sqo-data-dm-ri-d.DS_Seed.seed_formulation_t0_test_results

Seedcare Test Stability test results in Data Ocean Dev environment: gcp-sqo-data-dm-ri-d.DS_Seed.seed_formulation_stability_test_results

Seedcare Test static data in Data Ocean Dev environment: gcp-sqo-data-dm-ri-d.DS_Seed.seed_formulation_static


Seedcare Test T0 test results in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_t0_test_results

Seedcare Test Stability test results in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_stability_test_results

Seedcare Test static data in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_static


Seedcare Prod T0 test results in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_t0_test_results_prod

Seedcare Prod Stability test results in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_stability_test_results_prod

Seedcare Prod static data in Data Ocean Test environment: gcp-sqo-data-dm-ri-t.DS_Seed.seed_formulation_static_prod
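The names above follow a regular scheme: the project suffix identifies the Data Ocean environment, the dataset is always DS_Seed, and views that read from the Seedcare Prod datalake carry a `_prod` suffix. A minimal sketch of that scheme follows. The helper `intermediate_view_name` and the dictionary names are hypothetical, and only the combinations listed above are treated as valid.

```python
# Data Ocean project per environment, taken from the view names listed above.
DATA_OCEAN_PROJECTS = {
    "DEV": "gcp-sqo-data-dm-ri-d",
    "TEST": "gcp-sqo-data-dm-ri-t",
}

INTERMEDIATE_DATASET = "DS_Seed"

# Base view name per kind of formulation data.
BASE_VIEW_NAMES = {
    "t0": "seed_formulation_t0_test_results",
    "stability": "seed_formulation_stability_test_results",
    "static": "seed_formulation_static",
}


def intermediate_view_name(data_ocean_env: str, seedcare_source: str, kind: str) -> str:
    """Build the fully qualified name of a formulation intermediate view."""
    project = DATA_OCEAN_PROJECTS[data_ocean_env]
    base = BASE_VIEW_NAMES[kind]
    if seedcare_source == "TEST":
        suffix = ""
    elif seedcare_source == "PROD":
        # Only the Data Ocean Test environment reads the Seedcare Prod datalake.
        if data_ocean_env != "TEST":
            raise ValueError("Seedcare PROD data is only exposed in Data Ocean TEST")
        suffix = "_prod"
    else:
        raise ValueError(f"unknown Seedcare source: {seedcare_source}")
    return f"{project}.{INTERMEDIATE_DATASET}.{base}{suffix}"


if __name__ == "__main__":
    print(intermediate_view_name("TEST", "PROD", "stability"))
```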



Architecture diagrams

The architecture diagrams that illustrate the custom views approach for Seed can be found here: https://lucid.app/lucidchart/c8f38082-e408-4c4e-9667-eb6a0e8a26b5/edit?page=Jrf2xadOjKxW&invitationId=inv_5d4d6c01-38c2-4dd8-92e4-0c5c87c3c7c5#

The image below is an export from the online Lucid diagrams as of 12 December 2025, inserted here for convenience. It illustrates the deployment for the SyDatalab production application as of that date.




