This documentation provides key information on the migration of SP19 Dataiku projects.

This document describes the "AS IS" migration of the Dataiku projects, following the Technical Design and Projects Assessment phases.

Pre-migration: Projects import from Solvay to Syensqo (1)

The goal is to transfer all Dataiku projects from Solvay's DSS to Syensqo's DSS instances (Design). To do so, a bulk import script is used to transfer the Self-Service and Automation projects from Solvay's DSS to Syensqo's DSS.
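The bulk import script itself is not reproduced in this document. The sketch below only illustrates the general idea of such a transfer, assuming both clients behave like `dataikuapi.DSSClient` objects (`get_project`, `export_to_file`, `prepare_project_import` and `execute` are taken from the Dataiku public API client and should be checked against the installed version). The function name `transfer_projects` and the result format are hypothetical, and import options are omitted.

```python
import os
import tempfile


def transfer_projects(source_client, target_client, project_keys, workdir=None):
    """Export each project from the source DSS and import it into the target DSS.

    Hypothetical helper: returns a dict mapping each project key to "ok"
    or to a "failed: ..." message.
    """
    workdir = workdir or tempfile.mkdtemp(prefix="dss_transfer_")
    results = {}
    for key in project_keys:
        archive = os.path.join(workdir, f"{key}.zip")
        try:
            # Export the project archive from the source instance.
            source_client.get_project(key).export_to_file(archive)
            # Upload the archive to the target instance and run the import.
            with open(archive, "rb") as stream:
                target_client.prepare_project_import(stream).execute()
            results[key] = "ok"
        except Exception as exc:
            # Record the failure and continue so one project does not block the batch.
            results[key] = f"failed: {exc}"
    return results
```

With real instances, `source_client` and `target_client` would be two `dataikuapi.DSSClient(host, api_key)` objects, one per DSS. Collecting failures instead of stopping at the first one makes it possible to rerun the script on the failed projects only.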

Project types: there are two types of projects: Automation projects and Self-Service projects.

See this link for more details about the Dataiku Migration Strategies →


Use these links to access global information on the SP19 Dataiku migration (scope, data sources, scenarios, planning, ...) →

 


For more details about the SP19 Dataiku projects migration planning, see the following link →


The main steps of the Dataiku projects migration from the Solvay to the Syensqo physical environment are the following:

Network Requests: Solvay & Syensqo (2)

Opening ports and data flows, granting database access, and obtaining accesses and certificates, ... to establish the Syensqo connections in Syensqo's DSS (Design, UAT and Automation).

The global list of connections of each Solvay DSS instance is extracted in order to request the flow openings and database accesses needed to set up these connections in the Syensqo DSS instances. Here are the links to access these connection lists (Automation and Self-Service projects):
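As an illustration of how an extracted connection list can be turned into a list of flow-opening requests, here is a minimal sketch. The dictionary layout (`name`, `host`, `port`), the instance labels and the function name `build_flow_requests` are assumptions made for the example; they do not describe the actual extraction format.

```python
def build_flow_requests(connections, dss_hosts):
    """Return the sorted, de-duplicated (dss_host, target_host, port) flows to request.

    `connections` is a list of dicts with hypothetical "host" and "port" keys;
    connections without a network endpoint (e.g. local filesystem) are skipped.
    """
    endpoints = {
        (conn["host"], int(conn["port"]))
        for conn in connections
        if conn.get("host") and conn.get("port")
    }
    # One flow per DSS instance and per distinct endpoint.
    return sorted(
        (dss, host, port) for dss in dss_hosts for host, port in endpoints
    )


# Example with made-up values: two connections share the same database server.
requests = build_flow_requests(
    [
        {"name": "sales_db", "host": "db1", "port": 5432},
        {"name": "sales_db_ro", "host": "db1", "port": "5432"},
        {"name": "local_files", "host": None, "port": None},
    ],
    ["design", "uat"],
)
```

De-duplicating on the endpoint rather than on the connection name keeps the request short when several connections point to the same server.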


 

Create Syensqo connection in design/UAT/prod (3)

New connections intended for use by Syensqo were created and tested across the three environments: test (Design), pre-production (UAT), and production (Automation). The login information from these new connections was used to update the configuration of the migrated connections currently used by the project.

This approach was preferred to allow modification of the connection objects only, without impacting the datasets. It also ensured that the original connection names were preserved, in line with the "AS IS" principle.

The newly created connections were used exclusively for the MySQL connections in the Mappy projects. This decision was made because Solvay operates only in pre-production and production environments, while Syensqo includes test, pre-production, and production. Using the same connection names across these differing environments would have led to confusion, as it would imply identical names for connections pointing to different servers.
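The idea of updating a migrated connection with the login information of a new Syensqo connection, while keeping its original name, can be sketched as follows. The dictionary layout and the list of credential keys are hypothetical and do not reproduce the DSS connection schema; the function `apply_new_credentials` is introduced only for this example.

```python
import copy

# Hypothetical set of parameters considered as "login information".
CREDENTIAL_KEYS = ("host", "port", "user", "password")


def apply_new_credentials(migrated, new_connection, keys=CREDENTIAL_KEYS):
    """Return a copy of `migrated` whose credentials come from `new_connection`.

    The connection name and every other parameter are left unchanged, so the
    datasets that reference the connection by name are not impacted.
    """
    updated = copy.deepcopy(migrated)
    for key in keys:
        if key in new_connection["params"]:
            updated["params"][key] = new_connection["params"][key]
    return updated


# Example with made-up values.
migrated = {"name": "SALES_PG", "params": {"host": "old-host", "user": "old", "db": "sales"}}
new_conn = {"name": "SALES_PG_NEW", "params": {"host": "new-host", "user": "svc", "password": "***"}}
result = apply_new_credentials(migrated, new_conn)
```

Working on a copy leaves the original definition available for comparison or rollback.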

Migration (4)

In this step, the Dataiku projects in the Syensqo DEV environment are still connected to their sources in the Solvay environment. All connections of each automation project are tested and fixed across the project's graphical components, code environments (Python code) and plugins.

                     Example: on 7 May, the scenario has not run yet, so the BW data in the project only goes up to 11 April and the project still uses April's values.

                     On 12 May, the project uses the data from May because the scenario has run.
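The example above can be expressed as a small date calculation. This sketch assumes that the scenario runs once a month on the 12th, which is only inferred from the example; the function name `last_run_date` and the year used below are hypothetical.

```python
from datetime import date


def last_run_date(today, run_day=12):
    """Return the most recent monthly scenario run on or before `today`.

    Assumption: the scenario runs once a month, on day `run_day`.
    """
    if today.day >= run_day:
        return date(today.year, today.month, run_day)
    # The run of the current month has not happened yet: fall back to last month.
    if today.month == 1:
        return date(today.year - 1, 12, run_day)
    return date(today.year, today.month - 1, run_day)


# The year is arbitrary; only the day and month matter for the example.
before_run = last_run_date(date(2025, 5, 7))   # data still comes from the April run
after_run = last_run_date(date(2025, 5, 12))   # data comes from the May run
```

This kind of check helps to decide whether a difference observed during testing comes from the migration or simply from a scenario that has not run yet.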

The project was migrated from the Solvay automation node due to a significant delta between the design and automation environments. The design version was also transferred to Syensqo, using the _design suffix to distinguish it. During the migration, the connections were remapped to those intended for design use. Following the import, the Salesforce and Google Sheets connections were manually adjusted to align with the new configuration.

To validate the success of the migration, we relied on the activated scenarios in the production environment. Some zones and branches are no longer maintained and do not run even in the Solvay automation node; these elements were excluded from the testing process. Our validation focused on the following four key scenarios: run_all, send_to_gb, update_sfdc, and update_users.

Three out of the four scenarios executed successfully, with the exception of send_gbr. This scenario involves a Python recipe that generates a dataset using the pyrfc package, which is provided by SAP and is no longer actively maintained. Reinstalling this package required a custom setup beyond a standard pip installation, and identifying the correct configuration took considerable time.

After resolving the pyrfc issue, we encountered a network-related problem. Given that the generated dataset is not used by any other component in the project, a decision was made to exclude it from further consideration.

Warning: 

Executing the full flow or certain zones within the flow may result in a crash due to the presence of recipes that use the same dataset as both input and output, which creates an infinite loop.
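Recipes of this kind can be spotted before a full run is launched. The sketch below works on a simplified, hypothetical description of the flow (a list of dicts with `name`, `inputs` and `outputs`); it does not read the flow from DSS.

```python
def find_self_referencing_recipes(recipes):
    """Return {recipe name: [datasets]} for recipes that read and write the same dataset."""
    flagged = {}
    for recipe in recipes:
        overlap = set(recipe["inputs"]) & set(recipe["outputs"])
        if overlap:
            flagged[recipe["name"]] = sorted(overlap)
    return flagged


# Example with made-up recipe and dataset names.
flow = [
    {"name": "prepare_orders", "inputs": ["orders_raw"], "outputs": ["orders_clean"]},
    {"name": "append_history", "inputs": ["history", "orders_clean"], "outputs": ["history"]},
]
loops = find_self_referencing_recipes(flow)
```

The flagged recipes can then be excluded from the build, or run individually rather than as part of a full flow or zone build.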


The project was migrated from the Solvay automation node due to a significant delta between the design and automation environments. The design version was also transferred to Syensqo, using the _design suffix to distinguish it. During the migration, the connections were remapped to those intended for design use. Following the import, the SAP call in the scripts and the Google Sheets connections were manually adjusted to align with the new configuration.

The scenarios run_monthly_history_D-5, run_monthly_history_D+2, Weekend's report, and Yesterday's report were used to validate the migration. It is important to note that the project contains two distinct flow branches: one for Solvay data and another for Syensqo data. For the purpose of this validation, only the Syensqo branch was considered, and the Solvay branch was excluded. All selected scenarios executed successfully, confirming the integrity of the migration.

Warning: 



In this step, all connections in Syensqo's DSS Design instance are switched to Syensqo resources. To do so, the equivalents of the Solvay connections are set up with the Syensqo resources made available by the flow openings.

Test phase (5)



Deployment (Automation Env.) & technical test (6)

Deployment is carried out only for automation projects, after the transfer from Solvay's Design instance to the Syensqo Design instance and the migration and testing phases in the Syensqo UAT instance. After business validation, projects are moved from the Syensqo UAT instance to the Syensqo Automation instance. All connections are updated to use Syensqo automation resources (data sources: BW, BigQuery, Salesforce, PostgreSQL, etc.).

After deploying projects in automation, the aim of this stage is to ensure that all connections to Syensqo's data sources are operational, except for the BW source.

Once the BW source has been commissioned, the aim is to ensure that all connections to Syensqo's BW data sources are operational, and then to run the project scenarios or planned schedules to test the end-to-end execution of each automation project.

Self-service projects migration & support (7)

As part of the SP19 Dataiku migration, Self-Service projects are migrated from Solvay's Design instance to Syensqo's Design instance. Only the main connections to Syensqo resources are set up by the migration teams, while the remaining tasks must be carried out by the project owners. The migration teams provide back-up for all self-service project owners in setting up their projects and resolving connection problems.

Here is the list of Dataiku Self-Service projects migrated as part of the SP19 Dataiku projects migration.


Go Live (Syensqo) (8)

The Dataiku projects went live on 8 May.

Useful links:

 List of projects that have undergone a change of ownership in the Syensqo DSS after migration from Solvay to Syensqo:

 KeePass: used to store the sensitive data of the Dataiku instances.

LeanIX applications migration (9)

In addition to the Dataiku projects in the Solvay DSS, two Dataiku servers are also involved in Solvay's migration to Syensqo: Spinetta and Tavaux. The migration of these servers at infrastructure level (DNS, PKI certificates, ...) was managed by the support team: "se-server-support.team@solvay.com".

See this link for more details about the LeanIX applications migration:

Spinetta:

On this server, two projects have been reconfigured to use Syensqo resources: GWF_SYSTEM and SPINETTAHYDRAULICBARRIERS_SYENSQO (WebGis).

Project not found:

SPP - Small customer analytics is the only project that was not found on any Dataiku server.