Users can update some of the project variables autonomously. This lets them change certain business rules without any action from a developer or data scientist.

The elements of the Dataiku project used for this process are stored in the "Manual parameters update" Flow zone.



The variables are shared with the users through a dedicated GSheet, following the steps below:

Export of current variables from the Dataiku project to GSheet

The first recipe (1 in the image above) reads the dedicated variables (refer to the External variables update section of the variable documentation) and uses functions from the utils module of the code base to store the parameters in a dataset.

This dataset is then synchronized to a hidden From_Dataiku tab of the GSheet (2).
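As an illustration of how nested variables could be turned into dataset rows before the synchronization, here is a minimal sketch. It is not the actual code of the utils module: the function name `flatten_variables`, the `value` column, the JSON encoding of values, and the `example_threshold` variable are assumptions made for the example. Only the `technical_path` column and the dotted path `model.local_price_recommendation` come from this page.

```python
import json


def flatten_variables(variables, parent_path=""):
    """Flatten a nested dict of variables into a list of rows.

    Each row holds a dotted technical_path and a JSON-encoded value
    (the "value" column name is hypothetical).
    """
    rows = []
    for key, value in variables.items():
        path = f"{parent_path}.{key}" if parent_path else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the dotted path.
            rows.extend(flatten_variables(value, path))
        else:
            rows.append({"technical_path": path, "value": json.dumps(value)})
    return rows


# Hypothetical variables, used only to demonstrate the flattening.
example_variables = {
    "example_threshold": 10,
    "model": {"local_price_recommendation": True},
}
rows = flatten_variables(example_variables)
```

In this sketch every leaf value becomes one row, so a list of such rows can be written to a tabular dataset and synchronized to a sheet tab.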

At the same time, the most recent state of the pricing_features_dataset is exported to the Dataiku_raw_input tab of the GSheet (3). This lets the users see what data lies behind the technical parameter names used in the other tabs. It can also be used to run tests on the data consumed by the models, since pricing_features_dataset contains the features obtained after all the pre-processing steps.

This export step runs at the end of every automated run, through the scenario "8 - External parameters export", to keep the GSheet up to date.

Update of variables in GSheet

In the GSheet, the two main tabs for the users are the following :

The "Dataiku_raw_input" tab can also be used to view the latest data used in the Dataiku project.

Hidden tabs :

File step-by-step process for maintenance

This file contains several formulas that may require maintenance. By default, most cells of the user sheets are locked so that user edits do not break these formulas.

Here are the detailed processing steps of the file :

Single values

Validated families list

Hard boundaries dictionary

Note: hard boundaries are currently not among the variables the users can adapt themselves. Changing them can have a significant impact on the results, which the users cannot monitor properly with the tools they currently have access to (for example, the average number of comparables by product family). They are still displayed for information and can be adapted through technical intervention and validation of the results.

Recommendation cap dictionary

Volume threshold dictionary

Import of GSheet updated variables in Dataiku

To import the updated variables back into Dataiku, we start by synchronizing the "To_Dataiku" tab of the GSheet to a dedicated "Parameters_import_sql" dataset (4).

Then, in a dedicated scenario "0 - External parameters import", run at the beginning of every automated run, a custom Python step uses a function from the code base to read the "Parameters_import_sql" dataset and set the Dataiku variables.
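The import logic could look roughly like the sketch below, which applies imported rows to a dictionary of variables. It is an assumption-based illustration, not the function from the code base: `set_by_path`, `apply_imported_rows`, the `value` column, and the JSON encoding of values are hypothetical. Reading the "Parameters_import_sql" dataset and writing the result back to the Dataiku project variables are left out.

```python
import copy
import json


def set_by_path(variables, technical_path, value):
    """Set a value in a nested dict, following a dotted technical_path."""
    keys = technical_path.split(".")
    node = variables
    for key in keys[:-1]:
        # Create intermediate sub-dictionaries when they are missing.
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def apply_imported_rows(current_variables, rows):
    """Return a copy of the variables updated with the imported rows."""
    updated = copy.deepcopy(current_variables)
    for row in rows:
        set_by_path(updated, row["technical_path"], json.loads(row["value"]))
    return updated


# Hypothetical current variables and imported rows.
current = {"model": {"local_price_recommendation": False, "other": 1}}
imported_rows = [
    {"technical_path": "model.local_price_recommendation", "value": "true"},
    {"technical_path": "example_threshold", "value": "25"},
]
updated = apply_imported_rows(current, imported_rows)
```

Working on a deep copy keeps the current variables untouched until the whole set of rows has been applied, which makes a failed import easier to discard.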


Beware that in the To_Dataiku sheet, the last record must not have a technical_path (column B) located inside a sub-dictionary of the variables (e.g. model.local_price_recommendation, where model is a sub-level of the variables dictionary). This is a limitation of the code used to read the output and update the variables.
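A simple pre-check can flag this situation before the import runs. The sketch below assumes that nested paths can be recognized by the dot separator in technical_path, as in the example above. The function name `last_record_is_nested` is hypothetical and this check is not part of the existing code base.

```python
def last_record_is_nested(rows, separator="."):
    """Return True when the last record's technical_path is a nested path."""
    if not rows:
        return False
    return separator in rows[-1]["technical_path"]


# Hypothetical rows: the last record sits inside the "model" sub-dictionary.
rows_to_check = [
    {"technical_path": "example_threshold"},
    {"technical_path": "model.local_price_recommendation"},
]
needs_reordering = last_record_is_nested(rows_to_check)
```

When the check returns True, moving a top-level record to the end of the To_Dataiku sheet would be one way to respect the limitation.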