The term "CPC" appears often in the following documentation because it is the main level of granularity used throughout the application: most of our datasets have one record per CPC of a given GBU.
The Customer Product Combination (CPC) is the identifier representing a specific product sold to a specific customer.
The definitions of product and customer vary from one GBU to another.
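As a minimal sketch of what "one record per CPC of a given GBU" means in practice, the snippet below models a CPC as a composite key and checks a list of records for duplicates. The field names `gbu`, `customer_id` and `product_id`, the `CPC` type and the `duplicated_cpcs` helper are assumptions made for illustration only; the actual columns depend on how each GBU defines product and customer.

```python
from collections import Counter
from typing import NamedTuple


class CPC(NamedTuple):
    """Hypothetical composite key: one customer paired with one product in a GBU."""
    gbu: str
    customer_id: str
    product_id: str


def duplicated_cpcs(records):
    """Return the CPC keys that appear more than once in `records`."""
    counts = Counter(
        CPC(r["gbu"], r["customer_id"], r["product_id"]) for r in records
    )
    return [key for key, n in counts.items() if n > 1]


# Illustrative rows only; real datasets carry many more columns.
rows = [
    {"gbu": "Novecare", "customer_id": "C001", "product_id": "P10"},
    {"gbu": "Novecare", "customer_id": "C001", "product_id": "P11"},
    {"gbu": "SpP", "customer_id": "C001", "product_id": "P10"},
]

# An empty list means the data respects the one-record-per-CPC granularity.
print(duplicated_cpcs(rows))
```

A check of this kind can be run after each data preparation step to confirm that joins have not multiplied rows.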
For example:
The target is the quantity we are trying to optimize, in this case the unit price of a CPC (Customer Product Combination).
The unit price used in our models is the result of several computations made in our data preparation steps, which can be summarized as follows:
For Novecare:
For SpP:
The rule used to define the unit price is as follows:
Note: as of now, this unit price includes all costs, both fixed and variable.
More details are available below on how these computations are included in our global data preparation flow.
To select the final list of the most relevant price drivers, we collected, built and tested more than 50 features:
These price drivers come from several data sources, described below.
The main data source we currently use is the Pricing Data Lake in BigQuery, in particular the following two datasets:
These datasets include:
All manual data sources are gathered in a GBU-specific spreadsheet, which is under the responsibility of the business.
For Novecare, the following GSheet is used to process the manual inputs for product groupings, manual regions and manufacturing plant groups.
For SpP, the following GSheet is used to process the manual inputs for product taxonomy, and this one for manual regions.
Follow these steps to use this manual process:
Let's take the example of the "Regions" tab:
We can see that new records appeared with countries that do not have a manual_region value for the given product_family_h4.
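The check described above can be sketched as follows: list every row of the "Regions" tab whose `manual_region` is still empty, so the business knows which values to fill in. This is only an illustration. The `country` column name, the `rows_missing_region` helper and the sample values are assumptions; `manual_region` and `product_family_h4` are the fields mentioned above.

```python
def rows_missing_region(rows):
    """Return (country, product_family_h4) pairs with no manual_region yet."""
    missing = []
    for row in rows:
        # Treat None, empty strings and whitespace-only cells as missing.
        region = (row.get("manual_region") or "").strip()
        if not region:
            missing.append((row["country"], row["product_family_h4"]))
    return missing


# Illustrative content of the "Regions" tab (hypothetical values).
regions_tab = [
    {"country": "France", "product_family_h4": "H4_A", "manual_region": "Europe"},
    {"country": "Brazil", "product_family_h4": "H4_A", "manual_region": ""},
    {"country": "Japan", "product_family_h4": "H4_B", "manual_region": None},
]

for country, family in rows_missing_region(regions_tab):
    print(f"manual_region to fill in: {country} / {family}")
```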