Because volume is excluded from the comparable selection, a target and its comparables can differ widely in volume. That is why a dedicated volume adjustment step is needed.
The objective is to adjust the prices of comparables to answer the following question: "What would the price of the comparable CPC be if it were sold at the same volume as the target CPC?"
To do this, we apply the following two methods:
Adjustment function
Example for Specialty Monomers:
- We start from the SHAP value of the volume (please refer to section 4 above).
Each dot here represents a CPC, and the x-axis (SHAP value of cpc_volume_log) ranges from -0.09 to +0.06.
- We add the volume as a second dimension, on the X-axis, to create a scatterplot.
- The SHAP values are now on the Y-axis, on a scale from -0.09 to +0.04.
- Each dot still represents a CPC.
- We can see that the SHAP values decrease as volume increases.
- We find the best fit function to the scatterplot and display it as a curve (in yellow above).
- Finally, we apply the inverse log function (10^x) to the y-axis to obtain a result that we can interpret. The y-axis value is now a price variation due to volume, relative to the average price of the CPCs in the family.
Note: the shape of the curve does not change; only the Y-axis is modified.
- Then, we adjust every comparable using the fit function.
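The steps above can be sketched in Python as follows. This is only an illustrative sketch, not the production implementation: the polynomial form and its degree, the function names, and the synthetic data are all assumptions, since the text only states that a "best fit function" is used. It also assumes that SHAP values are expressed on a log10 price scale (consistent with the 10^x transformation) and that a comparable is adjusted by the ratio of the price factors at the target volume and at its own volume.

```python
import numpy as np

def fit_volume_curve(volume_log, shap_values, degree=2):
    """Fit a curve to the (log-volume, SHAP) scatterplot.

    A degree-2 polynomial is an assumption; the document only says
    that the best-fit function is used.
    """
    coeffs = np.polyfit(volume_log, shap_values, deg=degree)
    return np.poly1d(coeffs)

def volume_price_factor(curve, volume_log):
    """Apply 10^x to the fitted SHAP value to get a price factor
    relative to the family average (assumes a log10 price scale)."""
    return 10.0 ** curve(volume_log)

def adjust_comparable_price(curve, comparable_price,
                            comparable_volume_log, target_volume_log):
    """Re-price a comparable as if it were sold at the target's volume
    (hypothetical formula: ratio of the two price factors)."""
    factor_target = volume_price_factor(curve, target_volume_log)
    factor_comparable = volume_price_factor(curve, comparable_volume_log)
    return comparable_price * factor_target / factor_comparable

# Synthetic data for illustration only: SHAP decreases as volume increases.
rng = np.random.default_rng(0)
volume_log = rng.uniform(1.0, 5.0, size=200)
shap_values = 0.06 - 0.03 * volume_log + rng.normal(0.0, 0.005, size=200)

curve = fit_volume_curve(volume_log, shap_values)

# A comparable sold at a lower volume than the target gets a lower adjusted price.
adjusted = adjust_comparable_price(curve, 100.0,
                                   comparable_volume_log=2.0,
                                   target_volume_log=4.0)
same_volume = adjust_comparable_price(curve, 100.0, 3.0, 3.0)
```

With this formulation, a comparable that already has the same volume as the target is left unchanged, which matches the question the adjustment is meant to answer.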
Group adjustment
We previously mentioned the "cpc_volume_log" feature, which captures the volume of a CPC. We also have a feature named "group_volume_but_cpc_label", which captures the total volume of the CPC's customer group, excluding the volume of the CPC itself.
The size of the entire group is expected to have an impact on the CPC price, which is why this group adjustment step is needed.
This "group_volume_but_cpc_label" variable has 5 modalities:
- 0_one_cpc: the volume of the group for this family, excluding the current CPC, is equal to zero.
This means that the customer's group only buys this CPC, and the adjustment needed is entirely covered by the step presented in section 5.
- Other CPCs, whose groups have sales in the family beyond the CPC itself, are split into 4 bins representing quartiles:
- 1_small: the 25% of CPCs with the smallest group volumes (first quartile)
- 2_medium: the 25% of CPCs in the second quartile
- 3_big: the 25% of CPCs in the third quartile
- 4_top: the 25% of CPCs with the biggest group volumes (fourth quartile)
Once every CPC is placed in its modality, we compute the median SHAP value of "group_volume_but_cpc_label" for every modality. This gives us the estimated impact of the group volume on price.
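The labelling and median computation described above could look like the following sketch. It assumes pandas is available; the function names and the toy data are hypothetical, and ties at quartile boundaries are broken by ranking, which is a choice made here for the example rather than something stated in the text.

```python
import numpy as np
import pandas as pd

QUARTILE_LABELS = ["1_small", "2_medium", "3_big", "4_top"]

def label_group_volume(group_volume_but_cpc):
    """Assign each CPC to one of the five modalities.

    CPCs whose group has no other volume get "0_one_cpc"; the others
    are split into quartiles of group volume.
    """
    volume = pd.Series(group_volume_but_cpc, dtype=float)
    labels = pd.Series("0_one_cpc", index=volume.index, dtype=object)
    positive = volume > 0
    # Ranking first guarantees unique bin edges even with tied volumes
    # (an assumption of this sketch).
    ranks = volume[positive].rank(method="first")
    binned = pd.qcut(ranks, q=4, labels=QUARTILE_LABELS)
    labels.loc[positive] = binned.astype(str)
    return labels

def median_shap_by_modality(labels, shap_values):
    """Median SHAP value of the group-volume feature for every modality."""
    shap = pd.Series(np.asarray(shap_values, dtype=float), index=labels.index)
    return shap.groupby(labels).median()

# Toy data for illustration only.
group_volume = [0, 0, 10, 20, 30, 40, 50, 60, 70, 80]
shap_group = [0.03, 0.05, 0.02, 0.01, 0.00, 0.01, -0.01, -0.02, -0.03, -0.05]

labels = label_group_volume(group_volume)
medians = median_shap_by_modality(labels, shap_group)
```

The resulting `medians` series holds one value per modality, which is the estimated impact of group volume on price for CPCs in that modality.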

