Chapter 5 Ceteris Paribus Profiles

See a short, 100-second introduction to the package on YouTube.

NOTE: This chapter is still a work in progress. Expect changes.

In this chapter we introduce a model-agnostic method for model exploration based on the Ceteris Paribus principle. Ceteris paribus is a Latin phrase meaning “other things held constant” or “all else unchanged”. This universal method helps to understand both the local and the global structure of a model, helps to compare several models, and is useful for diagnosing model fit.
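The principle can be illustrated with a minimal sketch: take one observation, vary a single variable over a grid of values, keep every other variable fixed, and record the model prediction at each grid point. The model `toy_price_model`, the helper `cp_profile`, and the variable names below are hypothetical stand-ins chosen for illustration; this is not the ceterisParibus API.

```python
from typing import Callable, Dict, List

Observation = Dict[str, float]


def toy_price_model(x: Observation) -> float:
    """Hypothetical stand-in for a fitted model (price per square meter)."""
    # The interaction term makes the profile depend on the fixed value of `floor`.
    return (5000.0
            - 10.0 * x["surface"]
            - 50.0 * x["floor"]
            + 0.5 * x["surface"] * x["floor"])


def cp_profile(predict: Callable[[Observation], float],
               observation: Observation,
               variable: str,
               grid: List[float]) -> List[float]:
    """Predictions obtained by changing only `variable`, all else unchanged."""
    profile = []
    for value in grid:
        modified = dict(observation)   # copy, so the original stays intact
        modified[variable] = value     # change one variable only
        profile.append(predict(modified))
    return profile


observation = {"surface": 50.0, "floor": 3.0}
grid = [20.0, 50.0, 80.0]
profile = cp_profile(toy_price_model, observation, "surface", grid)

for value, prediction in zip(grid, profile):
    print(f"surface={value:5.1f}  prediction={prediction:7.1f}")
```

The resulting list of predictions, plotted against the grid, is the Ceteris Paribus profile of this observation for the variable `surface`.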

Some specific versions of this method appear in the literature under different names, such as Partial Dependence Plots (Greenwell 2017), Individual Conditional Expectation Plots (ICEPlots), or Accumulated Local Effects Plots (Apley 2017). Here we present a uniform yet more extensible approach. We adopt a new name for this approach, Ceteris Paribus Profiles, as it better describes the underlying idea.

Greenwell, Brandon M. 2017. “Pdp: An R Package for Constructing Partial Dependence Plots.” The R Journal 9 (1): 421–36. https://journal.r-project.org/archive/2017/RJ-2017-016/index.html.

Apley, Dan. 2017. ALEPlot: Accumulated Local Effects (Ale) Plots and Partial Dependence (Pd) Plots. https://CRAN.R-project.org/package=ALEPlot.

This chapter is divided into six sections. The first three sections introduce Ceteris Paribus profiles in different contexts: exploration for a single observation, exploration of the local structure of a model, and exploration of its global structure.

  • Section 5.1 introduces Ceteris Paribus profiles for a single observation. This section introduces the basic concepts and notation behind CP profiles.

  • Section 5.2 shows how to combine a set of CP profiles around a single data point in order to inspect the local structure of a model. This makes it possible to assess model stability, additivity, and local fit.

  • Section 5.3 shows how to combine CP profiles for all observations in order to inspect global features of a model.
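As a rough sketch of what combining profiles can look like, the code below computes one CP profile per observation and then averages them pointwise over the grid. That average is the partial dependence profile. The toy model, the three observations, and the helper names are hypothetical and only serve the illustration. A local view would use the same steps restricted to the neighbours of a chosen observation.

```python
from typing import Callable, Dict, List

Observation = Dict[str, float]


def toy_price_model(x: Observation) -> float:
    """Hypothetical stand-in for a fitted model (price per square meter)."""
    return (5000.0
            - 10.0 * x["surface"]
            - 50.0 * x["floor"]
            + 0.5 * x["surface"] * x["floor"])


def cp_profile(predict: Callable[[Observation], float],
               observation: Observation,
               variable: str,
               grid: List[float]) -> List[float]:
    """Predictions for one observation with only `variable` changed."""
    return [predict({**observation, variable: value}) for value in grid]


def average_profiles(profiles: List[List[float]]) -> List[float]:
    """Pointwise mean of CP profiles that share the same grid."""
    return [sum(column) / len(column) for column in zip(*profiles)]


# Hypothetical data: three apartments described by two variables.
data = [
    {"surface": 30.0, "floor": 1.0},
    {"surface": 50.0, "floor": 3.0},
    {"surface": 70.0, "floor": 5.0},
]
grid = [20.0, 80.0]

# One profile per observation (the individual curves) ...
profiles = [cp_profile(toy_price_model, obs, "surface", grid) for obs in data]
# ... and their average (the global, partial dependence curve).
pd_profile = average_profiles(profiles)

for obs, profile in zip(data, profiles):
    print(obs, profile)
print("average:", pd_profile)
```

Because the toy model contains an interaction between `surface` and `floor`, the individual profiles have different slopes. The average hides this, which is one reason to look at the individual profiles as well.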

Ceteris Paribus profiles calculated for different subsets of data or for different models can be combined, which opens new opportunities for model exploration. In the following three sections we present the three most common combinations of CP profiles.

  • Section 5.4 shows how CP profiles calculated for different subsets of data can be aligned in a single plot.

  • Section 5.5 shows how CP profiles can be used for multiclass models or models with a multivariate response.

  • Section 5.6 shows how CP profiles can be used for model cross-comparisons. The examples cover single-observation, local, and global explanations.

5.0.0.1 Apartments Dataset

In this chapter we show examples for three predictive models trained on the apartments dataset from the DALEX package: a Random Forest model (flexible but biased), a Support Vector Machine model (large variance at the boundaries), and a Linear Model (stable but not very flexible). The presented examples are for regression (prediction of the price per square meter), but CP profiles may be used in the same way for classification.

For these models we use DALEX explainers created with the explain() function. These explainers wrap models, predict functions, and validation data.
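To make the idea of such a wrapper concrete, here is a minimal sketch of an object that bundles a model, a predict function, and data behind one interface, so that the same profile code works for models of different kinds. The `Explainer` class, both toy models, and all values below are hypothetical illustrations. This is not the DALEX API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Observation = Dict[str, float]


@dataclass
class Explainer:
    """Hypothetical wrapper: a model plus everything needed to query it."""
    model: Any
    predict_function: Callable[[Any, Observation], float]
    data: List[Observation]
    label: str

    def predict(self, observation: Observation) -> float:
        return self.predict_function(self.model, observation)


def predict_linear(model: Dict[str, float], x: Observation) -> float:
    return model["intercept"] + model["surface"] * x["surface"]


def predict_step(model: Dict[str, Any], x: Observation) -> float:
    # Piecewise-constant response, loosely mimicking a tree-based model.
    for threshold, value in model["steps"]:
        if x["surface"] < threshold:
            return value
    return model["default"]


def cp_profile(explainer: Explainer, observation: Observation,
               variable: str, grid: List[float]) -> List[float]:
    """CP profile computed through the wrapper, whatever the model type."""
    return [explainer.predict({**observation, variable: v}) for v in grid]


validation = [{"surface": 50.0, "floor": 3.0}]
explainers = [
    Explainer({"intercept": 5000.0, "surface": -10.0},
              predict_linear, validation, "linear"),
    Explainer({"steps": [(40.0, 4700.0), (70.0, 4400.0)], "default": 4100.0},
              predict_step, validation, "step"),
]

grid = [20.0, 50.0, 80.0]
profiles = {e.label: cp_profile(e, e.data[0], "surface", grid)
            for e in explainers}
for label, profile in profiles.items():
    print(label, profile)
```

Keeping the predict function inside the wrapper is what allows profiles from different models to be computed and compared with identical code.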

Examples presented in this chapter are generated with the ceterisParibus package in version 0.3.0.