# 11 Ceteris-paribus Profiles

## 11.1 Introduction

Chapters 7 – 10 are related to the decomposition of a prediction \(f(x)\) into parts linked with particular variables. In this chapter we focus on methods that analyse an effect of selected variables on model response. These techniques may be used for sensitivity analysis, how stable is the model response, or to what-if analysis, how model response would change if input is changes. It is important to remember that the what-if analysis is performed not in the sense of causal modeling, but in the sense of model exploration. We need causal model to do causal inference for the real-world phenomena. Here we focus on explanatory analysis of the model behaviour. To show the difference between these two things, think about a model for survival for lung-cancer patients based on some treatment parameters. We need causal model to say how the survival would change if the treatment is changed. Techniques presented in this chapter will explore how the model result will change if the treatment is changed.

*Ceteris paribus* is a Latin phrase meaning “other things held constant” or “all else unchanged.” In this chapter, we introduce a technique for model exploration based on the *Ceteris paribus* principle. In particular, we examine the influence of each explanatory variable, assuming that effects of all other variables are unchanged. The main goal is to understand how changes in a single explanatory variable affects model predictions.

Explanation tools (explainers) presented in this chapter are linked to the second law introduced in Section 1.3, i.e. the law of “Prediction’s speculation.” This is why the tools are also known as *What-If model analysis* or *Individual Conditional Expectations* (Goldstein et al. 2015). It appears that it is easier to understand how a black-box model is working if we can explore the model by investigating the influence of explanatory variables separately, changing one at a time.

## 11.2 Intuition

Ceteris-paribus profiles show how the model response would change if a single variable is changed. For example, panel A of Figure 11.1 presents response (prediction) surface for the `titanic_lmr_v6`

model for two explanatory variables, *age* and *class*, from the *titanic* dataset (see Section 5.1). We are interested in the change of the model prediction induced by each of the variables. Toward this end, we may want to explore the curvature of the response surface around a single point with *age* equal to 47 and *class* equal to “1st,” indicated in the plot. Ceteris-paribus (CP) profiles are one-dimensional profiles that examine the curvature across each dimension, i.e., for each variable. Panel B of Figure 11.1 presents the CP profiles corresponding to *age* and *class*. Note that, in the CP profile for *age*, the point of interest is indicated by the black dot. In essence, a CP profile shows a conditional expectation of the dependent variable (response) for the particular explanatory variable.

CP belongs to the class of techniques that examine local curvature of the model response surface. Other very popular technique from this class called LIME is presented in Chapter 10. The difference between these two methods lies in the fact that LIME approximates the model of interest locally with a simpler glass-box model. Usually, the LIME model is sparse, i.e., contains fewer explanatory variables. Thus, one needs to investigate a plot across a smaller number of dimensions. On the other hand, the CP profiles present conditional predictions for a single variable and, in most cases, are easier to interpret. More detailed comparison of these techniques is presented in the Chapter 14.

## 11.3 Method

In this section, we introduce more formally one-dimensional CP profiles.

Recall (see Section 2.3) that we use \(x_i\) to refer to the vector corresponding to the \(i\)-th observation in a dataset. Let \(x^{j}_{*}\) denote the \(j\)-th element of \(x_{*}\), i.e., the \(j\)-th explanatory variable. We use \(x^{-j}_{*}\) to refer to a vector resulting from removing the \(j\)-th element from \(x_{*}\). By \(x^{j|=z}_{*}\), we denote a vector resulting from changing the value of the \(j\)-th element of \(x_{*}\) to (a scalar) \(z\).

We define a one-dimensional CP profile \(h()\) for model \(f()\), the \(j\)-th explanatory variable, and point \(x_*\) as follows:

\[\begin{equation} h^{f,j}_{x_*}(z) := f(x_*^{j|=z}). \tag{11.1} \end{equation}\]

CP profile is a function that provides the dependence of the approximated expected value (prediction) of \(Y\) on the value \(z\) of the \(j\)-th explanatory variable. Note that, in practice, \(z\) is taken to go through the entire range of values typical for the variable, while values of all other explanatory variables are kept fixed at the values specified by \(x_*\).

Note that in the situation when only a single model is considered, we will skip the model index and we will denote the CP profile for the \(j\)-th explanatory variable and the point of interest \(x_*\) by \(h^{j}_{x_*}(z)\).

## 11.4 Example: Titanic

For continuous explanatory variables, a natural way to represent the CP function is to use a profile plot similar to the ones presented in Figure 11.3. In the figure, the dot on the curves marks an instance prediction, i.e., prediction \(f(x_*)\) for a single observation \(x_*\). The curve itself shows how the prediction would change if the value of a particular explanatory variable changed.

Figure 11.3 presents CP profiles for the *age* variable in the logistic regression `titanic_lmr_v6`

and random forest model `titanic_rf_v6`

for the Titanic dataset (see Sections 5.1.2 and 5.1.3, respectively). It is worth observing that the profile for the logistic regression model is smooth, while the one for the random forest model shows more variability. For this instance (observation), the prediction for the logistic regression model would increase substantially if the value of *age* became lower than 20. For the random forest model, a substantial increase would be obtained if *age* became lower than 13 or so.

For a categorical explanatory variable, a natural way to represent the CP function is to use a barplot similar to the ones presented in Figure 11.4. The barplots in Figure 11.3 present CP profiles for the *class* variable in the logistic regression and random forest models for the Titanic dataset (see Sections 5.1.2 and 5.1.3, respectively). For this instance (observation), the predicted probability for the logistic regression model would decrease substantially if the value of *class* changed to “2nd”. On the other hand, for the random forest model, the largest change would be marked if *class* changed to “restaurant staff”.

Usually, black-box models contain a large number of explanatory variables. However, CP profiles are legible even for tiny subplots, created with techniques like sparklines or small multiples (Tufte 1986). In this way we can display a large number of profiles at the same time keeping profiles for consecutive variables in separate panels, as shown in Figure 11.5 for the random forest model for the Titanic dataset. It helps if these panels are ordered so that the most important profiles are listed first. We discuss a method to assess the importance of CP profiles in the next chapter.

## 11.5 Pros and cons

One-dimensional CP profiles, as presented in this chapter, offer a uniform, easy to communicate and extendable approach to model exploration. Their graphical representation is easy to understand and explain. It is possible to show profiles for many variables or models in a single plot. CP profiles are easy to compare, thus we can juxtapose two or more models to better understand differences between models. We can also compare two or more instances to better understand model stability. CP profiles are also a useful tool for sensitivity analysis.

But. There are several issues related to the use of the CP profiles. If explanatory variables are correlated, then changing one variable implies a change in the other. In such case, the application of the *Ceteris paribus* principle may lead to unrealistic settings, as it is not possible to keep one variable fixed while varying the other one. For example, apartment’s price prediction features like surface and number of rooms are correlated thus it is unrealistic to consider very small apartments with extremely large number of rooms. Special cases are interactions, which require the use of two-dimensional CP profiles that are more complex than one-dimensional ones. Also, in case of a model with hundreds or thousands of variables, the number of plots to inspect may be daunting. Finally, while barplots allow visualization of CP profiles for factors (categorical explanatory variables), their use becomes less trivial in case of factors with many nominal (unordered) categories (like, for example, a ZIP-code).

## 11.6 Code snippets for R

In this section, we present key features of the R package `DALEX`

which is a part of `DrWhy.AI`

universe and covers all methods presented in this chapter. Note that presented functions in fact are wrappers to package `ingredients`

(Biecek et al. 2019).

Note that there are also other R packages that offer similar functionality, like `condvis`

(O’Connell, Hurley, and Domijan 2017), `pdp`

(Greenwell 2017), `ICEbox`

(Goldstein et al. 2015), `ALEPlot`

(Apley 2018), `iml`

(Molnar, Bischl, and Casalicchio 2018).

For illustration, we use two classification models developed in Chapter 5.1, namely the logistic regression model `titanic_lmr_v6`

(Section 5.1.2) and the random forest model `titanic_rf_v6`

(Section 5.1.3). They are developed to predict the probability of survival after sinking of Titanic. Instance-level explanations are calculated for a single observation `henry`

- a 47 years old male passenger that travelled in the 1st class.

`DALEX`

explainers for both models and the `henry`

data frame are retrieved via the `archivist`

hooks as listed in Section 5.1.8.

```
library("rms")
explain_lmr_v6 <- archivist::aread("pbiecek/models/34e19")
library("randomForest")
explain_rf_v6 <- archivist::aread("pbiecek/models/6ed54")
library("DALEX")
henry <- archivist::aread("pbiecek/models/a6538")
henry
```

```
## class gender age sibsp parch fare embarked
## 1 1st male 47 0 0 25 Cherbourg
```

### 11.6.1 Basic use of the `individual_profile`

function

The easiest way to create and plot CP profiles is to call `individual_profile()`

function and then the generic `plot()`

function. By default, profiles for all variables are being calculated and all numeric features are being plotted. One can limit the number of variables that should be considered with the `variables`

argument.

To obtain CP profiles, the `individual_profile()`

function requires the explainer-object and the instance data frame as arguments. As a result, the function yields an object of the class `ceteris_paribus_explainer`

. It is a data frame with model predictions.

```
library("DALEX")
cp_titanic_rf <- individual_profile(explain_rf_v6,
new_observation = henry)
cp_titanic_rf
```

```
## Top profiles :
## class gender age sibsp parch fare embarked _yhat_ _vname_
## 1 1st male 47 0 0 25 Cherbourg 0.246 class
## 1.1 2nd male 47 0 0 25 Cherbourg 0.054 class
## 1.2 3rd male 47 0 0 25 Cherbourg 0.100 class
## 1.3 deck crew male 47 0 0 25 Cherbourg 0.454 class
## 1.4 engineering crew male 47 0 0 25 Cherbourg 0.096 class
## 1.5 restaurant staff male 47 0 0 25 Cherbourg 0.092 class
## _ids_ _label_
## 1 1 Random Forest
## 1.1 1 Random Forest
## 1.2 1 Random Forest
## 1.3 1 Random Forest
## 1.4 1 Random Forest
## 1.5 1 Random Forest
##
##
## Top observations:
## class gender age sibsp parch fare embarked _yhat_ _label_ _ids_
## 1 1st male 47 0 0 25 Cherbourg 0.246 Random Forest 1
```

To obtain a graphical representation of CP profiles, the generic `plot()`

function can be applied to the data frame returned by the `individual_profile()`

function. It returns a `ggplot2`

object that can be processed further if needed. In the examples below, we use the `ggplot2`

functions, like `ggtitle()`

or `ylim()`

, to modify plot’s title or the range of the Y-axis.

The resulting plot can be enriched with additional data by applying functions `ingredients::show_rugs`

(adds rugs for the selected points), `ingredients::show_observations`

(adds dots that shows observations), or `ingredients::show_aggreagated_profiles`

. All these functions can take additional arguments to modify size, color, or linetype.

Below we show an R snippet that can be used to replicate plots presented in the upper part of Figure 11.5.

```
library("ggplot2")
plot(cp_titanic_rf, variables = c("age", "fare")) +
ggtitle("Ceteris Paribus Profile",
"For the random forest model and the titanic dataset")
```

By default, all numerical variables are plotted.
To plot CP profiles for categorical variables, we have got to add the `variable_type = "categorical"`

argument to the `plot()`

function. The code below an be used to recreate the right-hand-side plot from Figure 11.4.

```
plot(cp_titanic_rf, variables = c("class", "embarked"),
variable_type = "categorical") +
ggtitle("Ceteris Paribus profile",
"For the random forest model and the titanic dataset")
```

### 11.6.2 Advanced use of the `individual_profile`

function

The `individual_profile()`

is a very flexible function. To better understand how it can be used, we briefly review its arguments.

`x`

,`data`

,`predict_function`

,`label`

- information about a model. If`x`

is created with the`DALEX::explain`

function, then other arguments are extracted from`x`

; this is how we use the function in this chapter. Otherwise, we have got to specify directly the model, the validation data, the predict function, and the model label.`new_observation`

- instance (one or more), for which we want to calculate CP profiles. It should be a data frame with same variables as in the validation data.`y`

- observed value of the dependent variable for`new_observation`

. The use of this argument is illustrated in Section 13.1.`variables`

- names of explanatory variables, for which CP profiles are to be calculated. By default, the profiles will be constructed for all variables, which may be time consuming.`variable_splits`

- a list of values for which CP profiles are to be calculated. By default, these are all values for categorical variables. For continuous variables, uniformly-placed values are selected; one can specify the number of the values with the`grid_points`

argument (the default is 101).

The code below allows to obtain the plots in the upper part of Figure 11.5. The argument `variable_splits`

specifies the variables (`age`

and `fare`

) for which CP profiles are to be calculated, together with the list of values at which the profiles are to be evaluated.

```
cp_titanic_rf <- individual_profile(explain_rf_v6, henry,
variable_splits = list(age = seq(0, 70, 0.1),
fare = seq(0, 100, 0.1)))
```

```
plot(cp_titanic_rf, variables = c("age", "fare")) +
ylim(0, 1) +
ggtitle("Ceteris Paribus profile",
"For the random forest model and the titanic dataset")
```

To enhance the plot, additional functions can be used. The generic `plot()`

function creates a `ggplot2`

object with a single `geom_line`

layer. Function `show_observations`

adds `geom_point`

layer, `show_rugs`

adds `geom_rugs`

, while `show_profiles`

adds another `geom_line`

. All these functions take, as the first argument, an object created with the `ceteris_paribus`

function. They can be combined freely to superimpose profiles for different models or observations.

In the example below, we present the code to create CP profiles for two passengers, `henry`

and `johny_d`

. Their profiles are included in a plot presented in Figure 11.9. We use the `scale_color_manual`

function to add names of passengers to the plot, and to control colors and positions.

```
johny_d <- archivist::aread("pbiecek/models/e3596")
cp_titanic_rf2 <- variable_profile(explain_rf_v6, rbind(henry, johny_d))
```

```
library(ingredients)
plot(cp_titanic_rf2, color = "_ids_", variables = c("age", "fare")) +
scale_color_manual(name = "Passenger:", breaks = 1:2,
values = c("#4378bf", "#8bdcbe"),
labels = c("henry" , "johny_d")) +
ggtitle("Ceteris Paribus profile",
"For the random forest model and the titanic dataset")
```

### 11.6.3 Champion-challenger analysis

One of the most interesting uses of the explainers is comparison of CP profiles for two or more of models.

To illustrate this possibility, first, we have go to construct profiles for the models. In our illustration, for the sake of clarity, we limit ourselves just to two models: the logistic regression and random forest models for the Titanic data. Moreover, we only consider the `age`

and `fare`

variables. We use `henry`

as the instance, for which predictions are of interest.

```
cp_titanic_rf <- ceteris_paribus(explain_rf_v6, henry)
cp_titanic_lmr <- ceteris_paribus(explain_lmr_v6, henry)
```

Subsequently, we construct the plot. The result is shown in Figure 11.10. Predictions for `henry`

are slightly different, logistic regression returns in this case higher predictions then random forest. For `age`

variable profiles of both models are similar, in both models we see decreasing dependency. While for `fare`

the logistic regression model is slightly positive while random forest is negative. The larger the `fare`

the larger is difference between these models. Such analysis helps us to which degree different models agree on what if scenarios.

Note that every `plot`

and `show_*`

function can take a collection of explainers as arguments. Profiles for different models are included in a single plot. In the presented R snippet, models are color-coded with the help of the argument `color = "_label_"`

, where `_label_`

refers to the name of the column in the CP explainer that contains the model label.

```
plot(cp_titanic_rf, cp_titanic_lmr, color = "_label_",
variables = c("age", "fare")) +
ggtitle("Ceteris Paribus Profiles for Henry")
```

### References

Apley, Dan. 2018. *ALEPlot: Accumulated Local Effects (Ale) Plots and Partial Dependence (Pd) Plots*. https://CRAN.R-project.org/package=ALEPlot.

Biecek, Przemyslaw, Hubert Baniecki, Adam Izdebski, and Katarzyna Pekala. 2019. *ingredients: Effects and Importances of Model Ingredients*.

Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015. “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation.” *Journal of Computational and Graphical Statistics* 24 (1): 44–65. https://doi.org/10.1080/10618600.2014.907095.

Greenwell, Brandon M. 2017. “pdp: An R Package for Constructing Partial Dependence Plots.” *The R Journal* 9 (1): 421–36. https://journal.r-project.org/archive/2017/RJ-2017-016/index.html.

Molnar, Christoph, Bernd Bischl, and Giuseppe Casalicchio. 2018. “iml: An R package for Interpretable Machine Learning.” *Joss* 3 (26). Journal of Open Source Software: 786. https://doi.org/10.21105/joss.00786.

O’Connell, Mark, Catherine Hurley, and Katarina Domijan. 2017. “Conditional Visualization for Statistical Models: An Introduction to the Condvis Package in R.” *Journal of Statistical Software, Articles* 81 (5): 1–20. https://doi.org/10.18637/jss.v081.i05.

Tufte, Edward R. 1986. *The Visual Display of Quantitative Information*. Cheshire, CT, USA: Graphics Press.