Chapter 24 Ceteris-paribus Two-dimensional Profiles - a Tool for Pairwise Interactions

24.1 Introduction

The definition of Ceteris-paribus (CP) profiles, given in Section 6, may be easily extended to two or more explanatory variables. Also, the definition of the variable-importance measure \(vip^{CP}_j(x^*)\) has a straightforward extension to a larger number of variables. These extensions are useful for identifying and visualizing pairwise interactions between explanatory variables.

24.2 Intuition

Figure 24.1 presents the response (prediction) surface for the titanic_lmr_v6 model for two explanatory variables, age and sibsp, from the titanic dataset (see Section 4.1). We are interested in the changes of the model's predictions induced jointly by the two variables.

The 2D profile shows how the model's predictions change when the values of the two variables are varied simultaneously, something that cannot be seen in two separate one-dimensional CP profiles. In particular, if the joint effect of the two variables were additive, the shape of the profile along one variable would not depend on the value of the other. A profile whose shape along age changes with the value of sibsp therefore indicates an interaction between the two variables.

Figure 24.1: Ceteris-paribus profile for the `age` and `sibsp` explanatory variables for the `titanic_lmr_v6` model.

24.3 Method

The definition of one-dimensional CP profiles (see Section 6.3) may be easily extended to two or more explanatory variables. A two-dimensional CP profile for model \(f()\), explanatory variables \(j\) and \(k\), and point \(x^*\) is defined as follows:

\[ CP^{f, (j,k), x^*}(z_1, z_2) \equiv f(x^*|^{(j,k)} = (z_1,z_2)). \]

Thus, a two-dimensional (2D) CP profile is a function that describes the dependence of the model's prediction for the instance of interest on the values \(z_1\) and \(z_2\) of the \(j\)-th and \(k\)-th explanatory variables, respectively. The values of \(z_1\) and \(z_2\) are chosen to span the range of values typical for the two variables, while all other explanatory variables are kept fixed at the values given by \(x^*\).
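To make the definition concrete, a 2D CP profile can be computed directly with a few lines of base R. The sketch below uses a simple linear model fitted to the built-in mtcars data, not the titanic_lmr_v6 model; the variable names are purely illustrative.

```r
# A 2D CP profile is just the model's predictions over a grid of values
# of two variables, with all remaining variables fixed at the instance x*.
# Illustrative model on built-in data, not the titanic_lmr_v6 model.
model  <- lm(mpg ~ hp * wt + qsec, data = mtcars)
x_star <- mtcars[1, ]  # the instance of interest x*

# grid of (z1, z2) values spanning the observed ranges of hp and wt
grid <- expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 25),
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 25)
)

# all other explanatory variables are kept at their x* values
other <- setdiff(names(mtcars), c("hp", "wt", "mpg"))
profile_2d <- cbind(grid, x_star[rep(1, nrow(grid)), other])
profile_2d$prediction <- predict(model, newdata = profile_2d)
# profile_2d now tabulates CP^{f,(hp,wt),x*}(z1, z2) on a 25 x 25 grid
```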

The corresponding variable-importance measure is defined as follows: \[ vip^{CP}_{j,k}(x^*) = \int_{\mathcal R}\int_{\mathcal R} |CP^{f,(j,k),x^*}(z_1,z_2) - f(x^*)|\, g^{j,k}(z_1,z_2)\,dz_1 dz_2 = E_{X_j,X_k}\left[|CP^{f,(j,k),x^*}(X_j,X_k) - f(x^*)|\right], \] where the expected value is taken over the joint distribution of the \(j\)-th and \(k\)-th explanatory variables.

Such multi-dimensional extensions are useful for checking whether, for instance, the model involves interactions. In particular, the presence of pairwise interactions may be detected with 2D CP profiles.
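The importance measure can likewise be approximated numerically: the double integral becomes an average of \(|CP^{f,(j,k),x^*}(z_1,z_2) - f(x^*)|\) over pairs \((z_1,z_2)\) drawn from the empirical joint distribution of the two variables. A minimal sketch, again with an illustrative model on built-in data:

```r
# Monte Carlo estimate of vip^CP_{j,k}(x*): average the absolute change
# in prediction over (z1, z2) pairs sampled from the empirical joint
# distribution of the two variables. Illustrative model and variables.
model  <- lm(mpg ~ hp * wt + qsec, data = mtcars)
x_star <- mtcars[1, ]
f_x    <- predict(model, newdata = x_star)  # f(x*)

# sampling whole rows preserves the joint distribution of (hp, wt)
set.seed(1)
idx <- sample(nrow(mtcars), 1000, replace = TRUE)
perturbed <- x_star[rep(1, length(idx)), ]
perturbed$hp <- mtcars$hp[idx]
perturbed$wt <- mtcars$wt[idx]

vip_hp_wt <- mean(abs(predict(model, newdata = perturbed) - f_x))
```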

24.4 Example: Titanic data

A natural way to visualize 2D CP profiles is to use a heat map for each pair of explanatory variables, as in Figure 24.2.

Figure 24.2: Two-dimensional ceteris-paribus profiles for all pairs of explanatory variables for the `titanic_lmr_v6` model. The black cross marks the point of interest.

If the number of pairs of explanatory variables is small or moderate, then it is possible to present 2D CP profiles for all pairs of variables.

If the number of pairs is large, we can use the variable importance measure to order the pairs based on their importance and select the most important pairs for purposes of illustration.


24.5 Pros and cons

Two-dimensional CP profiles can be used to identify the presence and the influence of pairwise interactions in a model. However, for models with a large number of explanatory variables, the number of pairs will be large. Consequently, inspection of all possible 2D CP profiles may be challenging. Moreover, the profiles are more difficult to read and interpret than the 1D CP profiles.


24.6 Code snippets for R

In this section, we present key features of the R package ingredients (Biecek 2019a), which is a part of the DALEXverse and covers all methods presented in this chapter. More details and examples can be found at https://modeloriented.github.io/ingredients/.

There are also other R packages that offer similar functionality, like condvis (O’Connell, Hurley, and Domijan 2017) or ICEbox (Goldstein et al. 2015b).

We use the random forest model titanic_rf_v6 developed for the Titanic dataset (see Section @ref(model_titanic_rf)) as the example. Recall that we deal with a binary classification problem: we want to predict the probability of survival for a selected passenger.

First, we have to create a wrapper (explainer) around the model (see Section 6.6).
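For instance, the wrapper may be constructed with the DALEX::explain() function. To keep the snippet self-contained, the sketch below re-fits a random forest on the titanic_imputed data shipped with the DALEX package; in the book, titanic_rf_v6 comes from the earlier model-building chapter, and the explainer's name is our choice.

```r
library("randomForest")
library("DALEX")

# Re-fit a random forest similar to titanic_rf_v6 on the titanic_imputed
# data from DALEX (illustrative stand-in for the book's model object).
set.seed(1)
titanic_rf_v6 <- randomForest(factor(survived) ~ gender + age + class +
                                embarked + fare + sibsp + parch,
                              data = titanic_imputed)

# the wrapper (explainer) around the model
explain_rf_v6 <- explain(model = titanic_rf_v6,
                         data  = titanic_imputed[, colnames(titanic_imputed) != "survived"],
                         y     = titanic_imputed$survived,
                         label = "Random Forest v6")
```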

To calculate 2D CP profiles, we first have to select the observation of interest. Let us use henry as the instance of interest.


2D profiles are calculated by applying the ceteris_paribus_2d() function to the wrapper object. By default, all pairs of continuous explanatory variables are used, but one can limit the set of variables under consideration through the variables argument.
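Assuming the explainer is called explain_rf_v6 and henry is a one-row data frame with the instance of interest (both names follow the earlier chapters), the call may look as follows; the variables selection is illustrative.

```r
library("ingredients")

# 2D CP profiles for pairs of the selected continuous variables;
# henry is a one-row data frame with the instance of interest
cp_2d <- ceteris_paribus_2d(explain_rf_v6,
                            observation = henry,
                            variables = c("age", "fare", "sibsp"))
```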

As a result, we obtain an object of class ceteris_paribus_2d_explainer with overloaded print() and plot() functions. We can use the latter to obtain plots of the constructed 2D CP profiles.
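Assuming the object created above is called cp_2d, the plots may be obtained as follows; plot() returns a ggplot2 object, so it can be customized with further ggplot2 layers.

```r
library("ggplot2")

# heat maps of the 2D CP profiles; the result is a ggplot object,
# so additional layers (e.g., a title) can be added with "+"
plot(cp_2d) +
  ggtitle("2D ceteris-paribus profiles for the instance of interest")
```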


The plot suggests that, among the considered pairs of variables, age and sibsp have an important joint influence on the model's predictions.


References

Biecek, Przemyslaw. 2019a. Ingredients: Effects and Importances of Model Ingredients. https://ModelOriented.github.io/ingredients/.

O’Connell, Mark, Catherine Hurley, and Katarina Domijan. 2017. “Conditional Visualization for Statistical Models: An Introduction to the Condvis Package in R.” Journal of Statistical Software, Articles 81 (5): 1–20. https://doi.org/10.18637/jss.v081.i05.

Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015b. “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation.” Journal of Computational and Graphical Statistics 24 (1): 44–65. https://doi.org/10.1080/10618600.2014.907095.