Chapter 12 Local Interpretable Model-Agnostic Explanations (LIME)

A different approach to explaining a single instance is through surrogate models, i.e., models that are easy to understand and that behave similarly to the black-box model around the instance of interest.

Variable-attribution methods, presented in Chapter 9, do not take the local curvature of the model into account. Instead, they compare the model prediction against the average model prediction and exploit the probabilistic structure of the dataset.

A complementary approach is to directly explore the model curvature around the point of interest. In Chapter 6 we introduced Ceteris Paribus profiles as a tool for such what-if analysis. However, a limitation of Ceteris Paribus plots is that they explore changes along a single dimension or pairs of dimensions only.

In this chapter we describe another approach, based on local approximations with white-box models. It also investigates the local curvature of the model, but indirectly, through surrogate white-box models.

The best-known method in this class is LIME (Local Interpretable Model-Agnostic Explanations), introduced in the paper Why Should I Trust You?: Explaining the Predictions of Any Classifier (Ribeiro, Singh, and Guestrin 2016). This method and its clones are now implemented in various R and Python packages, see, for example, (Pedersen and Benesty 2018), (Staniak and Biecek 2018) or (Molnar 2018).

12.1 Intuition

The intuition is presented in Figure 12.1. We want to understand a complex model, like the one presented in panel B, at the point marked with a black cross. As the model may be complex and defined in a high-dimensional space, we need to do two things: find an interpretable representation of the features, and fit a simple, easier-to-interpret model that can be used to better understand the black-box model.

The interpretable representation is case-specific. For image data one can use super-pixels or other large chunks of the image; for tabular data it is still not clear how to construct interpretable features.

The local model is usually a simple model, like a linear regression or a decision tree, that can be interpreted directly.

Figure 12.1: A schematic view of local model approximation. Panel A shows the training data; colors correspond to classes. Panel B shows the results of a random forest model; this is where the LIME algorithm starts. Panel C shows new data sampled around the point of interest; the color corresponds to the model response. Panel D shows a linear model fitted to approximate the random forest model around the point of interest.

12.2 Method

The LIME method, and its clones, have the following properties:

  • model agnosticism: they do not rely on any assumptions about the model structure,
  • interpretable representation: the model input is transformed into a feature space that is easier to understand. One application comes from image data; single pixels are not easy to interpret, thus the LIME method decomposes an image into a set of super-pixels that are easier for humans to interpret,
  • local fidelity: the explanations shall be locally well fitted to the black-box model.

Therefore, the objective is to find a local model \(M^*\) that approximates the black-box model \(f\) around the point \(x^*\). The solution is based on a penalized loss function. The white-box model used for explanations satisfies the following condition.

\[ M^*(x^*) = \arg \min_{g \in G} L(f, g, \Pi_{x^*}) + \Omega (g) \] where \(G\) is a family of white-box models (e.g., linear models), \(\Pi_{x^*}\) defines a neighbourhood of \(x^*\), \(L\) is a loss function that measures how well \(g\) approximates \(f\) in this neighbourhood, and \(\Omega(g)\) is a penalty for the complexity of \(g\).
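For instance, for a linear model \(g\) with coefficients \(\beta_j\) and a squared-error loss weighted by the proximity function, as used in (Ribeiro, Singh, and Guestrin 2016), the criterion may take the form

\[ L(f, g, \Pi_{x^*}) = \sum_{i} \Pi_{x^*}(x_i) \left( f(x_i) - g(x_i) \right)^2, \qquad \Omega(g) = \lambda \sum_{j} |\beta_j|, \]

where the sum in \(L\) runs over instances sampled around \(x^*\). The lasso-type penalty in \(\Omega(g)\) is one possible choice; it keeps the explanation sparse by shrinking most coefficients to zero.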

The algorithm consists of three steps:

  • Identification of interpretable data representations,
  • Local sampling around the point of interest,
  • Fitting a white box model in this neighborhood

Identification of interpretable data representations

For image data, a single pixel is not an interpretable feature. In this step, the input space of the model is transformed into a space that is easier for humans to understand. For example, an image may be decomposed into parts and represented by the presence/absence of each part.
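As a toy illustration (our own sketch, not code from any LIME implementation), the snippet below stores a small "image" as a matrix, splits it into four super-pixels, and uses a binary vector to encode which super-pixels are present; mapping this vector back to the pixel space is what allows the black-box model to be queried.

    # Toy illustration of an interpretable representation for image data:
    # a 4x4 "image" divided into four 2x2 super-pixels. An instance is
    # encoded as a binary vector saying which super-pixels are kept.
    set.seed(1)
    img <- matrix(runif(16), nrow = 4)

    # super-pixel membership of every pixel (blocks numbered 1-4)
    segments <- kronecker(matrix(1:4, nrow = 2), matrix(1, nrow = 2, ncol = 2))

    # interpretable representation: super-pixel number 2 is switched off
    z <- c(1, 0, 1, 1)

    # map back to the original pixel space before querying the black box
    masked_img <- img * z[segments]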

Local sampling around the point of interest

Once the interpretable data representation is identified, the neighborhood around the point of interest needs to be explored. New instances are sampled around \(x^*\) and the black-box model is used to calculate predictions for them.
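A minimal sketch of this step for tabular data is given below; it only illustrates the idea and is not code from any of the packages discussed later. The dataset (iris), the sampling scheme, and the kernel width are our own choices.

    # Local sampling around the point of interest (sketch for tabular data).
    # Each variable is sampled independently from its empirical distribution;
    # the samples are then weighted by a Gaussian kernel applied to the
    # standardized distance from the point of interest x_star.
    set.seed(1)
    X      <- iris[, 1:4]      # explanatory variables of a toy dataset
    x_star <- X[1, ]           # the instance of interest
    n      <- 1000

    perturbed <- as.data.frame(lapply(X, sample, size = n, replace = TRUE))

    # proximity weights Pi_{x*}: Gaussian kernel on the scaled Euclidean distance
    d2 <- rowSums(scale(perturbed, center = unlist(x_star), scale = sapply(X, sd))^2)
    w  <- exp(-d2 / (2 * 0.75^2))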

Fitting a white box model in this neighborhood

Any model that is easy to interpret may be fitted to this data, like a decision tree or a rule-based system. However, in practice, the most commonly used family of models is linear models, often fitted with a penalty that keeps the number of non-zero coefficients small.
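Continuing the sketch above (it reuses perturbed and w from the previous step), the fragment below treats a random forest as the black-box model, obtains its predictions for the sampled instances, and approximates them locally with a weighted lasso regression from the glmnet package. The lasso penalty plays the role of \(\Omega(g)\) and keeps the explanation sparse; this is only an illustration of the idea, not the code used by any of the packages presented in Section 12.4.

    # Fit the black box and explain it locally with a weighted, sparse
    # linear model (continuation of the sampling sketch above).
    library("randomForest")
    library("glmnet")

    model <- randomForest(Species ~ ., data = iris)     # the black-box model f

    # black-box predictions for the sampled instances
    # (here: predicted probability of the class of x_star, i.e. setosa)
    y_hat <- predict(model, newdata = perturbed, type = "prob")[, "setosa"]

    # weighted lasso: the loss L is weighted by the proximity weights w,
    # the lasso penalty corresponds to the complexity term Omega(g)
    fit <- cv.glmnet(x = as.matrix(perturbed), y = y_hat, weights = w, alpha = 1)
    coef(fit, s = "lambda.min")   # sparse local explanation of the black box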

12.3 Pros and cons

Local approximations are model-agnostic and can be applied to any predictive model. Below we summarize the key strengths and weaknesses of this approach.

Pros

  • The method is widely adopted in text and image analysis, in part thanks to interpretable data representations.
  • The intuition behind the method is straightforward.
  • Model explanations are sparse; only a small number of features is used, which makes them easier to read.

Cons

  • For continuous variables and tabular data it is not that easy to find interpretable representations. In our opinion, this problem is not solved yet.
  • The black-box model approximates the data and the white-box model approximates the black-box model. We do not have direct control over the quality of the local fit of the white-box model, so the surrogate model may be misleading.
  • Due to the curse of dimensionality, points in a high-dimensional space are sparse, which makes it tricky to define what "local" means.

12.4 Code snippets for R

In this section we present key features of the R package localModel (Staniak and Biecek 2019), which is a part of the DrWhy.AI universe and covers the methods presented in this chapter.

Note that there are also other R packages that offer similar functionality, for example lime (Pedersen and Benesty 2018), which is a port of the original Python implementation of LIME, live (Staniak and Biecek 2018), and iml (Molnar, Bischl, and Casalicchio 2018a). These packages differ in the way they handle continuous variables (lime performs global discretization, localModel local discretization, while live and iml work directly on continuous variables), in the kind of local model that is fitted to the black-box model, and in how new instances are sampled. For these reasons, these packages produce different explanations.

Below we present explanations obtained with these packages for Johny D and the titanic_rf_v6 model.

12.4.2 The localModel package

An example code snippet for the localModel package is presented below. The key parts are the function DALEX::explain, which creates an explainer, and individual_surrogate_model, which fits the local model.
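The original snippet is not reproduced here, so the code below is only a sketch of how such a call may look. It assumes that the titanic data frame (with a yes/no survived column), the titanic_rf_v6 model, and the johny_d observation mentioned above are already available in the R session; the label, size, and seed values are our own choices.

    # A sketch of a localModel explanation; titanic, titanic_rf_v6 and
    # johny_d are assumed to be already defined in the session.
    library("DALEX")
    library("localModel")

    explainer_rf <- DALEX::explain(model = titanic_rf_v6,
                                   data  = titanic[, colnames(titanic) != "survived"],
                                   y     = titanic$survived == "yes",
                                   label = "Random Forest v6")

    local_model_johny <- individual_surrogate_model(explainer_rf,
                                                    new_observation = johny_d,
                                                    size = 500,
                                                    seed = 1)
    plot(local_model_johny)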

Resulting explanations are presented in Figure 12.3.

For continuous variables, the localModel package discretizes features by using local Ceteris Paribus profiles. This is why the explanation contains the feature age < 15. As we showed in Chapter 6, the largest drop in the predicted probability of survival is observed around the age of 15.

Figure 12.3: Explanations for Johny D generated by the localModel package.

12.4.3 The iml package

An example code snippet for the iml package is presented below. The key parts are the function Predictor$new, which creates an explainer, and LocalModel$new, which fits the local model.
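Again, the original snippet is not reproduced here, so the code below is only a sketch, under the same assumptions as above (the titanic data, the titanic_rf_v6 model, and the johny_d observation are available in the session); the number of features k is our own choice.

    # A sketch of an iml explanation; titanic, titanic_rf_v6 and johny_d
    # are assumed to be already defined in the session.
    library("iml")

    predictor_rf <- Predictor$new(model = titanic_rf_v6,
                                  data  = titanic[, colnames(titanic) != "survived"],
                                  type  = "prob")

    lime_johny <- LocalModel$new(predictor_rf,
                                 x.interest = johny_d,
                                 k = 5)
    plot(lime_johny)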

Resulting explanations are presented in Figure 12.4.

For a continuous variable like age, the iml package works directly on the feature, without any discretization. As we showed in Chapter 6, the Ceteris Paribus profile for age is constant in the interval 0-15; this is why age does not appear here as an important feature.

Figure 12.4: Explanations for Johny D generated by the iml package.

References

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?: Explaining the Predictions of Any Classifier.” In, 1135–44. ACM Press. https://doi.org/10.1145/2939672.2939778.

Pedersen, Thomas Lin, and Michaël Benesty. 2018. Lime: Local Interpretable Model-Agnostic Explanations. https://CRAN.R-project.org/package=lime.

Staniak, Mateusz, and Przemysław Biecek. 2018. Live: Local Interpretable (Model-Agnostic) Visual Explanations. https://CRAN.R-project.org/package=live.

Molnar, Christoph. 2018. Iml: Interpretable Machine Learning. https://CRAN.R-project.org/package=iml.

Gosiewska, Alicja, and Przemyslaw Biecek. 2019a. “iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models.” https://arxiv.org/abs/1903.11420v1.

Lundberg, Scott. 2019. SHAP (SHapley Additive exPlanations). https://github.com/slundberg/shap.

Staniak, Mateusz, and Przemysław Biecek. 2019. LocalModel: LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles. https://github.com/ModelOriented/localModel.

Molnar, Christoph, Bernd Bischl, and Giuseppe Casalicchio. 2018a. “Iml: An R Package for Interpretable Machine Learning.” JOSS 3 (26). Journal of Open Source Software: 786. https://doi.org/10.21105/joss.00786.