# Chapter 1 Introduction

A note to readers: this text is a work in progress.

We’ve released this initial version to get more feedback. Feedback can be given at the GitHub repo https://github.com/pbiecek/PM_VEE/issues. Copyediting has not been done yet so read at your own risk.

We are primarily interested in the organization and consistency of the content, but any comments will be welcommed.

Thanks for taking the time to read this.

We’d like to thank everyone that contributed feedback, typos, or discussions while the book was being written. GitHub contributors included, agosiewska.

## 1.2 The aim of the book

Predictive models are used to guess (statisticians would say: predict) values of a variable of interest based on other variables. As an example, consider prediction of sales based on historical data, prediction of risk of heart disease based on patient’s characteristics, or prediction of political attitudes based on Facebook comments.

Predictive models have been constructed through the enitre human history. Ancient Egyptians, for instance, used observations of rising of Sirius to predict flooding of the Nile. A more rigorous approach to model construction may be attributed to the method of least squares, published more than two centuries ago by Legendre in 1805 and by Gauss in 1809. With time, the number of applications in economy, medicine, biology,and agriculture was growing. The term regression was coined by Francis Galton in 1886. Initially, it was referring to biological applications, while today it is used for various models that allow prediction of continuous variables. Prediction of nominal variables is called classification, and its beginning may be attributed to works of Ronald Fisher in 1936.

During the last century, many statistical models that can be used for predictive purposes have been developed. These include linear models, generalized linear models, regression and classification trees, rule-based models, and many others. Developments in mathematical foundations of predictive models were boosted by increasing computational power of personal computers and availability of large datasets in the era of ,,big data’’ that we have entered.

With the increasing demand for predictive models, model features such as flexibility, ability to perform internally variable selection (feature engineering), and high precision of predictions are of interest. To obtain robust models, ensembles of models are used. Techniques like bagging, boosting, or model stacking combine hundreds or thousands of small models into a one super-model. Large deep neural models have over a bilion of parameters.

There is a cost of this progress. Complex models may seem to operate like ,,black boxes’‘. It may be difficult, or even impossible, to understand how thousands of coefficients affect the model prediction. At the same time, complex models may not work as good as we would like them to do. An overview of real problems with large black-box models may be found in an excellent book of Cathy O’Neil (O’Neil 2016) or in her TED Talk ,,The era of blind faith in big data must end’’. There is a growing number of examples of predictive models with performance that deteriorated over time or became biased in some sense. For instance, IBM’s Watson for Oncology was criticized by oncologists for delivering unsafe and inaccurate recommendations (Ross and Swetliz 2018). Amazon’s system for CV screening was found to be biased against women (Dastin 2018). The COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) algorithm for predicting recidivism, developed by Northpointe (now Equivant), discriminated against race (Larson et al. 2016). These are examples of models and algorithms that led to serious violations of fairness and ethical principles. An example of situation when data drift led to deterioration in model’s performance is the Google Flu model, which gave worse predictions after two years than at baseline (Salzberg 2014),[Lazer et al Science 2014].

A reaction to some of these examples and problems are new regulations, like the General Data Protection Regulation (GDPR 2018). Also, new civic rights are being formulated (Goodman and Flaxman 2016),(Casey, Farhangi, and Vogl 2018),(Ruiz 2018). A noteworthy example is the ,,Right to Explanation’’, i.e., the right to be provided an explanation for an output of an automated algorithm (Goodman and Flaxman 2016). To exercise the right, methods for verification, exploration, and explanation of predictive models are needed.

We can conclude that, today, the true bottleneck in predictive modelling is not the lack of data, nor the lack of computational power, nor the lack of flexible models. It is the lack of tools for model validation, model exploration, and explanation of model decisions. Thus, in this book, we present a collection of methods that may be used for this purpose. As development of such methods is a very active area of research and new methods become available almost on a continuous basis, we do not aim at being exhaustive. Rather, we present the mind-set, key problems, and several examples of methods that can be used in model exploration.

## 1.3 A bit of philosophy: three laws of model explanation

Seventy-six years ago, Isaac Asimov forumlated Three Laws of Robotics: 1) a robot may not injure a human being, 2) a robot must obey the orders given it by human beings, and 3) a robot must protect its own existence.

Today’s robots, like cleaning robots, robotic pets, or autonomous cars are far from being conscious enough to fall under Asimov’s ethics. However, we are more and more surrounded by complex predictive models and algoritmhs used for decision making. Machine-learning models are used in health care, politics, education, justice, and many other areas. The models and algorithms have a far larger influence on our lives than physical robots. Yet, applications of such models are left unregulated despite examples of their potential harmfulness. See Weapons of Math Destruction by Cathy O’Neil (O’Neil 2016) for an excellent overview of selected problems.

It’s clear that some we need to control the models and algorithms that may affect us. Thus, Asimov’s laws are referred to in the context of the discussion around Ethics of Artifical Intelligence. Initiatives to formulate principles for the AI development have been undertaken, for instance, in the UK [Olhede & Wolfe, Significance 2018, 15: 6-7]. Following Asimov’s approach, we could propose three requirements that any predictive model should fulfill:

• Prediction’s justification. For every prediction of a model, one should be able to understand which variables affect the prediction and to which extent.
• Prediction’s speculation. For every prediction of a model, one should be able to understand how the model prediction would change if input variables changed.
• Prediction’s validation For every prediction of a model, one should be able to verify how strong is the evidence that confirms this particular prediction.

We see two ways to comply with these requirements. One is to use only models that fulfill these conditions by design. However, the price for transparency may be a reduction in performance. Another way is to use tools that allow, perhaps by using approximations, to ,,explain’’ predictions for any model. In our book, we will focus on the latter.

## 1.4 Terminology

It is worth noting that, when it comes to predictive models, the same concepts have often been given different names in statistics and in machine learning. For instance, in the statistical-modelling literature, one refers to ,,explanatory variables,’’ with ,,independent variables,’’ ,,predictors,’’ or ,,covariates’’ as often-used equivalents. Explanatory variables are used in the model as means to explain (predict) the ,,dependent variable,’’ also called ,,predicted’’ variable or ,,response.’’ In the machine-learning language, ,,input variables’’ or ,,features’’ are used to predict the ,,output’’ variable. In statistical modelling, models are fit to the data that contain ,,observations,’’ whereas in the machine-learning world a dataset may contain ,,instances.’’

To the extent possible, in our book we try to consistently use the statistical-modelling terminology. However, the reader may expect references to a ,,feature’’ here and there. Somewhat inconsistently, we also introduce the term ,,instance-level’’ explanation. Instance-level explanation methods are designed to extract information about the behavior of the model related to a specific observation (or instance). On the other hand, ,,global’’ explanation techniques allow obtaining information about the behavior of the model for an entire dataset.

We consider models for dependent variables that can be continuous or nominal. The values of a continuous variable can be represented by numbers with an ordering that makes some sense (zip codes or phone numbers are not considered as continuous variables). A continuous variable does not have to be continuous in the mathematical sense; counts (number of floors, steps, etc.) will be treated as continuous variables as well. A nominal variable can assume only a finite set of values that cannot be given numeric values.

In this book we focus on ,,black-box’’ models. We discuss them in a bit more detail in the next section.

## 1.5 White-box models vs. black-box models

Black-box models are models with a complex structure that is hard to understand by humans. Usually this refers to a large number of model coefficients. As humans may vary in their capacity of understanding complex models, there is no strict threshold for the number of coefficients that makes a model a black-box. In practice, for most humans this threshold is probably closer to 10 than to 100.

A ,,white-box’’ model, which is opposite to a ,,black-box’’ one, is a model that is easy to understand by a human (though maybe not by every human). It has got a simple structure and a limited number of coefficients. The two most common classess of white-box models are decision or regression trees, as an example in Figure 1.2, or models with an additive structure, like the following model for mortality risk in melanoma patients:

$RelativeRisk = 1 + 3.6 * [Breslow > 2] - 2 * [TILs > 0]$

In the model, two explanatory variables are used: an indicator whether the thickness of the lesion according to the Breslow scale is larger than 2 mm and an indicator whether the percentage of tumor-infiltrating lymphocytes (TILs) was larger than 0.

The structure of a white box-model is, in general, easy to understand. It may be difficult to collect the necessary data, build the model, fit it to the data, and/or perform model validation, but once the model has been developed its interpretation and mode of working is straightforward.

Why is it important to understand the model structure? There are several important advantages. If the model structure is clear, we can easily see which variables are included in the model and which are not. Hence, we may be able to, for instance, question the model when a particular explanatory variable was excluded from it. Also, in case of a model with a clear structure and a limited number of coefficients, we can easily link changes in model predictions with changes in particular explanatory variables. This, in turn, may allow us to challenge the model against the domain knowledge if, for instance, the effect of a particular variable on predictions is inconsistent with the previously established results. Note that linking changes in model predictions with changes in particular explanatory variables may be difficult when there are many variables and/or coefficients in the model. For instance, a classification tree with hundreds of nodes is difficult to understand, as is a linear regression model with hundreds of cofficients.

Getting the idea about the performance of a black-box model may be more challenging. The structure of a complex model like, e.g., a neural-network model, mmay be far from transparent. Consequently, we may not understand which features and how influence the model decisions. Consequently, it may be difficult to decide whether the model is consistent with the domain knowledge. In our book we present tools that can help in extracting the information necessary for the model evaluation for complex models.

## 1.6 Model visualization, exploration, and explanation

The lifecycle of a model can be divided, in general, in three different phases: development (or building), deployment, and maintenance.

Model development is the phase in which one is looking for the best available model. During this process, model exploration tools are useful. Exploration involves evaluation of the fit of the model, verification of the assumptions underlying the model (diagnostics), and assessment of the predictive performance of the model (validation). In our book we will focus on the visualization tools that can be useful in model exploration. We will not, however, discuss visualization methods for diagnostic purposes, as they are extensively discussed in many books devoted to statistical modelling.

Model deployment is the phase in which a predictive model is adopted for use. In this phase, it is crucial that the users gain confidence in using the model. It is worth noting that the users might not have been involved in the model development. Moreover, they may only have got access to the software implementing the model that may not provide any insight in the details of the model structure. In this situation, model explanation tools can help to understand the factors that influence model predictions and to gain confidence in the model. The tools are one of the main focus point of our book.

Finally, a deployed model requires maintenance. In this phase, one monitors model’s performance by, for instance, checking the validity of predictions for different datasets. If issues are detected, model explanation tools may be used to find the source of the problem and to suggest a modification of the structure of the model.

## 1.7 Model-agnostic vs. model-specific approach

Some classes of models have been developed for a long period of time or have attracted a lot of interest with an intensive research as a result. Consequently, those classes of models are equipped with very good tools for model exploration or visualisation. For example:

• There are many tools for diagnostics and evaluation of linear models. Model assumptions are formally defined (normality, linear structure, homogenous variance) and can be checked by using normality tests or plots (normal qq-plot), diagnostic plots, tests for model structure, tools for identification of outliers, etc.
• For many more advanced models with an additive structure, like the proportional hazards model, there also many tools that can be used for checking model assumptions.
• Random-forest model is equipped with the out-of-bag method of evaluation of performance and several tools for measuring variable importance (Breiman et al. 2018). Methods have been developed to extract information from the model structure about possible interactions (Paluszynska and Biecek 2017b). Similar tools have been developed for other ensembles of trees, like xgboost models (Foster 2018).
• Neural networks enjoy a large collection of dedicated model-explanation tools that use, for instance, the layer-wise relevance propagation technique (Bach et al. 2015), or saliency maps technique (Simonyan, Vedaldi, and Zisserman 2013), or a mixed approach.

Of course, the list of model classes with dedicated collections of model-explanation and/or diagnostics methods is much longer. This variety of model-specific approaches does lead to issues, though. For instance, one cannot easily compare explanations for two models with different structures. Also, every time when a new architecture or a new ensemble of models is proposed, one needs to look for new methods of model exploration. Finally, for brand-new models no tools for model explanation or diagnostics may be immedaitely available.

For these reasons, in our book we focus on model-agnostic techniques. In particular, we prefer not to assume anything about the model structure, as we may be dealing with a black-box model with an unclear structure. In that case, the only operation that we may be able to perform is evaluation of a model for a selected observation.

However, while we do not assume anything about the structure of the model, we will assume that the model operates on $$p$$-dimensional vectors and, for a single vector, it returns a single value which is a real number. This assumption holds for a broad range of models for data such as tabular data, images, text data, videos, etc. It may not be suitable for, e.g., models with memory like seq2seq models (Sutskever, Vinyals, and Le 2014) or Long Short Term Memory models (Hochreiter and Schmidhuber 1997) in which the model output depends also on sequence of previous inputs.

## 1.8 Notation

Methods described in this book were developed by different authors, who used different mathematical notations. We try to keep the mathematical notation consistent throughout the entre book. In some cases this may result in formulae wth a fairly complex system of indices.

In this section, we provide a general overview of the notation we use. Whenever necessary, parts of the notation will be explained again in subsequent chapters.

We consider predictive models that operate on a $$p$$-dimensional input space $$\mathcal X$$. By $$x \in \mathcal X$$ we will refer to a single point in this input space.

In some cases models are described in context of a dataset with $$n$$ observations. By $$x_i$$ we refer to the $$i$$-th observation in this dataset. Of course, $$x_i \in \mathcal X$$.

Some explainers are constructed around an observation of interest which will be denoted by $$x_{*}$$. The observation may not necessarily belong to the analyzed dataset; hence, the use of the asterisk in the index. Of course, $$x_* \in \mathcal X$$.

Points in $$\mathcal X$$ are $$p$$ dimensional vectors. We will refer to the $$j$$-th coordinate by using $$j$$ in superscript. Thus, $$x^j_i$$ deontes the $$j$$-th coordinate of the $$i$$-th observation from the analyzed dataset. If $$\mathcal J$$ denotes a subset of indices, then $$x^{\mathcal J}$$ denotes the elements of vector $$x$$ corresponding to the indices included in $$\mathcal J$$.

We will use the notation $$x^{-j}$$ to refer to a vector that results from removing the $$j$$-th coordinate from vector $$x$$. By $$x^{j|=z}$$, we denote a vector with the values at all coordinates equal to the values in $$x$$, except of the $$j$$-th coordinate, which is set equal to $$z$$. So, if $$w=x^{j|=z}$$, then $$w^j = z$$ and $$\forall_{k\neq j} w^k = x^k$$.

In this book, a model is a function $$f:\mathcal X \rightarrow y$$ that transforms a point from $$\mathcal X$$ into a real number. In most cases, the presented methods can be used directly for multi-variate dependent variables; however, we use examples with uni-variate responses to simplify the notation.

We will use $$r_i = y_i - f(x_i)$$ we refer to the model residual, i.e., the difference between the observed value of the dependent variable $$Y$$ for the $$i$$-th observation from a particular dataset and the model prediction for the observaton.

## 1.9 The structure of the book

Our book is split in two parts. In the part Instance-level explainers, we present techniques for exploration and explanation of model predictions for a single observation. On the other hand, in the part Global explainers, we present techniques for exploration and explanation of model’s performance for an entire dataset.

Before embarking on the description of the methods, in Chapter 2, we provide a short description of R tools and packages that are necessary to replicate the results presented for various methods. In Chapter 4, we describe three datasets that are used throughout the book to illustrate the presented methods and tools.

The Instance-level explainers part of the book consists of Chapters 6-13. In Chapters 6-8, methods based on Ceteris-paribus (CP) profiles are presented. The profiles show the change of model-based predictions induced by a change of a single variable; they are introduced in Chapter 6. Chapter 7 presents a CP-profile-based measure that summarizes the impact of a selected variable on model’s predictions. The measure can be used to select the profiles that are worth plotting for a model with a large number of explanatory variables. Chapter 8 describes local-fidelity plots that are useful to investigate the sources of a poor prediction for a particular single observation.

Chapters 9-11 present methods to decompose variable contributions to model predictions. In particular, Chapter 9 introduces Break-down (BD) plots for models with additive effects. On the other hand, Chapter 10 presents a method for models including interactions. Finally, Chapter 11 describes an alternative method for decomposing model predictions that is closely linked with Shapley values (Shapley 1953) developed originally for cooperative games.

Chapter 12 presens a different approach to explanation of single-instance predictions. It is based on a local approximation of a black-box model by a simpler, white-box one. In paricular, in the chapter, the Local Interpretable Model-Agnostic Explanations (LIME) method (Ribeiro, Singh, and Guestrin 2016) is discussed.

The final chapter of the first part, Chapter 13, presence a comparison of various instance-level explainers.

The Global explainers part of the book consists of Chapters 14-26.

In each part, every method is described in a separate chapter that has got the same structure: * Subsection Introduction explains the goal of and the general idea behind the method. * Subsection Method shows mathematical or computational details related to the method. This subsection can be skipped if you are not interested in the details. * Subsection Example shows an exemplary application of the method with discussion of results. * Subsection Pros and cons summarizes the advantages and disadvantages of the method. It also provides some guideance regarding when to use the method. * Subsection Code snippets shows the implementation of the method in R and Python. This subsection can be skipped if you are not interested in the implementation.

Finally, we would like to signal that, in this book, we do show

• how to determine features that affect model prediction for a single observation. In particular, we present the theory and examples of methods that can be used to explain prediction like break down plots, ceteris paribus profiles, local-model approximations, or Shapley values.
• techniques to examine fully-trained machine-learning models as a whole. In particular, we review the theory and examples of methods that can be used to explain model performance globally, like partial-dependency plots, variable-importance plots, and others.
• charts that can be used to present key information in a quick way.
• tools and methods for model comparison.
• code snippets for R and Python that explain how to use the described methods.

On the other hand, in this book, we do not focus on

• any specific model. The presented techniques are model agnostic and do not make any assumptions related to model structure.
• data exploration. There are very good books on this topic, like R for Data Science http://r4ds.had.co.nz/ or TODO
• the process of model building. There are also very good books on this topic, see An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani http://www-bcf.usc.edu/~gareth/ISL/ or TODO
• any particular tools for model building. These are discussed, for instance, in Applied Predictive Modeling By Max Kuhn and Kjell Johnson http://appliedpredictivemodeling.com/

## 1.10 Acknowledgements

Przemek’s work on interpretability has started during research trips within the RENOIR project (691152 - H2020/2016-2019). So he would like to thank prof. Janusz Holyst for the chance to take part in this project.

Przemek would also like thank prof. Chris Drake for her hospitality. This book would have never been created without perfect conditions that Przemek found at Chris’ house in Woodland.

This book has been prepared by using the bookdown package (Xie 2018), created thanks to the amazing work of Yihui Xie.

### References

O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY, USA: Crown Publishing Group.

Ross, Casey, and Ike Swetliz. 2018. “IBM’s Watson Supercomputer Recommended ‘Unsafe and Incorrect’ Cancer Treatments, Internal Documents Show.” Statnews. https://www.statnews.com/2018/07/25/ibm-watson-recommended-unsafe-incorrect-treatments/.

Dastin, Jeffrey. 2018. “Amazon Scraps Secret Ai Recruiting Tool That Showed Bias Against Women.” Reuters. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazonscraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G.

Larson, Jeff, Surya Mattu, Lauren Kirchner, and Julia Angwin. 2016. “How We Analyzed the Compas Recidivism Algorithm.” ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm.

GDPR. 2018. “The Eu General Data Protection Regulation (Gdpr) Is the Most Important Change in Data Privacy Regulation in 20 Years.” https://eugdpr.org/.

Goodman, Bryce, and Seth Flaxman. 2016. “European Union Regulations on Algorithmic Decision-Making and a "Right to Explanation".” Arxiv. https://arxiv.org/abs/1606.08813.

Casey, Bryan, Ashkon Farhangi, and Roland Vogl. 2018. “Rethinking Explainable Machines: The Gdpr’s ’Right to Explanation’ Debate and the Rise of Algorithmic Audits in Enterprise.” Berkeley Technology Law Journal. https://ssrn.com/abstract=3143325.

Ruiz, Javier. 2018. “Machine Learning and the Right to Explanation in Gdpr.” https://www.openrightsgroup.org/blog/2018/machine-learning-and-the-right-to-explanation-in-gdpr.

Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2018. RandomForest: Breiman and Cutler’s Random Forests for Classification and Regression. https://CRAN.R-project.org/package=randomForest.

Paluszynska, Aleksandra, and Przemyslaw Biecek. 2017b. RandomForestExplainer: Explaining and Visualizing Random Forests in Terms of Variable Importance. https://CRAN.R-project.org/package=randomForestExplainer.

Foster, David. 2018. XgboostExplainer: XGBoost Model Explainer.

Bach, Sebastian, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. “On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.” Edited by Oscar Deniz Suarez. PLOS ONE 10 (7): e0130140. https://doi.org/10.1371/journal.pone.0130140.

Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. 2013. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” CoRR abs/1312.6034. http://arxiv.org/abs/1312.6034.

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. “Sequence to Sequence Learning with Neural Networks.” CoRR abs/1409.3215. http://arxiv.org/abs/1409.3215.

Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (8): 1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.

Shapley, Lloyd S. 1953. “A Value for N-Person Games.” In Contributions to the Theory of Games Ii, edited by Harold W. Kuhn and Albert W. Tucker, 307–17. Princeton: Princeton University Press.

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?: Explaining the Predictions of Any Classifier.” In, 1135–44. ACM Press. https://doi.org/10.1145/2939672.2939778.

Xie, Yihui. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.