survxai: an R package for structure-agnostic explanations of survival models
Published in The Journal of Open Source Software, 2018
Predictive models are widely used in supervised machine learning. Three most common classes of such models are: regression models, where the target variable is continuous numeric, classification models, where the target variable is binary or categorical and survival models, where the target is some censored variable. Common examples of censored variables are times to death (but some cases lived for x years and are still alive), cessation of service by customer, or failure of machine components. Modern survival models are often complex in structure; for example survival neural networks (Eleuteri, Tagliaferri, Milano, De Placido, & De Laurentiis, 2003) or survival random forest (Ishwaran, Kogalur, Blackstone, & Lauer, 2008). These models may be described by thousands of coefficients. Often such flexibility leads to high performance, but makes models opaque and hard to understand. This is acceptable in cases where only the model accuracy is important, but in cases that involve human decisions, it may not be informative enough. To trust model predictions one needs to see which features are important and how model predictions would change if some feature was changed. The area of model interpretability or explanability has quickly gained the attention of machine learning experts. Understanding of complex models leads not only to higher trust in model predictions but also better models. Better, means that the models are more robust and obtains higher accuracy on validation data. See examples in the DALEX (Biecek, 2018) or iml (Molnar, 2018) R packages. Existing tools for model agnostic explanations are focused on regression models and classification problems, as in both cases model predictions that may be summarised by a single number. Survival models require different approach as predictions are in a form of survival functions. Demand for such explainers has led to some model specific solutions, like iSurvive (Dempsey et al., 2017) for continuous time hidden Markov models. Yet, there is currently a lack of structure agnostic tools for survival models. The survxai fills this gap. This R package is designed to deliver local and global explanations for survival models, in a structure-agnostic fashion. In the package documentation we demonstrate examples for survival random forest models and for Cox models. The survxai package consists of new implementations and visualisations of explainers, designed for survival models. Functions are well documented and the package is supplemented with unit tests, and illustrations. Regardless of the complexity of the model, the methods implemented in the survxai package maintain a certain level of interpretability, important in medical applications (Collett, 2015), churn analysis (Lu & Park, 2003) and others