Alber, Maximilian, Sebastian Lapuschkin, Philipp Seegerer, Miriam Hägele, Kristof T. Schütt, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller, Sven Dähne, and Pieter-Jan Kindermans. 2018. “iNNvestigate Neural Networks!”

Allaire, JJ, and François Chollet. 2019. Keras: R Interface to ’Keras’.

Allison, P. 2014. “Measures of fit for logistic regression.” In Proceedings of the SAS Global Forum 2014 Conference. Cary, NC: SAS Institute Inc.

Alvarez-Melis, David, and Tommi S. Jaakkola. 2018. “On the Robustness of Interpretability Methods.” arXiv E-Prints, June, arXiv:1806.08049.

Apley, Dan. 2018. ALEPlot: Accumulated Local Effects (ALE) Plots and Partial Dependence (PD) Plots.

Apley, D. W., and J. Zhu. 2019. “Visualizing the effects of predictor variables in black box supervised learning models.” CoRR abs/1612.08468.

Azure. 2019. “Microsoft Cognitive Services.”

Bach, Sebastian, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. “On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.” Edited by Oscar Deniz Suarez. PLoS ONE 10 (7): e0130140.

Bastian, Lori A., and Joanne T. Piscitelli. 1997. “Is This Patient Pregnant?: Can You Reliably Rule In or Rule Out Early Pregnancy by Clinical Examination?” JAMA 278 (7): 586–91.

Berrar, D. 2019. “Performance measures for binary classification.” In Encyclopedia of Bioinformatics and Computational Biology Volume 1, 546–60. Elsevier.

Biecek, Przemyslaw. 2018. “DALEX: Explainers for Complex Predictive Models in R.” Journal of Machine Learning Research 19 (84): 1–5.

———. 2019. “Model Development Process.” CoRR abs/1907.04461.

Biecek, Przemyslaw, Hubert Baniecki, Adam Izdebski, and Katarzyna Pekala. 2019. ingredients: Effects and Importances of Model Ingredients.

Biecek, Przemyslaw, and Marcin Kosinski. 2017. “archivist: An R Package for Managing, Recording and Restoring Data Analysis Results.” Journal of Statistical Software 82 (11): 1–28.

Binder, Alexander, Grégoire Montavon, Sebastian Bach, Klaus-Robert Müller, and Wojciech Samek. 2016. “Layer-Wise Relevance Propagation for Neural Networks with Local Renormalization Layers.” CoRR abs/1604.00825.

Bischl, Bernd, Michel Lang, Lars Kotthoff, Julia Schiffner, Jakob Richter, Erich Studerus, Giuseppe Casalicchio, and Zachary M. Jones. 2016. “mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5.

Boehm, Barry. 1988. “A Spiral Model of Software Development and Enhancement.” IEEE Computer 21 (5): 61–72.

Breiman, Leo. 2001a. “Random Forests.” Machine Learning 45 (1): 5–32.

———. 2001b. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–231.

Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2018. RandomForest: Breiman and Cutler’s Random Forests for Classification and Regression.

Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and Regression Trees. Monterey, CA: Wadsworth & Brooks/Cole.

Brentnall, A. R., and J. Cuzick. 2018. “Use of the concordance index for predictors of censored survival data.” Statistical Methods in Medical Research 27: 2359–73.

Buuren, S. van. 2012. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman & Hall/CRC.

Casey, Bryan, Ashkon Farhangi, and Roland Vogl. 2018. “Rethinking Explainable Machines: The GDPR’s ’Right to Explanation’ Debate and the Rise of Algorithmic Audits in Enterprise.” Berkeley Technology Law Journal.

Chapman, Pete, Julian Clinton, Randy Kerber, Thomas Khabaza, Thomas Reinartz, Colin Shearer, and Rudiger Wirth. 1999. CRISP-DM 1.0: Step-by-Step Data Mining Guide.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. KDD ’16. New York, NY, USA: ACM.

Chollet, François, and others. 2015. “Keras.” GitHub.

Cortes, Corinna, and Vladimir Vapnik. 1995. “Support-Vector Networks.” Machine Learning 20 (3): 273–97.

Dastin, Jeffrey. 2018. “Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women.” Reuters.

Deng, J., W. Dong, R. Socher, L. Li, Kai Li, and Li Fei-Fei. 2009. “ImageNet: A large-scale hierarchical image database.” In 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–55.

Diaz, Mark, Isaac Johnson, Amanda Lazar, Anne Marie Piper, and Darren Gergle. 2018. “Addressing Age-Related Bias in Sentiment Analysis.” In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 412:1–412:14. CHI ’18. New York, NY, USA: ACM.

Dobson, A. J. 2002. Introduction to Generalized Linear Models (2nd Ed.). Boca Raton, FL: Chapman & Hall/CRC.

Donizy, Piotr, Przemyslaw Biecek, Agnieszka Halon, and Rafal Matkowski. 2016. “BILLCD8 – a Multivariable Survival Model as a Simple and Clinically Useful Prognostic Tool to Identify High-Risk Cutaneous Melanoma Patients” 36 (September): 4739–48.

Dorogush, Anna Veronika, Vasily Ershov, and Andrey Gulin. 2018. “CatBoost: gradient boosting with categorical features support.” CoRR abs/1810.11363.

Duffy, Clare. 2019. “Apple Co-Founder Steve Wozniak Says Apple Card Discriminated Against His Wife.” CNN Business.

Efron, Bradley, and Trevor Hastie. 2016. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. 1st ed. New York, NY, USA: Cambridge University Press.

Ehrlinger, John. 2016. “ggRandomForests: Exploring Random Forest Survival.”

Faraway, Julian. 2002. Practical Regression and ANOVA Using R.

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2018. “Model Class Reliance: Variable Importance Measures for Any Machine Learning Model Class, from the ’Rashomon’ Perspective.” Journal of Computational and Graphical Statistics.

Foster, David. 2017. XgboostExplainer: An R Package That Makes Xgboost Models Fully Interpretable.

Friedman, Jerome H. 2000. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of Statistics 29: 1189–1232.

Galecki, A., and T. Burzykowski. 2013. Linear Mixed-Effects Models Using R: A Step-by-Step Approach. Springer Publishing Company, Incorporated.

GDPR. 2018. “The EU General Data Protection Regulation (GDPR) Is the Most Important Change in Data Privacy Regulation in 20 Years.”

Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015. “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation.” Journal of Computational and Graphical Statistics 24 (1): 44–65.

Goodman, Bryce, and Seth Flaxman. 2016. “European Union Regulations on Algorithmic Decision-Making and a "Right to Explanation".” arXiv.

Gosiewska, Alicja, and Przemyslaw Biecek. 2018. auditor: Model Audit - Verification, Validation, and Error Analysis.

———. 2019. “iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models.”

Greenwell, Brandon. 2020. fastshap: Fast Approximate Shapley Values.

Greenwell, Brandon M. 2017. “pdp: An R Package for Constructing Partial Dependence Plots.” The R Journal 9 (1): 421–36.

Grolemund, Garrett, and Hadley Wickham. 2019. R for Data Science.

Hall, Patrick. 2019. On Explainable Machine Learning Misconceptions and a More Human-Centered Machine Learning.

Harrell, F. E. Jr. 2015. Regression Modeling Strategies. 2nd ed. Cham, Switzerland: Springer.

Harrell, F. E. Jr., K. L. Lee, and D. B. Mark. 1996. “Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.” Statistics in Medicine 15: 361–87.

Harrell Jr, Frank E. 2018. rms: Regression Modeling Strategies.

Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning. Data Mining, Inference, and Prediction. 2nd ed. New York, NY, USA: Springer.

Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (8): 1735–80.

Hoover, Benjamin, Hendrik Strobelt, and Sebastian Gehrmann. 2019. “ExBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models.”

Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15 (3): 651–74.

Jacobson, Ivar, Grady Booch, and James Rumbaugh. 1999. The Unified Software Development Process.

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2014. An Introduction to Statistical Learning: With Applications in R. Springer Publishing Company, Incorporated.

Kuhn, Max, with contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, et al. 2016. Caret: Classification and Regression Training.

Jiangchun, Li. 2018. PDPbox: Python Partial Dependence Plot Toolbox.

Karbowiak, Ewelina, and Przemyslaw Biecek. 2019. EIX: Explain Interactions in Gradient Boosting Models.

Kruchten, Philippe. 1998. The Rational Unified Process.

Kuhn, Max, and Kjell Johnson. 2013. Applied Predictive Modeling. New York, NY: Springer.

Kuhn, Max, and Davis Vaughan. 2019. Parsnip: A Common Api to Modeling and Analysis Functions.

Kutner, M. H., C. J. Nachtsheim, J. Neter, and W. Li. 2005. Applied Linear Statistical Models. New York: McGraw-Hill/Irwin.

Landram, F., A. Abdullat, and V. Shah. 2005. “The coefficient of prediction for model specification.” Southwestern Economic Review 32: 149–56.

Larson, Jeff, Surya Mattu, Lauren Kirchner, and Julia Angwin. 2016. “How We Analyzed the Compas Recidivism Algorithm.” ProPublica.

Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343 (6176): 1203–5.

LeDell, Erin, Navdeep Gill, Spencer Aiello, Anqi Fu, Arno Candel, Cliff Click, Tom Kraljevic, et al. 2019. h2o: R Interface for ’H2O’.

Liaw, Andy, and Matthew Wiener. 2002. “Classification and Regression by randomForest.” R News 2 (3): 18–22.

Little, R. J. A., and D. B. Rubin. 2002. Statistical Analysis with Missing Data (2nd Ed.). Hoboken, NJ: Wiley.

Lundberg, Scott. 2019. SHAP (SHapley Additive exPlanations).

Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. 2018. “Consistent Individualized Feature Attribution for Tree Ensembles.” CoRR abs/1802.03888.

Lundberg, Scott M, and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4765–74. Curran Associates, Inc.

Maksymiuk, Szymon, Alicja Gosiewska, and Przemyslaw Biecek. 2019. shapper: Wrapper of Python Library ’shap’.

Kuhn, Max, and Hadley Wickham. 2018. Tidymodels: Easily Install and Load the ’Tidymodels’ Packages.

Meyer, David, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel, and Friedrich Leisch. 2019. E1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien.

Molenberghs, G., and M. G. Kenward. 2007. Missing Data in Clinical Studies. Chichester, England: Wiley.

Molnar, Christoph. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable.

Molnar, Christoph, Bernd Bischl, and Giuseppe Casalicchio. 2018. “iml: An R package for Interpretable Machine Learning.” Journal of Open Source Software 3 (26): 786.

Nagelkerke, N. J. D. 1991. “A note on a general definition of the coefficient of determination.” Biometrika 78: 691–92.

Nolan, Deborah, and Duncan Temple Lang. 2015. Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving. Chapman & Hall/CRC.

O’Connell, Mark, Catherine Hurley, and Katarina Domijan. 2017. “Conditional Visualization for Statistical Models: An Introduction to the Condvis Package in R.” Journal of Statistical Software, Articles 81 (5): 1–20.

Olhede, S., and P. Wolfe. 2018. “The AI spring of 2018.” Significance, May.

O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY, USA: Crown Publishing Group.

Paluszynska, Aleksandra, and Przemyslaw Biecek. 2017. RandomForestExplainer: A Set of Tools to Understand What Is Happening Inside a Random Forest.

Pedersen, Thomas Lin, and Michaël Benesty. 2019. lime: Local Interpretable Model-Agnostic Explanations.

Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, et al. 2011. “Scikit-Learn: Machine Learning in Python.” Journal of Machine Learning Research 12: 2825–30.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44. KDD ’16. New York, NY, USA: ACM.

Ridgeway, Greg. 2017. gbm: Generalized Boosted Regression Models.

Robnik-Šikonja, Marko, and Igor Kononenko. 2008. “Explaining Classifications for Individual Instances.” IEEE Transactions on Knowledge and Data Engineering 20 (5): 589–600.

Robnik-Šikonja, Marko. 2018. ExplainPrediction: Explanation of Predictions for Classification and Regression Models.

Ross, Casey, and Ike Swetlitz. 2018. “IBM’s Watson Supercomputer Recommended ‘Unsafe and Incorrect’ Cancer Treatments, Internal Documents Show.” STAT News.

Rufibach, K. 2010. “Use of Brier score to assess binary predictions.” Journal of Clinical Epidemiology 63: 938–39.

Ruiz, Javier. 2018. “Machine Learning and the Right to Explanation in Gdpr.”

Salzberg, Steven. 2014. “Why Google Flu Is A Failure.” Forbes.

Samek, Wojciech, Thomas Wiegand, and Klaus-Robert Müller. 2017. “Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models.”

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. Boca Raton, FL: Chapman & Hall/CRC.

Shapley, Lloyd S. 1953. “A Value for n-Person Games.” In Contributions to the Theory of Games Ii, edited by Harold W. Kuhn and Albert W. Tucker, 307–17. Princeton: Princeton University Press.

Sheather, Simon. 2009. A Modern Approach to Regression with R. Springer Texts in Statistics. Springer New York.

Shmueli, G. 2010. “To explain or to predict?” Statistical Science 25: 289–310.

Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. 2017. “Learning Important Features Through Propagating Activation Differences.” CoRR abs/1704.02685.

Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. 2013. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” CoRR abs/1312.6034.

Simonyan, Karen, and Andrew Zisserman. 2015. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” In International Conference on Learning Representations.

Sing, T., O. Sander, N. Beerenwinkel, and T. Lengauer. 2005. “ROCR: visualizing classifier performance in R.” Bioinformatics 21 (20): 3940–41.

Sokolova, M., and G. Lapalme. 2009. “A systematic analysis of performance measures for classification tasks.” Information Processing and Management 45: 427–37.

Staniak, Mateusz, Przemyslaw Biecek, Krystian Igras, and Alicja Gosiewska. 2019. localModel: LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles.

Steyerberg, E. W. 2019. Clinical Prediction Models. A Practical Approach to Development, Validation, and Updating. 2nd ed. Cham, Switzerland: Springer.

Steyerberg, E. W., A. J. Vickers, N. R. Cook, T. Gerds, M. Gonen, N. Obuchowski, M. J. Pencina, and M. W. Kattan. 2010. “Assessing the performance of prediction models: a framework for traditional and novel measures.” Epidemiology 21: 128–38.

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. “Sequence to Sequence Learning with Neural Networks.” CoRR abs/1409.3215.

Štrumbelj, Erik, and Igor Kononenko. 2010. “An Efficient Explanation of Individual Classifications Using Game Theory.” Journal of Machine Learning Research 11 (March): 1–18.

Štrumbelj, Erik, and Igor Kononenko. 2014. “Explaining prediction models and individual predictions with feature contributions.” Knowledge and Information Systems 41 (3): 647–65.

Tibshirani, Robert. 1994. “Regression Shrinkage and Selection Via the Lasso.” Journal of the Royal Statistical Society, Series B 58: 267–88.

Todeschini, Roberto. 2020. “Useful and unuseful summaries of regression models.”

Tsoumakas, G., I. Katakis, and I. Vlahavas. 2010. “Mining multi-label data.” In Data Mining and Knowledge Discovery Handbook, 2nd Ed., 667–85. Boston, MA: Springer.

Tufte, Edward R. 1986. The Visual Display of Quantitative Information. Cheshire, CT, USA: Graphics Press.

Tukey, John W. 1977. Exploratory Data Analysis. Addison-Wesley.

van Houwelingen, H.C. 2000. “Validation, calibration, revision and combination of prognostic survival models.” Statistics in Medicine 19: 3401–15.

Van Rossum, Guido, and Fred L. Drake. 2009. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace.

Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth. New York: Springer.

McKinney, Wes. 2012. Python for Data Analysis. 1st ed. O’Reilly Media, Inc.

Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag.

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 1st ed. O’Reilly Media, Inc.

Wikipedia. 2019. CRISP-DM: Cross-industry standard process for data mining.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1): 1–17.

Xie, Yihui. 2018. bookdown: Authoring Books and Technical Documents with R Markdown.