Explainable Machine Learning for Modeling of Early Postoperative Mortality in Lung Cancer

Published in Artificial Intelligence in Medicine, 2020

Recent years have seen growing interest in applying complex machine learning methods to medical problems. Black-box models based on deep neural networks or ensembles are increasingly popular in diagnostics, personalized medicine (Hamet and Tremblay 2017), and screening studies (Scheeder et al. 2018), partly because they are accurate and easy to train. Nevertheless, such models can be hard to understand and interpret. In high-stakes decisions, especially in medicine, understanding the factors that drive model decisions is crucial, and a lack of such understanding creates a serious risk in applications.

In our study, we propose and validate new approaches to the exploration and explanation of predictive models for early postoperative mortality in lung cancer patients. The models are built on the Domestic Lung Cancer Database run by the National Institute of Tuberculosis and Lung Diseases. We show how explainable machine learning techniques can be used to combine data-driven signals with domain knowledge. Additionally, we explore whether the insights provided by model explainers give physicians valuable information.