This function is deprecated. Use feature_importance() instead. See https://pbiecek.github.io/PM_VEE/variableImportance.html for guidance on how to use these functions.
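As a minimal sketch of the replacement call, assuming a current version of DALEX that exports feature_importance() and the HR data used in the Examples below:

# hedged sketch of the replacement; feature_importance() is assumed to be
# available from current DALEX, mirroring this function's interface
library("DALEX")
library("randomForest")
HR_rf_model <- randomForest(status == "fired" ~ ., data = HR, ntree = 100)
explainer_rf <- explain(HR_rf_model, data = HR, y = HR$status == "fired")
fi_rf <- feature_importance(explainer_rf, type = "raw")
plot(fi_rf)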

Usage

variable_importance(explainer, loss_function = loss_sum_of_squares, ...,
  type = "raw", n_sample = 1000)

Arguments

explainer

a model to be explained, preprocessed by the 'explain' function

loss_function

a function that will be used to assess variable importance

...

other parameters

type

character, the type of transformation that should be applied to the dropout loss. 'raw' returns raw drop losses, 'ratio' returns drop_loss/drop_loss_full_model, while 'difference' returns drop_loss - drop_loss_full_model (see the sketch after this argument list)

n_sample

number of observations that should be sampled for the calculation of variable importance. If negative, variable importance will be calculated on the whole dataset (no sampling).
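To make the loss_function and type arguments concrete, here is a short sketch; the loss below mirrors DALEX's loss_sum_of_squares, and the numeric dropout losses are hypothetical:

# the default loss sums squared residuals over the (sampled) observations
loss_sum_of_squares <- function(observed, predicted) {
  sum((observed - predicted)^2)
}

# hypothetical dropout losses, illustrating the 'type' transformations
drop_loss            <- 150.57   # loss after permuting a single variable
drop_loss_full_model <- 120.91   # loss of the intact full model
drop_loss - drop_loss_full_model   # reported when type = "difference"
drop_loss / drop_loss_full_model   # reported when type = "ratio"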

Value

An object of the class 'variable_importance_explainer'. It's a data frame with the calculated dropout loss for each variable.

References

Biecek P. Predictive Models: Visual Exploration, Explanation and Debugging. https://pbiecek.github.io/PM_VEE/

Examples

library("breakDown") library("randomForest") HR_rf_model <- randomForest(status == "fired"~., data = HR, ntree = 100)
#> Warning: The response has five or fewer unique values. Are you sure you want to do regression?
explainer_rf <- explain(HR_rf_model, data = HR, y = HR$status == "fired")
vd_rf <- variable_importance(explainer_rf, type = "raw")
vd_rf
#>       variable dropout_loss        label
#> 1 _full_model_     120.9083 randomForest
#> 2       status     120.9083 randomForest
#> 3   evaluation     136.0882 randomForest
#> 4       gender     141.8207 randomForest
#> 5          age     144.5874 randomForest
#> 6       salary     150.5714 randomForest
#> 7        hours     207.6025 randomForest
#> 8   _baseline_     277.3188 randomForest
HR_glm_model <- glm(status == "fired" ~ ., data = HR, family = "binomial")
explainer_glm <- explain(HR_glm_model, data = HR, y = HR$status == "fired")
logit <- function(x) exp(x) / (1 + exp(x))
vd_glm <- variable_importance(explainer_glm, type = "raw",
  loss_function = function(observed, predicted) sum((observed - logit(predicted))^2))
vd_glm
#>       variable dropout_loss label
#> 1 _full_model_     256.2435    lm
#> 2       salary     256.1909    lm
#> 3       status     256.2435    lm
#> 4       gender     256.2675    lm
#> 5          age     256.3097    lm
#> 6   evaluation     259.2151    lm
#> 7        hours     279.2908    lm
#> 8   _baseline_     281.2700    lm
library("xgboost") model_martix_train <- model.matrix(status == "fired" ~ .-1, HR) data_train <- xgb.DMatrix(model_martix_train, label = HR$status == "fired") param <- list(max_depth = 2, eta = 1, silent = 1, nthread = 2, objective = "binary:logistic", eval_metric = "auc") HR_xgb_model <- xgb.train(param, data_train, nrounds = 50) explainer_xgb <- explain(HR_xgb_model, data = model_martix_train, y = HR$status == "fired", label = "xgboost") vd_xgb <- variable_importance(explainer_xgb, type = "raw") vd_xgb
#>       variable dropout_loss   label
#> 1 _full_model_     92.38626 xgboost
#> 2   gendermale     92.38626 xgboost
#> 3   evaluation    109.72229 xgboost
#> 4 genderfemale    140.22323 xgboost
#> 5          age    155.11231 xgboost
#> 6       salary    155.69566 xgboost
#> 7        hours    237.72157 xgboost
#> 8   _baseline_    358.49992 xgboost
plot(vd_xgb)
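The plot() method can also overlay several variable importance objects in a single chart, which makes it easy to compare models side by side; a hedged sketch using the objects computed above (this assumes the DALEX plot method accepts multiple objects and that their loss functions are comparable):

# hedged sketch: juxtaposing importance from two of the models fitted above
plot(vd_rf, vd_xgb)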