Chapter 20 Accumulated Local Profiles

As we showed in the previous chapter, Conditional Dependency Profiles takes into account dependency between features, but it is both advantage and disadvantage. The advantage is that in some cases the dependency is real and should be taken into account when computing expected value of \(f\). The disadvantage is that in the Conditional Dependency Profiles we see both effects of the feature of interest \(x^j\) and other features that are dependent on it. Accumulated Local Profiles disentangle effects of a feature of interest and features correlated with it.

For example, for the apartments dataset one can expect that features like \(surface\) and \(number.of.rooms\) are correlated but we can also imagine that each of these variables affect the apartment price somehow. Partial Dependency Profiles show how the average price changes as a function of surface, keeping all other variables unchanged. Conditional Dependency Profiles show how the average price changes as a function of surface adjusting all other variables to the current value of the surface. Accumulated Local Profiles show how the average price changes as a function of surface adjusting all other variables to the current value of the surface but extracting changes caused by these other features.

Accumulated Local Dependency Profiles presented in the (Apley 2018a) paper.

The general idea is to accumulate local changes in model response affected by single feature \(x^j\).

20.1 Definition

Accumulated Local Profile for a model \(f\) and a variable \(x^j\) is defined as

\[ g_{AL}^{f, j}(z) = \int_{z_0}^z E\left[\frac{\partial f(X^j, X^{-j})}{\partial x_j}|X^j = v\right] dv + c, \] where \(z_0\) if the lower boundry of \(x^j\). The profile \(g_{AL}^{f, j}(z)\) is calculated up to some constant \(c\). Usually the constant \(c\) is selected to keep average \(g_{AL}^{f, j}\) equal to 0 or average \(f\).

The equation may be a bit complex, but the intuition is not that complicated. Instead of aggregation of Ceteris Paribus we just look locally how quickly CP profiles are changing. And AL profile is reconstructed from such local partial changes.

So it’s an cummulated expected change of the model response along where the expected values are calculated over conditional distribution \((X^j,X^{-j})|X^j=v\).

Exercise

Let \(f = x_1 + x_2\) and distribuion of \((x_1, x_2)\) is given by \(x_1 \sim U[0,1]\) and \(x_2=x_1\).

Calculate \(g_{AD}^{f, 1}(z)\).

Answer \(g_{AD}^{f, 1}(z) = z\).

Accumulated Local Effects for 100 observations

(#fig:ale_part_1)Accumulated Local Effects for 100 observations

References

Apley, Dan. 2018a. ALEPlot: Accumulated Local Effects (Ale) Plots and Partial Dependence (Pd) Plots. https://CRAN.R-project.org/package=ALEPlot.