This function selects a subset of rows from a data set, namely the neighbours of a given observation. This is useful when the data set is large and only a sample is needed to calculate profiles.

select_neighbours(data, observation, variables = NULL, distance = gower::gower_dist, n = 20, frac = NULL)

Argument | Description |
---|---|
data | set of observations |
observation | single observation |
variables | variables used to calculate the distance. By default, all variables present in both `data` and `observation` |
distance | distance function; by default `gower::gower_dist` |
n | number of neighbours to select |
frac | if `n` is not specified (`NULL`), the number of neighbours is calculated as `frac` * number of rows in `data`. Either `n` or `frac` must be specified. |
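The interplay of the arguments above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `nearest_rows` and `euclid_dist` are hypothetical names, the Euclidean distance stands in for the documented default `gower::gower_dist`, and rounding `frac` * number of rows up to a whole number is an assumption.

```r
# Sketch only: a base-R stand-in for the selection logic described above.
# `nearest_rows` and `euclid_dist` are hypothetical names, not package functions.

# Stand-in distance: Euclidean distance from one observation to each row,
# using numeric columns only (the documented default is gower::gower_dist).
euclid_dist <- function(observation, data) {
  num <- names(data)[vapply(data, is.numeric, logical(1))]
  x <- unlist(observation[1, num, drop = FALSE])
  m <- as.matrix(data[, num, drop = FALSE])
  sqrt(rowSums(sweep(m, 2, x)^2))
}

nearest_rows <- function(data, observation, variables = NULL,
                         distance = euclid_dist, n = 20, frac = NULL) {
  # Default: every variable present in both `data` and `observation`.
  if (is.null(variables)) {
    variables <- intersect(colnames(data), colnames(observation))
  }
  # If `n` is NULL, derive it from `frac` (rounding up is an assumption).
  if (is.null(n)) {
    if (is.null(frac)) stop("Either `n` or `frac` must be specified.")
    n <- ceiling(frac * nrow(data))
  }
  d <- distance(observation[, variables, drop = FALSE],
                data[, variables, drop = FALSE])
  # Keep the n rows with the smallest distance to the observation.
  data[head(order(d), n), , drop = FALSE]
}

# Toy data: five apartments described by two numeric variables.
toy <- data.frame(surface = c(25, 30, 80, 120, 27), floor = c(3, 2, 5, 1, 3))
obs <- data.frame(surface = 25.5, floor = 3)

near_n    <- nearest_rows(toy, obs, n = 2)                # fixed count
near_frac <- nearest_rows(toy, obs, n = NULL, frac = 0.3) # share of rows
```

Passing `n = NULL` explicitly is what lets `frac` take effect in this sketch, since `n` otherwise defaults to 20.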

A data frame with the selected rows.

Note that `select_neighbours` is an S3 generic function.
If you want to work on non-standard data sources (such as H2O ddf or external databases),
you should overload it.
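What such an overload might look like can be sketched with a self-contained stand-in. Everything named here is hypothetical: `neighbours_generic` imitates an S3 generic that dispatches on the class of `data`, and `csv_source` is an invented class for data kept outside an R data frame. A real method would instead be named after the package generic and the class of your data source, assuming the installed version dispatches that way.

```r
# Sketch only. `neighbours_generic` is a hypothetical stand-in for an S3
# generic that dispatches on the class of `data`; "csv_source" is a
# hypothetical class wrapping data that lives outside an R data frame.
neighbours_generic <- function(data, observation, ...) {
  UseMethod("neighbours_generic")
}

# Default method for plain data frames. Rows are ranked by the summed
# absolute difference over shared columns (a stand-in for the real distance;
# it assumes all shared columns are numeric).
neighbours_generic.default <- function(data, observation, n = 20, ...) {
  vars <- intersect(colnames(data), colnames(observation))
  d <- rowSums(abs(sweep(as.matrix(data[, vars, drop = FALSE]), 2,
                         unlist(observation[1, vars]))))
  data[head(order(d), n), , drop = FALSE]
}

# Method for the external source: materialise the rows, then reuse the default.
neighbours_generic.csv_source <- function(data, observation, ...) {
  rows <- utils::read.csv(data$path)
  neighbours_generic.default(rows, observation, ...)
}

# Usage with a temporary CSV file standing in for an external database.
path <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(surface = c(25, 80, 27), floor = c(3, 5, 3)),
                 path, row.names = FALSE)
src <- structure(list(path = path), class = "csv_source")
picked <- neighbours_generic(src, data.frame(surface = 25.5, floor = 3), n = 2)
```

Loading all rows before ranking keeps the sketch short. For a genuinely large external source, the method would more plausibly push the distance computation or a pre-filter to the source itself.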

library("DALEX")
new_apartment <- apartments[1, 2:6]
small_apartments <- select_neighbours(apartmentsTest, new_apartment, n = 10)
new_apartment
#>   construction.year surface floor no.rooms    district
#> 1              1953      25     3        1 Srodmiescie
small_apartments
#>      m2.price construction.year surface floor no.rooms    district
#> 2285     5875              1970      27     3        1 Srodmiescie
#> 1073     5886              1960      36     2        1 Srodmiescie
#> 8110     5614              1957      44     4        1 Srodmiescie
#> 9527     6080              1947      27     1        1 Srodmiescie
#> 3261     5859              1945      39     2        1 Srodmiescie
#> 4309     5794              1947      31     3        2 Srodmiescie
#> 1198     5821              1947      43     2        1 Srodmiescie
#> 6647     5952              1938      30     2        1 Srodmiescie
#> 4027     6457              1926      29     3        1 Srodmiescie
#> 2655     5596              1950      25     6        1 Srodmiescie