This function selects a subset of rows from a data set. This is useful when the data is large and only a sample is needed to calculate profiles.

select_neighbours(data, observation, variables = NULL,
distance = gower::gower_dist, n = 20, frac = NULL)

## Arguments

data: a data set of observations.

observation: a single observation.

variables: variables that shall be used to calculate the distance. By default these are all variables present in both data and observation.

distance: the distance function; by default the gower_dist function.

n: the number of neighbours to select.

frac: the fraction of rows to select. If n is not specified (NULL), the number of neighbours is calculated as frac * number of rows in data. Either n or frac must be specified.
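To make the interplay of n, frac and distance concrete, here is a minimal base R sketch. It is not the package's implementation: `sketch_select_neighbours` and `manhattan_dist` are hypothetical names, the Manhattan distance is a simple stand-in for gower_dist so the example runs without extra packages, and rounding frac * number of rows up with `ceiling()` is an assumption.

```r
# Stand-in distance (an assumption, used instead of gower::gower_dist):
# Manhattan distance between a one-row data frame x and every row of y.
manhattan_dist <- function(x, y) {
  rowSums(abs(sweep(as.matrix(y), 2, unlist(x[1, , drop = FALSE]))))
}

# Hypothetical helper that mirrors the documented arguments.
sketch_select_neighbours <- function(data, observation, variables = NULL,
                                     distance = manhattan_dist,
                                     n = 20, frac = NULL) {
  # By default use all variables present in both data and observation
  if (is.null(variables)) {
    variables <- intersect(colnames(data), colnames(observation))
  }
  # If n is NULL, derive it from frac (rounding up is an assumption)
  if (is.null(n)) {
    if (is.null(frac)) stop("Either n or frac must be specified")
    n <- ceiling(nrow(data) * frac)
  }
  # Distance from the observation to every row, then keep the n closest rows
  distances <- distance(observation[, variables, drop = FALSE],
                        data[, variables, drop = FALSE])
  data[head(order(distances), n), , drop = FALSE]
}

toy <- data.frame(surface = c(25, 60, 30, 120, 27),
                  floor   = c(3, 1, 4, 10, 2))
new_obs <- data.frame(surface = 26, floor = 3)

by_n    <- sketch_select_neighbours(toy, new_obs, n = 2)
by_frac <- sketch_select_neighbours(toy, new_obs, n = NULL, frac = 0.5)
```

With five rows and frac = 0.5, the sketch keeps three rows; with n = 2 it keeps the two rows nearest to `new_obs`.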

## Value

A data frame with the selected rows.

## Details

Note that select_neighbours is an S3 generic. To work with non-standard data sources (such as H2O ddf or external databases), you should provide your own method for it.
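The following sketch shows what adding a method for a custom data source could look like. It defines a stand-in generic, `select_neighbours_sketch`, so that it runs without the package. The `remote_table` class, its `fetch` element, and dispatching on the class of `data` are all assumptions made for illustration, and the default method uses a plain Manhattan distance rather than gower_dist.

```r
# Stand-in generic (hypothetical); dispatch on the class of `data` is assumed
select_neighbours_sketch <- function(data, observation, ...) {
  UseMethod("select_neighbours_sketch")
}

# Fallback for ordinary data frames: keep the n rows closest to observation,
# using Manhattan distance on the shared columns
select_neighbours_sketch.default <- function(data, observation, n = 20, ...) {
  vars <- intersect(colnames(data), colnames(observation))
  d <- rowSums(abs(sweep(as.matrix(data[, vars, drop = FALSE]), 2,
                         unlist(observation[1, vars, drop = FALSE]))))
  data[head(order(d), n), , drop = FALSE]
}

# Method for a hypothetical "remote_table" class that wraps a fetch function
select_neighbours_sketch.remote_table <- function(data, observation,
                                                  n = 20, ...) {
  local_copy <- data$fetch()  # pull the rows from the external source
  select_neighbours_sketch(local_copy, observation, n = n, ...)
}

# A fake external source that returns a small data frame when queried
remote <- structure(
  list(fetch = function() {
    data.frame(surface = c(25, 60, 30), floor = c(3, 1, 4))
  }),
  class = "remote_table"
)

result <- select_neighbours_sketch(remote,
                                   data.frame(surface = 26, floor = 3),
                                   n = 1)
```

Fetching every row locally is only for brevity. A method for a real external source would more plausibly compute distances where the data lives and return just the selected rows.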

## Examples

library("DALEX")

new_apartment <- apartments[1, 2:6]
small_apartments <- select_neighbours(apartmentsTest, new_apartment, n = 10)
new_apartment
#>   construction.year surface floor no.rooms    district
#> 1              1953      25     3        1 Srodmiescie
small_apartments
#>      m2.price construction.year surface floor no.rooms    district
#> 2285     5875              1970      27     3        1 Srodmiescie
#> 1073     5886              1960      36     2        1 Srodmiescie
#> 8110     5614              1957      44     4        1 Srodmiescie
#> 9527     6080              1947      27     1        1 Srodmiescie
#> 3261     5859              1945      39     2        1 Srodmiescie
#> 4309     5794              1947      31     3        2 Srodmiescie
#> 1198     5821              1947      43     2        1 Srodmiescie
#> 6647     5952              1938      30     2        1 Srodmiescie
#> 4027     6457              1926      29     3        1 Srodmiescie
#> 2655     5596              1950      25     6        1 Srodmiescie