3.1 Model performance

As you may remember from the previous chapter, the root mean square of residuals is nearly identical for the two models considered. Does this mean that the models are equally good?

## [1] 283.0865
## [1] 286.5357
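
To see why two similar values of the root mean square do not settle the question, consider a minimal sketch in base R. The two residual vectors below are made up for illustration and are not taken from the apartments models; `rmse()` is a helper defined here, not a DALEX function.

```r
# Root mean square of a vector of residuals.
rmse <- function(residuals) sqrt(mean(residuals^2))

# Two made-up residual vectors, chosen only for illustration.
res_even    <- c(-3, 3, -3, 3)  # every error has the same size
res_outlier <- c(0, 0, 0, 6)    # mostly perfect, one large error

rmse(res_even)     # 3
rmse(res_outlier)  # 3

# Same root mean square, but the typical error differs.
median(abs(res_even))     # 3
median(abs(res_outlier))  # 0
```

Both vectors share the same root mean square, although the second one is exact for three out of four observations. A single summary number hides this difference, which is why the rest of this section looks at the whole distribution of residuals.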

The model_performance() function calculates predictions and residuals for the validation dataset apartmentsTest.
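
A possible way to obtain these objects is sketched below. The model formulas, the seed, the labels, and the object names (`apartments_lm`, `explainer_lm`, `mp_lm`, and so on) are assumptions made for this sketch; the models from the previous chapter may have been fitted differently.

```r
library("DALEX")
library("randomForest")

# Assumption: both models use all available predictors of m2.price;
# the previous chapter may use a different formula.
apartments_lm <- lm(m2.price ~ ., data = apartments)
set.seed(59)  # arbitrary seed, only to make the forest reproducible
apartments_rf <- randomForest(m2.price ~ ., data = apartments)

# Wrap each model together with the validation data and the true response.
explainer_lm <- explain(apartments_lm, data = apartmentsTest,
                        y = apartmentsTest$m2.price, label = "lm")
explainer_rf <- explain(apartments_rf, data = apartmentsTest,
                        y = apartmentsTest$m2.price, label = "randomForest")

# Predictions and residuals on apartmentsTest for each model.
mp_lm <- model_performance(explainer_lm)
mp_rf <- model_performance(explainer_rf)
```

Objects such as `mp_lm` and `mp_rf` are what the generic print() and plot() functions are applied to in the remainder of this section.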

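The generic print() function returns quantiles of the residuals. In the output below, the first block belongs to the linear model and the second to the random forest.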

##        0%       10%       20%       30%       40%       50%       60% 
## -472.3560 -423.9131 -398.2811 -370.8841  161.2473  174.0677  184.1412 
##       70%       80%       90%      100% 
##  195.8834  209.2460  221.4659  257.2555
##           0%          10%          20%          30%          40% 
## -1262.554308  -408.920183  -197.591180   -89.661883    -7.454146 
##          50%          60%          70%          80%          90% 
##    55.441061   108.398858   157.924244   218.241574   294.264602 
##         100% 
##   727.445065
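
The deciles above have the same layout as the output of base R's quantile(). The sketch below reproduces that layout on synthetic residuals; it illustrates the summary itself and is not necessarily how the print() method is implemented.

```r
# Synthetic residuals standing in for observed minus predicted values.
set.seed(1)
residuals_demo <- rnorm(1000, mean = 0, sd = 280)

# Deciles from the minimum (0%) to the maximum (100%).
deciles <- quantile(residuals_demo, probs = seq(0, 1, by = 0.1))
deciles
```

Reading the deciles side by side for two models shows where the bulk of the residuals lies and how far the extremes reach, which a single root mean square cannot convey.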

The generic plot() function shows the reversed empirical cumulative distribution function of the absolute values of residuals, that is, the fraction of residuals whose absolute value is larger than x. The figure below shows that the majority of residuals for the random forest are smaller than the residuals for the linear model, yet a small fraction of very large residuals inflates its root mean square.

(#fig:global_explain_ecdf)Comparison of residuals for linear model and random forest
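
The quantity drawn in this plot can be computed by hand. The following base R sketch uses a hypothetical helper, `reversed_ecdf()`, and made-up residuals; it is meant to clarify the definition, not to reproduce the plotting code.

```r
# Fraction of residuals whose absolute value exceeds each threshold x.
reversed_ecdf <- function(residuals, x) {
  abs_res <- abs(residuals)
  vapply(x, function(threshold) mean(abs_res > threshold), numeric(1))
}

# Made-up residuals: mostly small errors plus one very large one.
res_demo <- c(-20, 15, 30, -10, 5, -900)

# Starts at 1 for x = 0 and falls to 0 beyond the largest absolute residual.
reversed_ecdf(res_demo, x = c(0, 10, 25, 100, 1000))
```

The same values can be obtained as `1 - ecdf(abs(res_demo))(x)`. A curve that drops quickly means most residuals are small; a long tail to the right signals a few very large ones.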

Pass geom = "boxplot" to the generic plot() function to get an alternative comparison of residuals. The red dot marks the root mean square of residuals.

(#fig:global_explain_boxplot)Comparison of residuals for linear model and random forest
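
A comparable picture can be drawn with base graphics. The sketch below assumes that the boxes summarise absolute residuals and uses synthetic data for two hypothetical models (`model_a`, `model_b`); it is not the plotting code of the package.

```r
# Synthetic residuals for two hypothetical models.
set.seed(2)
res_a <- rnorm(500, sd = 280)                           # errors of similar size
res_b <- c(rnorm(475, sd = 150), rnorm(25, sd = 1000))  # small errors, a few large

abs_res <- list(model_a = abs(res_a), model_b = abs(res_b))

# Root mean square of residuals for each model.
rmse <- vapply(abs_res, function(r) sqrt(mean(r^2)), numeric(1))

# Horizontal boxplots of absolute residuals, one box per model.
boxplot(abs_res, horizontal = TRUE, xlab = "absolute residual")
# Red dot at each model's root mean square of residuals.
points(x = rmse, y = seq_along(rmse), col = "red", pch = 19)
```

When the red dot sits far to the right of the box, as it may for the second synthetic model, a few large residuals dominate the root mean square while the typical residual stays small.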
