The archivist package allows you to store, restore, and search for R objects in repositories stored on disk. An object can be found in several ways: by its name, its date of creation, or its metadata. The package is designed mainly as a repository of artifacts, but it suits other use cases as well.
Let’s see how it can be used as a caching engine.
Consider a function with a few arguments whose evaluation may take a significant amount of time. If there is a chance that the function will be executed with the same parameters more than once, it is desirable to cache the results to avoid unnecessary evaluations.
Such a cache can be easily constructed with the archivist package.
Let’s see an example. The getMaxDistribution function summarizes the distribution of the maximum of N draws of a random variable from distribution D, estimated over R replications.
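The description above can be sketched in base R as follows. This is a minimal reconstruction from the prose, not necessarily the original definition: the default values for D, N and R are assumptions, and R is kept moderate so the example runs in about a second.

```r
# D: a random generator whose first argument is the sample size (e.g. rnorm)
# N: number of draws per replication
# R: number of replications (the default here is an assumption)
getMaxDistribution <- function(D = rnorm, N = 10, R = 100000) {
  # For each replication, draw N values from D and keep the maximum.
  maxima <- replicate(R, max(D(N)))
  # Summarize the simulated distribution of the maximum.
  summary(maxima)
}

# Time a single, uncached evaluation.
system.time(res <- getMaxDistribution(rnorm, 10))
res
```

Raising R makes the estimate smoother and the call slower, which is the kind of cost that caching is meant to avoid.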
Now, let’s load the archivist package and prepare a repository for cached objects.
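A minimal sketch of this step, assuming a recent archivist release in which a local repository is created with createLocalRepo (older releases exposed the same step as createEmptyRepo). The temporary directory is an illustrative choice; any writable folder should work.

```r
library(archivist)

# A temporary directory keeps the sketch free of side effects;
# a persistent path would be used for a cache that should survive the session.
cacheRepo <- tempfile("cacheRepo")

# Initialize an empty local repository in that directory.
createLocalRepo(repoDir = cacheRepo)
```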
cacheRepo is a folder that holds the results of already evaluated function calls. How do we use it?
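One way such a call might look, assuming archivist's cache function takes the repository first, then the function, then that function's arguments. The definition of getMaxDistribution and its defaults are repeated here as assumptions so the sketch is self-contained.

```r
library(archivist)

# Assumed definition of the expensive function (defaults are illustrative).
getMaxDistribution <- function(D = rnorm, N = 10, R = 100000) {
  summary(replicate(R, max(D(N))))
}

# A fresh repository for cached results.
cacheRepo <- tempfile("cacheRepo")
createLocalRepo(repoDir = cacheRepo)

# First call: the repository is empty, so the function is evaluated
# and its result is stored.
system.time(first <- cache(cacheRepo, getMaxDistribution, rnorm, 10))

# Second call with the same function and arguments.
system.time(second <- cache(cacheRepo, getMaxDistribution, rnorm, 10))
```

Comparing the two timings is the simplest way to check whether the second call was served from the repository.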
The second evaluation of getMaxDistribution is much, much faster: the results are simply read from disk.
The cache creates an MD5 signature of the function FUN and its arguments and uses this signature as a key. If the key is present in the cache repository, the object is simply restored. If it is not present, the call is evaluated and the result is stored. Note that if cacheRepo is a shared folder, you get a shared cache repository!
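The key-lookup idea can be illustrated with a small base R sketch. This is not archivist's implementation: simpleCache is a hypothetical helper, and hashing a deparsed copy of the function together with the serialized arguments is an assumption made for the illustration.

```r
# Hypothetical helper (not part of archivist): a file-backed cache keyed by
# an MD5 signature of the function and its arguments.
simpleCache <- function(cacheDir, FUN, ...) {
  dir.create(cacheDir, showWarnings = FALSE, recursive = TRUE)

  # Build the signature: serialize the function text and the arguments,
  # write the bytes to a temporary file, and hash that file.
  sigFile <- tempfile()
  on.exit(unlink(sigFile))
  writeBin(serialize(list(fun = deparse(FUN), args = list(...)), NULL), sigFile)
  key <- unname(tools::md5sum(sigFile))

  target <- file.path(cacheDir, paste0(key, ".rds"))
  if (file.exists(target)) {
    return(readRDS(target))  # key present: restore the stored object
  }
  result <- FUN(...)         # key absent: evaluate the call
  saveRDS(result, target)    # and store the result under its key
  result
}

cacheDir <- tempfile("simpleCache")
simpleCache(cacheDir, sum, 1, 2, 3)  # evaluated and stored
simpleCache(cacheDir, sum, 1, 2, 3)  # restored from disk
```

Because the key depends only on the function and its arguments, any process that can read cacheDir would find the same entry, which is what makes a shared folder act as a shared cache.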