The cvms package is useful for cross-validating a list of linear and logistic regression model formulas in R. To speed up the process, I’ve added the option to cross-validate the models in parallel. In this post, I will walk you through a simple example and introduce the combine_predictors() function, which generates model formulas by combining a list of fixed effects. We will be using the simple participant.scores dataset from cvms.
I have spent the last couple of days adding functionality for performing repeated cross-validation to cvms and groupdata2. In this quick post I will show an example.
(Please note: At the moment, you need to use the github version of groupdata2. I hope to update it on CRAN this month.)
In cross-validation, we split our training set into a number (often denoted “k”) of groups called folds. We repeatedly train our machine learning model on k-1 folds and test it on the last fold, such that each fold becomes test set once.