cross_validate() in R
In the process of learning cross-validation in R for my degree in Cognitive Science, I decided to write a cross-validation function that could handle most of the tasks we were faced with. This seemed a good way to learn more about the subject and practice my R skills.
The function can (so far) be used with gaussian and binomial models – lm(), lmer(), glm(), and glmer().
cross_validate() creates balanced folds (so far, they can be balanced on one variable only).
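One way to picture balanced folds is the sketch below. It is not the code inside cross_validate(): the helper name make_balanced_folds() and its arguments are made up for illustration, and it assumes the balancing variable is categorical. Each level of that variable is shuffled and then dealt out across the folds in turn, so every fold gets roughly the same share of each level.

```r
# Hypothetical helper (not part of cross_validate()): assign each row to one
# of k folds so that every level of `balance_col` is spread evenly.
make_balanced_folds <- function(data, k, balance_col) {
  folds <- integer(nrow(data))
  # Row indices grouped by the level of the balancing variable
  for (rows in split(seq_len(nrow(data)), data[[balance_col]])) {
    # Shuffle the rows of this level, then deal them out 1, 2, ..., k, 1, ...
    rows <- rows[sample.int(length(rows))]
    folds[rows] <- rep_len(seq_len(k), length(rows))
  }
  folds
}

set.seed(1)
d <- data.frame(diagnosis = rep(c("a", "b"), times = c(12, 8)))
d$fold <- make_balanced_folds(d, k = 4, balance_col = "diagnosis")

# Each of the 4 folds holds 3 rows of "a" and 2 rows of "b"
fold_counts <- table(d$diagnosis, d$fold)
fold_counts
```

When a level does not divide evenly by k, this sketch always hands the leftover rows to the lowest-numbered folds; a fuller version would rotate the starting fold.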
For every fold it:
- creates a training set and a test set
- trains the model on the training set
- predicts the dependent variable on the test set
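The three steps above can be sketched in base R as follows. This is a minimal illustration rather than the body of cross_validate(): the built-in mtcars data, the formula, and the plain random fold assignment are all assumptions chosen to keep the example short.

```r
set.seed(1)
k <- 5
data <- mtcars

# Plain random folds of near-equal size (no balancing in this sketch)
data$fold <- sample(rep_len(seq_len(k), nrow(data)))
data$predicted <- NA_real_

for (i in seq_len(k)) {
  in_test <- data$fold == i
  train <- data[!in_test, ]   # training set: every fold except fold i
  test <- data[in_test, ]     # test set: fold i only

  # Train on the training set, then predict the dependent variable
  # for the rows the model has not seen
  model <- lm(mpg ~ wt + hp, data = train)
  data$predicted[in_test] <- predict(model, newdata = test)
}

head(data[, c("mpg", "fold", "predicted")])
```

After the loop, every row has exactly one prediction, made by a model that was not trained on that row.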
With gaussian models – lm() and lmer() – it returns the average values of RMSE, r2m, r2c, AIC, and BIC.
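As a rough sketch of the averaging, the example below computes RMSE, AIC, and BIC for each fold and then takes their means. It assumes an lm() model on the built-in mtcars data and leaves out r2m and r2c, which need an extra package. Note the assumption that RMSE is computed on the test set while AIC and BIC come from the model fitted to the training set.

```r
set.seed(1)
k <- 4
folds <- sample(rep_len(seq_len(k), nrow(mtcars)))

# One column of metrics per fold
per_fold <- sapply(seq_len(k), function(i) {
  train <- mtcars[folds != i, ]
  test <- mtcars[folds == i, ]
  model <- lm(mpg ~ wt, data = train)
  predicted <- predict(model, newdata = test)
  c(
    RMSE = sqrt(mean((test$mpg - predicted)^2)),  # error on the held-out fold
    AIC = AIC(model),                             # from the training fit
    BIC = BIC(model)
  )
})

# Average each metric across the folds
averages <- rowMeans(per_fold)
averages
```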
With binomial models – glm() and glmer() – it uses the predictions to make a confusion matrix and a ROC curve. The associated values – Area Under the Curve, Sensitivity, Specificity, etc. – are returned.
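To make these values concrete, here is a small base R sketch with made-up numbers. The vectors stand in for the observed classes of a test set and the probabilities that predict(model, newdata = test, type = "response") would return for a binomial model. The 0.5 cutoff and the rank-based AUC formula are assumptions for illustration; a dedicated ROC package would normally be used instead.

```r
# Made-up test-set data: observed classes and predicted probabilities
observed <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)
predicted_prob <- c(0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.55, 0.45)

# Turn probabilities into classes with an assumed cutoff of 0.5
predicted_class <- as.integer(predicted_prob > 0.5)

# Confusion matrix; fixed factor levels keep both rows and columns
# even if one class is never predicted
confusion <- table(
  predicted = factor(predicted_class, levels = c(0, 1)),
  observed = factor(observed, levels = c(0, 1))
)

sensitivity <- confusion["1", "1"] / sum(confusion[, "1"])  # TP / (TP + FN)
specificity <- confusion["0", "0"] / sum(confusion[, "0"])  # TN / (TN + FP)

# AUC as the share of (positive, negative) pairs ranked correctly,
# computed from the ranks of the predicted probabilities
n_pos <- sum(observed == 1)
n_neg <- sum(observed == 0)
auc <- (sum(rank(predicted_prob)[observed == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)

c(sensitivity = sensitivity, specificity = specificity, auc = auc)
```

With these numbers, four of the five positives and four of the five negatives are classified correctly, so sensitivity and specificity are both 0.8, and the AUC is 0.88.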
For both model types, it counts convergence warnings, which makes it possible to adjust the model or choose another.
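One possible way to count such warnings is sketched below. The helper count_convergence_warnings() is hypothetical, and matching the word "converge" in the warning message is an assumption about how the warnings are worded. To trigger a warning without extra packages, the example deliberately limits glm() to a single iteration.

```r
# Hypothetical helper: run a model-fitting function and count the warnings
# whose message mentions convergence.
count_convergence_warnings <- function(fit_fun) {
  n_warnings <- 0
  model <- withCallingHandlers(
    fit_fun(),
    warning = function(w) {
      if (grepl("converge", conditionMessage(w), ignore.case = TRUE)) {
        n_warnings <<- n_warnings + 1
        # Silence only the warnings that were counted
        invokeRestart("muffleWarning")
      }
    }
  )
  list(model = model, convergence_warnings = n_warnings)
}

# A fit that is stopped after one iteration, so it cannot converge
limited <- count_convergence_warnings(function() {
  glm(am ~ wt, data = mtcars, family = binomial,
      control = glm.control(maxit = 1))
})

# The same model with default settings
regular <- count_convergence_warnings(function() {
  glm(am ~ wt, data = mtcars, family = binomial)
})

c(limited = limited$convergence_warnings,
  regular = regular$convergence_warnings)
```

Inside a cross-validation loop, the counts from each fold could simply be summed.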
cross_validate_list() is used to cross-validate multiple models at once. This yields a dataframe with the previously mentioned values, making model comparison easy.
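The idea of comparing several models in one data frame can be sketched as follows. This does not use cross_validate_list() or its actual arguments: the helper cv_rmse(), the candidate formulas, and the mtcars data are illustrative assumptions, and only RMSE is reported to keep it short.

```r
# Hypothetical helper: mean test-set RMSE of an lm() model over given folds
cv_rmse <- function(formula, data, folds) {
  response <- all.vars(formula)[1]
  fold_rmse <- sapply(sort(unique(folds)), function(i) {
    model <- lm(formula, data = data[folds != i, ])
    predicted <- predict(model, newdata = data[folds == i, ])
    sqrt(mean((data[folds == i, response] - predicted)^2))
  })
  mean(fold_rmse)
}

set.seed(1)
# The same fold assignment is reused for every model
folds <- sample(rep_len(1:4, nrow(mtcars)))
formulas <- c("mpg ~ wt", "mpg ~ wt + hp", "mpg ~ wt * hp")

# One row per model
comparison <- data.frame(
  model = formulas,
  RMSE = unname(sapply(formulas, function(f) {
    cv_rmse(as.formula(f), mtcars, folds)
  }))
)

# Lowest cross-validated error first
comparison[order(comparison$RMSE), ]
```

Reusing one fold assignment for all models means that differences in RMSE come from the models themselves and not from different splits.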
Find the latest versions of the code and the manual here: