groupdata2 is my first R package. It aims to make grouping and splitting data easy, and offers a wide range of grouping methods for different contexts. It also contains a function for creating balanced folds for cross-validation.
group_factor() returns a factor with group numbers, e.g. (1,1,1,2,2,2,3,3,3).
This can be used to subset, aggregate, group_by, etc.
Create equally sized groups by setting force_equal = TRUE.
Randomize the grouping factor by setting randomize = TRUE.
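To make the idea of a grouping factor concrete, here is a minimal base R sketch. It builds the factor by hand as a stand-in for what group_factor() returns, so it is not the package's implementation. The vector `x` and the variable names are made up for illustration, and shuffling with `sample()` is only an assumption about what a randomized grouping factor looks like.

```r
# Hypothetical data: nine values to be grouped.
x <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)

# Hand-built stand-in for a grouping factor: 1 1 1 2 2 2 3 3 3.
groups <- factor(rep(1:3, each = 3))

# Subset: keep only the values that belong to group 2.
group_two <- x[groups == 2]

# Aggregate: compute the mean of each group.
group_means <- tapply(x, groups, mean)

# Shuffle the group labels while keeping the group sizes
# (an assumed illustration of a randomized grouping factor).
set.seed(1)
shuffled <- sample(groups)
```

The shuffled factor still has three members per group; only the assignment of elements to groups changes.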
Returns the given data as a dataframe with an added grouping factor made with group_factor(). The dataframe is grouped by the grouping factor for easy use in dplyr pipelines.
Creates the specified groups with group_factor() and splits the given data by the grouping factor with base::split. Returns the splits in a list.
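The splitting step can be sketched with base R alone. The grouping factor below is built by hand in place of a group_factor() call, and the dataframe `df` and its columns are hypothetical.

```r
# Hypothetical data: six rows to be split into three groups.
df <- data.frame(
  id = 1:6,
  value = c(2.5, 3.1, 4.0, 1.2, 5.6, 0.7)
)

# Hand-built stand-in for the grouping factor: 1 1 2 2 3 3.
groups <- factor(rep(1:3, each = 2))

# base::split returns a list with one dataframe per factor level.
splits <- split(df, groups)

# Elements are named after the levels, so a group is accessed by name.
second_group <- splits[["2"]]
```

Because the result is a plain list, it works directly with lapply() and similar functions.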
Creates (optionally) balanced folds for use in cross-validation. Balance folds on one categorical variable and/or make sure that all datapoints sharing an ID are in the same fold.
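The ID constraint is the part that is easy to get wrong by hand, so here is a rough base R sketch of it. It assigns folds to unique IDs rather than to rows, which is one simple way to keep each ID inside a single fold. This is an illustration under my own assumptions, not the package's algorithm, and it does not balance on a categorical variable. The dataframe and the names `participant`, `score` and `k` are hypothetical.

```r
set.seed(42)

# Hypothetical data: six participants with three rows each.
df <- data.frame(
  participant = rep(c("a", "b", "c", "d", "e", "f"), each = 3),
  score = rnorm(18),
  stringsAsFactors = FALSE
)

k <- 3  # number of folds

# Assign folds at the ID level: shuffle the unique IDs,
# then deal out the fold numbers 1..k in turn.
ids <- unique(df$participant)
id_folds <- setNames(rep_len(seq_len(k), length(ids)), sample(ids))

# Map each row to the fold of its ID.
df$fold <- factor(unname(id_folds[df$participant]))

# Cross-validation loop: hold out one fold at a time.
test_sizes <- integer(0)
for (f in levels(df$fold)) {
  train <- df[df$fold != f, ]
  test <- df[df$fold == f, ]
  test_sizes <- c(test_sizes, nrow(test))
}
```

Since every row inherits the fold of its ID, no participant ever appears in both the training and the test set of the same iteration.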
There is a wide range of methods for creating the groups, and more are on their way.