split-methods {rebmix} | R Documentation |
Returns (invisibly) the object containing train and test observations $\bm{y}_{1}, \ldots, \bm{y}_{n}$ as well as the true class membership $\bm{\Omega}_{g}$ for the test dataset.
## S4 method for signature 'numeric'
split(p = 0.75, Dataset = data.frame(), class = numeric(), ...)
## S4 method for signature 'list'
split(p = list(), Dataset = data.frame(), class = numeric(), ...)
## ... and for other signatures
p |
see Methods section below. |
Dataset |
a data frame containing dataset Y of length n. For the dataset the corresponding class membership $\bm{\Omega}_{g}$ is known. The default value is data.frame().
class |
a column number in |
... |
further arguments to |
Returns an object of class RCLS.chunk.
signature(p = "numeric")
a number specifying the fraction of observations used for training, $0.0 \leq p \leq 1.0$. The default value is 0.75.
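To make the fraction argument concrete, here is a minimal base R sketch of a fraction-based train/test split. It is not the rebmix implementation: split_fraction is a hypothetical helper, the returned list (train, test, Zt) is an illustrative stand-in for the RCLS.chunk object, and sampling across the whole dataset rather than within each class is an assumption.

```r
# Hypothetical helper, not part of rebmix: base R sketch of a fraction split.
split_fraction <- function(Dataset, class, p = 0.75) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 0.0, p <= 1.0)
  n <- nrow(Dataset)
  # Randomly mark floor(p * n) rows as training rows (assumption: no stratification).
  is_train <- seq_len(n) %in% sample.int(n, size = floor(p * n))
  list(
    train = Dataset[is_train, -class, drop = FALSE],   # training observations
    test  = Dataset[!is_train, -class, drop = FALSE],  # test observations
    Zt    = Dataset[!is_train, class]                  # true classes of the test rows
  )
}

set.seed(5)
parts <- split_fraction(iris, class = 5, p = 0.75)
nrow(parts$train)  # floor(0.75 * 150) = 112 rows
nrow(parts$test)   # the remaining 38 rows
```

A logical mask is used instead of negative indices so that the edge cases p = 0 and p = 1 still return well-formed (possibly empty) subsets.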
signature(p = "list")
a list composed of the column number p$type in Dataset containing the type membership information, followed by the corresponding train p$train and test p$test values. The default value is list().
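The list form can be pictured as selecting rows by a pre-assigned type column. The following base R sketch illustrates that idea only; split_by_type is a hypothetical helper, not a rebmix function, and the returned list is an assumed stand-in for the RCLS.chunk object.

```r
# Hypothetical helper, not part of rebmix: split rows by a type column.
split_by_type <- function(Dataset, class, p) {
  stopifnot(is.list(p), all(c("type", "train", "test") %in% names(p)))
  type <- as.character(Dataset[[p$type]])
  omit <- c(class, p$type)  # columns that hold labels, not observations
  list(
    train = Dataset[type == p$train, -omit, drop = FALSE],
    test  = Dataset[type == p$test, -omit, drop = FALSE],
    Zt    = Dataset[type == p$test, class]  # true classes of the test rows
  )
}

# Small illustrative dataset (made up for this sketch).
Dataset <- data.frame(
  y = 1:6,
  class = c("A", "A", "B", "B", "A", "B"),
  type = c("train", "train", "train", "test", "test", "train")
)

parts <- split_by_type(Dataset, class = 2,
  p = list(type = 3, train = "train", test = "test"))
parts$train  # rows 1, 2, 3 and 6, column y only
parts$test   # rows 4 and 5, column y only
```

Rows whose type matches neither p$train nor p$test are simply left out of both subsets in this sketch.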
Marko Nagode
data("iris")

# Split dataset into train (75%) and test (25%) subsets.

set.seed(5)

Iris <- split(p = 0.75, Dataset = iris, class = 5)

Iris

# Generate simulated dataset.

N <- 1000

class <- c(rep("A", 0.4 * N), rep("B", 0.2 * N),
  rep("C", 0.1 * N), rep("D", 0.05 * N), rep("E", 0.25 * N))

type <- c(rep("train", 0.75 * N), rep("test", 0.25 * N))

n <- 30

Dataset <- data.frame(1:n, sample(class, n))

colnames(Dataset) <- c("y", "class")

# Split dataset into train (60%) and test (40%) subsets.

simulated <- split(p = 0.6, Dataset = Dataset, class = 2)

simulated

# Generate simulated dataset.

Dataset <- data.frame(1:n, sample(class, n), sample(type, n))

colnames(Dataset) <- c("y", "class", "type")

# Split dataset into train and test subsets.

simulated <- split(p = list(type = 3, train = "train", test = "test"),
  Dataset = Dataset, class = 2)

simulated