step_sample {recipes} | R Documentation |
step_sample creates a specification of a recipe step that will sample rows using dplyr::sample_n() or dplyr::sample_frac().
step_sample(recipe, ..., role = NA, trained = FALSE, size = NULL,
  replace = FALSE, skip = FALSE, id = rand_id("sample"))

## S3 method for class 'step_sample'
tidy(x, ...)
recipe | A recipe object. The step will be added to the sequence of operations for this recipe.
... | Argument ignored; included for consistency with other step specification functions. For the
role | Not used by this step since no new variables are created.
trained | A logical to indicate if the quantities for preprocessing have been estimated.
size | An integer or fraction. If the value is within (0, 1),
replace | Sample with or without replacement?
skip | A logical. Should the step be skipped when the recipe is baked by
id | A character string that is unique to this step to identify it.
x | A
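The effect of the skip argument is easiest to see by comparing row counts. The following is a minimal sketch, not taken from the package documentation: the object names rec_skip, n_train, and n_new are hypothetical, and the seed is there only to make the random draw repeatable.

```r
library(recipes)

set.seed(1)  # sampling is random; the seed only makes this run repeatable

# Keep 10 rows of the training set, but skip the step when baking other data
rec_skip <- recipe(~ ., data = iris) %>%
  step_sample(size = 10, skip = TRUE) %>%
  prep(training = iris, retain = TRUE)

n_train <- nrow(juice(rec_skip))       # step is applied while prepping
n_new   <- nrow(bake(rec_skip, iris))  # step is skipped, so no rows are dropped

n_train
n_new
```

With skip = FALSE (the default), the same sampling would also be applied to the data passed to bake().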
An updated version of recipe with the new step added to the sequence of existing steps (if any). For the tidy method, a tibble with columns size, replace, and id.
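As an illustration of the tidy method's return value, the sketch below preps a recipe with a single sampling step and inspects it. The id "sample_half" and the object names are assumptions chosen for this example; number = 1 selects the first (and only) step in the recipe.

```r
library(recipes)

# One sampling step that keeps half of the rows; the id is set by hand
# so that it is easy to recognise in the tidy() output
rec_half <- recipe(~ ., data = iris) %>%
  step_sample(size = 0.5, id = "sample_half") %>%
  prep(training = iris, retain = TRUE)

# Tidy the first step of the recipe
step_info <- tidy(rec_half, number = 1)
names(step_info)
step_info
```

Setting id explicitly, rather than relying on rand_id("sample"), can make the step easier to find when a recipe contains several steps.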
library(recipes)
library(dplyr)  # for slice()

# Uses `sample_n`
recipe( ~ ., data = iris) %>%
  step_sample(size = 1) %>%
  prep(training = iris, retain = TRUE) %>%
  juice() %>%
  nrow()

# Uses `sample_frac`
recipe( ~ ., data = iris) %>%
  step_sample(size = 0.9999) %>%
  prep(training = iris, retain = TRUE) %>%
  juice() %>%
  nrow()

# Uses `sample_n` and returns _at maximum_ 120 samples.
smaller_iris <-
  recipe( ~ ., data = iris) %>%
  step_sample() %>%
  prep(training = iris %>% slice(1:120), retain = TRUE)

juice(smaller_iris) %>% nrow()
bake(smaller_iris, iris %>% slice(121:150)) %>% nrow()
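The replace argument is not exercised in the examples above. The sketch below is one way it might be used to draw a bootstrap-style resample of the training set. The row_id column and the draw_bootstrap() helper are hypothetical names introduced only for this illustration, and the seed value is arbitrary.

```r
library(recipes)

# Hypothetical helper column so that repeated rows can be detected afterwards
iris_id <- iris
iris_id$row_id <- seq_len(nrow(iris_id))

# Hypothetical helper: draw as many rows as the data has, with replacement
draw_bootstrap <- function(seed) {
  set.seed(seed)  # fix the seed so the draw can be repeated
  recipe(~ ., data = iris_id) %>%
    step_sample(size = nrow(iris_id), replace = TRUE) %>%
    prep(training = iris_id, retain = TRUE) %>%
    juice()
}

boot_a <- draw_bootstrap(123)
boot_b <- draw_bootstrap(123)

nrow(boot_a)                             # same number of rows as the input
anyDuplicated(boot_a$row_id) > 0         # some rows were drawn more than once
identical(boot_a$row_id, boot_b$row_id)  # same seed, same draw
```

Because the rows are drawn at random, calling set.seed() before prep() is what makes two runs comparable.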