| data.frame_tidiers {broom} | R Documentation |
These methods compute tidy summaries of data.frame objects. tidy produces
summary statistics about each column, while glance reports the number
of rows and columns along with a summary of missing values. Note that
augment.data.frame is defined only to throw an error.
## S3 method for class 'data.frame'
tidy(x, ...)
## S3 method for class 'data.frame'
augment(x, data, ...)
## S3 method for class 'data.frame'
glance(x, ...)
x |
A data.frame |
... |
extra arguments: for |
data |
data, not used |
The tidy method calls the describe function from the psych package
directly to produce its per-column summary statistics.
tidy.data.frame produces a data frame with one
row per original column, containing summary statistics of each:
column |
name of original column |
n |
Number of valid (non-NA) values |
mean |
mean |
sd |
standard deviation |
median |
median |
trimmed |
trimmed mean, with trim defaulting to .1 |
mad |
median absolute deviation (from the median) |
min |
minimum value |
max |
maximum value |
range |
range |
skew |
skew |
kurtosis |
kurtosis |
se |
standard error |
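To make the column definitions above concrete, the following is a minimal base-R sketch that recomputes most of them by hand. It is an illustration only, not the code used by tidy or by psych: the helper names summarise_column and summarise_columns are hypothetical, range is assumed to mean max minus min, se is assumed to be sd / sqrt(n), and skew and kurtosis are left out because their exact values depend on which estimator is used.

```r
# Base-R sketch of the per-column summaries; not broom's or psych's own code.
summarise_column <- function(v, trim = 0.1) {
  v <- v[!is.na(v)]                  # keep only valid (non-NA) values
  n <- length(v)
  data.frame(
    n       = n,
    mean    = mean(v),
    sd      = sd(v),
    median  = median(v),
    trimmed = mean(v, trim = trim),  # trimmed mean, trim defaults to .1
    mad     = mad(v),                # base R rescales by 1.4826 by default
    min     = min(v),
    max     = max(v),
    range   = max(v) - min(v),       # assumption: range means max - min
    se      = sd(v) / sqrt(n)        # assumption: standard error of the mean
  )
}

# Hypothetical helper: one output row per numeric column of `df`.
summarise_columns <- function(df) {
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  stats <- do.call(rbind, lapply(numeric_cols, summarise_column))
  out <- data.frame(column = names(numeric_cols), stats)
  rownames(out) <- NULL
  out
}

res <- summarise_columns(mtcars)
res
```

Non-numeric columns are simply dropped in this sketch; how tidy treats them is not covered here.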
glance returns a one-row data.frame with the following columns:
nrow |
number of rows |
ncol |
number of columns |
complete.obs |
number of rows that have no missing values |
na.fraction |
fraction of values across all rows and columns that are missing |
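The four glance fields can be reproduced with a few base-R calls, which may help clarify what complete.obs and na.fraction count. This is a sketch under the definitions listed above rather than the package's own implementation; glance_sketch and the small example data frame are hypothetical names introduced for illustration.

```r
# Base-R sketch of the four glance fields; an illustration, not broom's source.
glance_sketch <- function(df) {
  data.frame(
    nrow         = nrow(df),
    ncol         = ncol(df),
    complete.obs = sum(complete.cases(df)),  # rows with no NA in any column
    na.fraction  = mean(is.na(df))           # missing cells / total cells
  )
}

# Small example: 3 rows, 2 columns, with 2 of the 6 cells missing.
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
g <- glance_sketch(df)
g
```

In this example only the first row is complete, so complete.obs is 1 and na.fraction is 2/6.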
td <- tidy(mtcars)
td
glance(mtcars)
library(ggplot2)
# compare mean and standard deviation
ggplot(td, aes(mean, sd)) +
  geom_point() +
  geom_text(aes(label = column), hjust = 1, vjust = 1) +
  scale_x_log10() +
  scale_y_log10() +
  geom_abline()