sdwd {sdwd} | R Documentation |
Fits the sparse distance weighted discrimination (DWD) model with L1, elastic-net, or adaptive elastic-net penalties. The solution path is computed over a grid of values of the tuning parameter lambda. This function is adapted from the glmnet and gcdnet packages.
sdwd(x, y, nlambda=100, lambda.factor=ifelse(nobs < nvars, 0.01, 1e-04), lambda=NULL, lambda2=0, pf=rep(1, nvars), pf2=rep(1, nvars), exclude, dfmax=nvars + 1, pmax=min(dfmax * 1.2, nvars), standardize=TRUE, eps=1e-8, maxit=1e6, strong=TRUE)
x |
A matrix with N rows and p columns for predictors. |
y |
A vector of length N for binary responses. The element of |
nlambda |
The number of |
lambda.factor |
The ratio of the smallest to the largest |
lambda |
An optional user supplied |
lambda2 |
The L2 tuning parameter lambda2. |
pf |
A vector of length p giving the L1 penalty weight applied to each coefficient of beta for the adaptive L1 or adaptive elastic net. |
pf2 |
A vector of length p giving the L2 penalty factors for the adaptive L1 or adaptive elastic net. To allow different L2 shrinkage, the user can set |
exclude |
Whether to exclude some predictors from the model. Excluding a predictor is equivalent to assigning it an infinite penalty factor. Default is none. |
dfmax |
Limits the number of predictors that can be included in the model. Default is p+1. This restriction is helpful when p is large, provided that a partial path is acceptable. |
pmax |
Limits the number of variables that are ever nonzero along the path; e.g., once some β enters the model, it counts once. The count does not change when that β exits or re-enters the model. Default is |
standardize |
Whether to standardize the data. If |
eps |
The algorithm stops when (i.e. 4*max(j)(beta_new[j]-beta_old[j])^2 is less than |
maxit |
Restricts how many outer-loop iterations are allowed. Default is 1e6. Consider increasing |
strong |
If |
The sdwd function minimizes the sparse penalized DWD loss function,
L(y, X, beta)/N + lambda1 * ||beta||_1 + 0.5 * lambda2 * ||beta||_2^2,
where L(u) = 1 - u if u <= 0.5, and L(u) = 1 / (4*u) if u > 0.5, is the DWD loss. The value of lambda2 is user-specified.
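As a rough illustration of the objective above, the following base R sketch evaluates the DWD loss and the penalized objective on a toy data set. The helper names `dwd_loss` and `sdwd_objective` are hypothetical and are not part of the sdwd package. The sketch assumes that the loss is applied to the margins y * (b0 + x %*% beta) and averaged over observations, that responses are coded as -1 and 1, and that all penalty weights equal 1.

```r
# DWD loss as defined in the text: 1 - u for u <= 0.5, 1/(4u) otherwise.
dwd_loss <- function(u) {
  ifelse(u <= 0.5, 1 - u, 1 / (4 * u))
}

# Hypothetical helper (not exported by sdwd): penalized objective, assuming
# the loss is applied to the margins y * (b0 + x %*% beta).
sdwd_objective <- function(x, y, b0, beta, lambda1, lambda2 = 0) {
  margin <- y * (b0 + drop(x %*% beta))
  mean(dwd_loss(margin)) +
    lambda1 * sum(abs(beta)) +
    0.5 * lambda2 * sum(beta^2)
}

# Toy data: four observations, two predictors.
# Labels coded as -1 / 1 (an assumption of this sketch).
x <- matrix(c( 1,  0,
               0,  1,
              -1,  0,
               0, -1), ncol = 2, byrow = TRUE)
y <- c(1, 1, -1, -1)

obj <- sdwd_objective(x, y, b0 = 0, beta = c(1, 1),
                      lambda1 = 0.1, lambda2 = 1)
print(obj)
```

The two branches of the loss meet at u = 0.5, where both give 0.5, so the loss is continuous there.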
To use the L1 penalty (lasso), set lambda2=0. To use the elastic net, set lambda2 to a nonzero value. To use the adaptive L1, set lambda2=0 and specify pf and pf2. To use the adaptive elastic net, set lambda2 to a nonzero value and specify pf and pf2 as well.
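To make the four settings concrete, the base R sketch below computes the penalty term under each of them for a fixed coefficient vector. The function `weighted_penalty` is a hypothetical helper, not part of the sdwd package. It assumes that pf multiplies the L1 term and pf2 multiplies the L2 term coefficient by coefficient, which this page does not spell out.

```r
# Hypothetical helper (not part of sdwd): weighted elastic-net penalty,
# assuming pf scales the L1 term and pf2 scales the L2 term per coefficient.
weighted_penalty <- function(beta, lambda1, lambda2 = 0,
                             pf = rep(1, length(beta)),
                             pf2 = rep(1, length(beta))) {
  lambda1 * sum(pf * abs(beta)) + 0.5 * lambda2 * sum(pf2 * beta^2)
}

beta <- c(1, -2, 0)

# Lasso: lambda2 = 0, unit weights.
pen_lasso <- weighted_penalty(beta, lambda1 = 0.1)

# Elastic net: nonzero lambda2, unit weights.
pen_enet <- weighted_penalty(beta, lambda1 = 0.1, lambda2 = 1)

# Adaptive L1: lambda2 = 0, coefficient-specific L1 weights.
pen_alasso <- weighted_penalty(beta, lambda1 = 0.1, pf = c(1, 0.5, 2))

# Adaptive elastic net: nonzero lambda2 with both weight vectors.
pen_aenet <- weighted_penalty(beta, lambda1 = 0.1, lambda2 = 1,
                              pf = c(1, 0.5, 2), pf2 = c(2, 1, 1))

print(c(lasso = pen_lasso, enet = pen_enet,
        adaptive_l1 = pen_alasso, adaptive_enet = pen_aenet))
```

With unit weights the adaptive forms reduce to the plain lasso and elastic net, which matches the defaults pf=rep(1, nvars) and pf2=rep(1, nvars) in the usage above.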
When the algorithm does not converge or runs slowly, consider increasing eps, decreasing nlambda, or increasing lambda.factor before increasing maxit.
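The effect of nlambda and lambda.factor on the amount of work can be seen by building a lambda grid by hand. The sketch below is only illustrative. `make_lambda_grid` and `lambda_max` are hypothetical names, and the log-spaced construction is an assumption borrowed from glmnet-style solution paths rather than something this page confirms for sdwd.

```r
# Hypothetical helper: log-spaced grid from lambda_max down to
# lambda_max * lambda.factor (an assumed, glmnet-style construction).
make_lambda_grid <- function(lambda_max, nlambda = 100, lambda.factor = 1e-4) {
  exp(seq(log(lambda_max), log(lambda_max * lambda.factor),
          length.out = nlambda))
}

# Default-like grid: 100 values spanning four orders of magnitude.
full_grid <- make_lambda_grid(lambda_max = 1)

# Fewer values and a larger lambda.factor: the path is shorter and stops
# at a larger smallest lambda.
short_grid <- make_lambda_grid(lambda_max = 1, nlambda = 20,
                               lambda.factor = 0.01)

print(range(full_grid))
print(range(short_grid))
```

Under these assumptions, the shorter grid has fewer problems to solve and avoids the smallest lambda values, where fits are typically least sparse.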
An object with S3 class sdwd.
b0 |
A vector of length |
beta |
A matrix of dimension |
df |
The number of nonzero coefficients at each |
dim |
The dimension of coefficient matrix, i.e., |
lambda |
The |
npasses |
Total number of iterations for all lambda values. |
jerr |
Warnings and errors; 0 if no error. |
call |
The call which produced this object. |
Boxiang Wang and Hui Zou
Maintainer: Boxiang Wang boxiang@umn.edu
Wang, B. and Zou, H. (2015)
"Sparse Distance Weighted Discrimination," Journal of Computational and Graphical Statistics, forthcoming.
http://arxiv.org/abs/1501.06066
Friedman, J., Hastie, T., and Tibshirani, R. (2010), "Regularization paths for generalized
linear models via coordinate descent," Journal of Statistical Software, 33(1), 1–22
http://www.jstatsoft.org/v33/i01/paper
Marron, J.S., Todd, M.J., Ahn, J. (2007)
"Distance-Weighted Discrimination,"
Journal of the American Statistical Association, 102(480), 1267–1271
https://faculty.franklin.uga.edu/jyahn/sites/faculty.franklin.uga.edu.jyahn/files/DWD3.pdf
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J., and Tibshirani, Ryan (2012)
"Strong Rules for Discarding Predictors in Lasso-type Problems,"
Journal of the Royal Statistical Society, Series B, 74(2), 245–266
http://statweb.stanford.edu/~tibs/ftp/strong.pdf
Yang, Y. and Zou, H. (2013)
"An Efficient Algorithm for Computing the HHSVM and Its Generalizations,"
Journal of Computational and Graphical Statistics, 22(2), 396–415
http://users.stat.umn.edu/~yiyang/resources/papers/JCGS_gcdnet.pdf
print.sdwd, predict.sdwd, coef.sdwd, plot.sdwd, and cv.sdwd.
# load the data
data(colon)
# fit the elastic-net penalized DWD with lambda2=1
fit = sdwd(colon$x, colon$y, lambda2=1)
print(fit)
# coefficients at some lambda value
c1 = coef(fit, s=0.005)
# make predictions
predict(fit, newx=colon$x[1:10, ], s=c(0.01, 0.005))