fmrs.varsel {fmrs} | R Documentation
Provides variable selection and penalized MLE for
Finite Mixture of Accelerated Failure Time Regression (FMAFTR) models
and Finite Mixture of Regression (FMR) models.
It also provides ridge regression and elastic net.
fmrs.varsel(y, delta, x, nComp, ...)

## S4 method for signature 'ANY'
fmrs.varsel(y, delta, x, nComp, disFamily = "lnorm", initCoeff,
  initDispersion, initmixProp, penFamily = "lasso", lambPen,
  lambRidge = 0, nIterEM = 2000, nIterNR = 2, conveps = 1e-08,
  convepsEM = 1e-08, convepsNR = 1e-08, porNR = 2, gamMixPor = 1,
  activeset, lambMCP, lambSICA)
y: Responses (observations).

delta: Censoring indicators.

x: Design matrix (covariates).

nComp: Order (number of components) of the mixture model.

...: Other possible options.

disFamily: A sub-distribution family. The options are "norm", "lnorm", and "weibull".

initCoeff: Vector of initial values for the regression coefficients, including intercepts.

initDispersion: Vector of initial values for the standard deviations.

initmixProp: Vector of initial values for the mixing proportions of the components.

penFamily: Name of the penalty used in the variable selection method. The options include "lasso" (the default), "adplasso", "mcp", and "sica".

lambPen: A vector of positive tuning parameters, one per component.

lambRidge: A positive tuning parameter for ridge regression or elastic net.

nIterEM: Maximum number of iterations for the EM algorithm.

nIterNR: Maximum number of iterations for the Newton-Raphson algorithm.

conveps: A small positive value used to avoid NaN when computing divisions.

convepsEM: A positive convergence threshold for the EM algorithm.

convepsNR: A positive convergence threshold for the Newton-Raphson algorithm.

porNR: A positive integer for the maximum number of searches in the Newton-Raphson algorithm.

gamMixPor: Proportion of the mixing parameters in the penalty. The value must be in the interval [0,1].

activeset: A zero-one matrix indicating which intercepts and covariates are active in the fitted fmrs model.

lambMCP: A positive tuning parameter for the MCP penalty.

lambSICA: A positive tuning parameter for the SICA penalty.
The penalized likelihood of a finite mixture of AFT regression models is written as

\tilde\ell_{n}(\boldsymbol\Psi) = \ell_{n}(\boldsymbol\Psi) - \mathbf{p}_{\boldsymbol\lambda_{n}}(\boldsymbol\Psi)

where

\mathbf{p}_{\boldsymbol\lambda_{n}}(\boldsymbol\Psi) = \sum_{k=1}^{K}\pi_{k}^{\alpha}\left\{\sum_{j=1}^{d}p_{\lambda_{n,k}}(\beta_{kj})\right\}.
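For concreteness, the mixture penalty above can be evaluated numerically. The sketch below is plain Python, independent of the package, with made-up coefficient values and the LASSO penalty p_λ(β) = λ|β| standing in for p_{λ_{n,k}}:

```python
# Mixture penalty p_lambda(Psi) = sum_k pi_k^alpha * sum_j p_lambda(beta_kj),
# illustrated with the LASSO penalty p_lambda(beta) = lam * |beta|.
# All numbers here are hypothetical, for illustration only.

def lasso(beta, lam):
    return lam * abs(beta)

def mixture_penalty(betas, mix_props, lambdas, alpha=1.0):
    """betas: one coefficient vector per component (K vectors of length d)."""
    total = 0.0
    for beta_k, pi_k, lam_k in zip(betas, mix_props, lambdas):
        total += pi_k ** alpha * sum(lasso(b, lam_k) for b in beta_k)
    return total

betas = [[2.0, -1.0, 0.0], [-1.0, 0.0, 2.0]]   # K = 2 components, d = 3
mix_props = [0.4, 0.6]                          # pi_k
lambdas = [0.5, 0.5]                            # lambda_{n,k}

print(mixture_penalty(betas, mix_props, lambdas))  # 0.4*1.5 + 0.6*1.5 = 1.5
```

Setting alpha = 0 (cf. the gamMixPor argument) removes the mixing proportions from the penalty, leaving the plain sum of per-component penalties.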
In the M step of the EM algorithm, the function

\tilde{Q}(\boldsymbol\Psi,\boldsymbol\Psi^{(m)}) = \sum_{k=1}^{K}\tilde{Q}_{k}(\boldsymbol\Psi_{k},\boldsymbol\Psi^{(m)}_{k}) = \sum_{k=1}^{K}\left[Q_{k}(\boldsymbol\Psi_{k},\boldsymbol\Psi^{(m)}_{k}) - \pi_{k}^{\alpha}\left\{\sum_{j=1}^{d}p_{\lambda_{n,k}}(\beta_{kj})\right\}\right]

is maximized. Since the penalty function is singular at the origin, a local quadratic approximation (LQA) of the penalty is used:

\mathbf{p}^{\ast}_{k,\boldsymbol\lambda_{n}}(\boldsymbol\beta,\boldsymbol\beta^{(m)}) = (\pi_{k}^{(m)})^{\alpha}\sum_{j=1}^{d}\left\{p_{\lambda_{n,k}}(\beta_{kj}^{(m)}) + \frac{p^{\prime}_{\lambda_{n,k}}(\beta_{kj}^{(m)})}{2\beta_{kj}^{(m)}}\left(\beta_{kj}^{2} - {\beta_{kj}^{(m)}}^{2}\right)\right\}.

Maximizing \tilde{Q} is therefore equivalent to maximizing

Q^{\ast}(\boldsymbol\Psi,\boldsymbol\Psi^{(m)}) = \sum_{k=1}^{K}Q^{\ast}_{k}(\boldsymbol\Psi_{k},\boldsymbol\Psi^{(m)}_{k}) = \sum_{k=1}^{K}\left[Q_{k}(\boldsymbol\Psi_{k},\boldsymbol\Psi^{(m)}_{k}) - \mathbf{p}^{\ast}_{k,\boldsymbol\lambda_{n}}(\boldsymbol\beta,\boldsymbol\beta^{(m)})\right].
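The defining property of the LQA is that it agrees with the penalty and its first derivative at the current estimate β^{(m)} while being smooth in β. A minimal numeric check (plain Python, independent of the package; LASSO penalty p(β) = λ|β| used for concreteness, with made-up values):

```python
# LQA of a penalty singular at 0, illustrated with p(beta) = lam * |beta|.
# Around beta0 != 0 the LQA is
#   q(beta) = p(beta0) + p'(beta0) / (2*beta0) * (beta^2 - beta0^2),
# which matches p and p' at beta0 and is quadratic (hence smooth) in beta.

def p(beta, lam=0.5):
    return lam * abs(beta)

def p_prime(beta, lam=0.5):
    return lam if beta > 0 else -lam  # derivative of lam*|beta| away from 0

def lqa(beta, beta0, lam=0.5):
    return p(beta0, lam) + p_prime(beta0, lam) / (2 * beta0) * (beta**2 - beta0**2)

beta0 = 1.2
print(lqa(beta0, beta0))   # exact match with p(beta0) = 0.6 at the expansion point

# Derivative match via central finite difference (exact for a quadratic):
h = 1e-6
d_lqa = (lqa(beta0 + h, beta0) - lqa(beta0 - h, beta0)) / (2 * h)
print(round(d_lqa, 4))     # equals p'(beta0) = 0.5
```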
In the case of log-normal sub-distributions, the maximizers of the Q_k functions are as follows. Given the data and the current estimates of the parameters, the maximizers are

{\boldsymbol\beta}^{(m+1)}_{k} = \left({\boldsymbol z}^{\prime}\boldsymbol\tau^{(m)}_{k}{\boldsymbol z} + \varpi_{k}(\boldsymbol\beta_{kj}^{(m)})\right)^{-1}{\boldsymbol z}^{\prime}\boldsymbol\tau^{(m)}_{k}T^{(m)}_{k},

where \varpi_{k}(\boldsymbol\beta_{kj}^{(m)}) = \mathrm{diag}\left(\left(\pi_{k}^{(m+1)}\right)^{\alpha}\frac{p^{\prime}_{\lambda_{n},k}(\boldsymbol\beta_{kj}^{(m)})}{\boldsymbol\beta_{kj}^{(m)}}\right), and \sigma_{k}^{(m+1)} is equal to

\sigma_{k}^{(m+1)} = \sqrt{\frac{\sum_{i=1}^{n}\tau^{(m)}_{ik}\left(t^{(m)}_{ik} - {\boldsymbol z}_{i}\boldsymbol\beta^{(m)}_{k}\right)^{2}}{\sum_{i=1}^{n}\tau^{(m)}_{ik}\left[\delta_{i} + (1-\delta_{i})\left\{A(w^{(m)}_{ik})\left[A(w^{(m)}_{ik}) - w^{(m)}_{ik}\right]\right\}\right]}}.
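The β update above is a ridge-type penalized weighted least squares solve, with the LQA weights \varpi_k on the diagonal. A minimal numpy sketch (not the package's code; random made-up data, LASSO weights p'(β)/β = λ/|β|):

```python
import numpy as np

# beta^(m+1) = (Z' tau Z + varpi)^{-1} Z' tau T, where varpi is diagonal with
# entries pi_k^alpha * p'(beta_j^(m)) / beta_j^(m). Illustrative numbers only.
rng = np.random.default_rng(0)
n, d = 50, 3
Z = rng.normal(size=(n, d))
T = Z @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
tau = np.diag(rng.uniform(0.2, 1.0, size=n))   # E-step weights tau_ik

lam, pi_k, alpha = 0.5, 0.4, 1.0
beta_m = np.array([0.8, -1.5, 0.3])            # current estimate beta^(m)
# For the LASSO penalty, p'(beta)/beta = lam * sign(beta) / beta = lam / |beta|
varpi = np.diag(pi_k**alpha * lam / np.abs(beta_m))

beta_next = np.linalg.solve(Z.T @ tau @ Z + varpi, Z.T @ tau @ T)
print(beta_next.shape)   # (3,)
```

When p'(β)/β is constant in β, the same solve reduces to ordinary ridge regression, which is why the lambRidge option fits into the same update.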
For the Weibull distribution, on the other hand, we have

\tilde{\boldsymbol\Psi}^{(m+1)}_{k} = \tilde{\boldsymbol\Psi}^{(m)}_{k} - 0.5^{\kappa}\left[H_{k}^{p,(m)}\right]^{-1}I_{k}^{p,(m)},

where H^{p}_{k} = H_{k} + h(\boldsymbol\Psi_{k}) is the penalized version of the Hessian matrix and I^{p}_{k} = I_{k} + h(\boldsymbol\Psi_{k})\boldsymbol\Psi_{k} is the penalized version of the vector of first derivatives, both evaluated at \tilde{\boldsymbol\Psi}^{(m)}_{k}.
Farhad Shokoohi <shokoohi@icloud.com>
Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016, submitted). Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies.
Other lnorm, norm, weibull: fmrs.gendata, fmrs.mle, fmrs.tunsel
set.seed(1980)
nComp <- 2
nCov <- 10
nObs <- 500
dispersion <- c(1, 1)
mixProp <- c(0.4, 0.6)
rho <- 0.5
coeff1 <- c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 <- c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
umax <- 40

dat <- fmrs.gendata(nObs = nObs, nComp = nComp, nCov = nCov,
                    coeff = c(coeff1, coeff2), dispersion = dispersion,
                    mixProp = mixProp, rho = rho, umax = umax,
                    disFamily = "lnorm")

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta,
                    nComp = nComp, disFamily = "lnorm",
                    initCoeff = rnorm(nComp * nCov + nComp),
                    initDispersion = rep(1, nComp),
                    initmixProp = rep(1/nComp, nComp))

res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta,
                       nComp = ncomp(res.mle), disFamily = "lnorm",
                       initCoeff = c(coefficients(res.mle)),
                       initDispersion = dispersion(res.mle),
                       initmixProp = mixProp(res.mle),
                       penFamily = "adplasso")

res.var <- fmrs.varsel(y = dat$y, x = dat$x, delta = dat$delta,
                       nComp = ncomp(res.mle), disFamily = "lnorm",
                       initCoeff = c(coefficients(res.mle)),
                       initDispersion = dispersion(res.mle),
                       initmixProp = mixProp(res.mle),
                       penFamily = "adplasso",
                       lambPen = slot(res.lam, "lambPen"))

coefficients(res.var)[-1, ]
round(coefficients(res.var)[-1, ], 5)