RCLRMIX-class {rebmix} | R Documentation
"RCLRMIX"
Object of class RCLRMIX.
Objects can be created by calls of the form new("RCLRMIX", ...).
Accessor methods for the slots are a.pos(x = NULL), a.Zt(x = NULL), a.Zp(x = NULL, s = expression(c)), a.c(x = NULL), a.p(x = NULL, s = expression(c)), a.pi(x = NULL, s = expression(c)), a.P(x = NULL, s = expression(c)), a.tau(x = NULL, s = expression(c)), a.prob(x = NULL), a.from(x = NULL), a.to(x = NULL), a.EN(x = NULL) and a.ED(x = NULL), where x stands for an object of class RCLRMIX and s a desired number of clusters for which the slot is calculated.
x: an object of class REBMIX.
pos: a desired row number in x@summary to be clustered. The default value is 1.
Zt: a factor of true cluster membership.
Zp: a factor of predictive cluster membership.
c: number of clusters.
p: a vector of length c containing prior probabilities of cluster memberships p_{l} summing to 1. The value is returned only if all variables in slot x follow either binomial or Dirac parametric families. The default value is numeric().
pi: a list of length d of matrices of size c \times K_{i} containing cluster conditional probabilities π_{ilk}. Let π_{ilk} denote the cluster conditional probability that an observation in cluster l = 1, …, c produces the kth outcome on the ith variable. Suppose we observe i = 1, …, d polytomous categorical variables (the manifest variables), each of which contains K_{i} possible outcomes for observations j = 1, …, n. A manifest variable is a variable that can be measured or observed directly. It must be coded as a whole number starting at zero for the first outcome and increasing to the possible number of outcomes minus one. It is presumed here that all variables are statistically independent within clusters and that \bm{y}_{1}, …, \bm{y}_{n} stands for an observed d-dimensional dataset of size n of vector observations \bm{y}_{j} = (y_{1j}, …, y_{ij}, …, y_{dj})^\top. The value is returned only if all variables in slot x follow either binomial or Dirac parametric families. The default value is list().
P: a data frame containing true N_{\mathrm{t}}(\bm{y}_{\tilde{\jmath}}) and predictive N_{\mathrm{p}}(\bm{y}_{\tilde{\jmath}}) frequencies calculated for unique \bm{y}_{\tilde{\jmath}} \in \{ \bm{y}_{1}, …, \bm{y}_{n} \}, where \tilde{\jmath} = 1, …, \tilde{n} and \tilde{n} ≤ n.
tau: a matrix of size n \times c containing conditional probabilities τ_{jl} that observations \bm{y}_{1}, …, \bm{y}_{n} arise from clusters 1, …, c.
prob: a vector of length c containing probabilities of correct clustering for s = 1, …, c.
from: a vector of length c - 1 containing clusters merged to the clusters in slot to.
to: a vector of length c - 1 containing clusters originating from the clusters in slot from.
EN: a vector of length c - 1 containing entropies for combined clusters.
ED: a vector of length c - 1 containing decrease of entropies for combined clusters.
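The relation between the slots tau, Zp, EN and ED can be illustrated with a small base R sketch. It follows the entropy criterion of Baudry et al. (2010) as commonly stated and is not taken from the rebmix sources, so the package's own computation may differ in detail. The matrix tau below is made up for illustration, and entropy, pairs, EN.merged and best are hypothetical names. Only a single merging step is shown; repeating it on the merged matrix would give c - 1 steps in total.

```r
# Hypothetical conditional probabilities tau for n = 6 observations
# and 3 clusters (each row sums to 1).
tau <- matrix(c(0.90, 0.05, 0.05,
                0.80, 0.15, 0.05,
                0.10, 0.60, 0.30,
                0.05, 0.50, 0.45,
                0.05, 0.40, 0.55,
                0.02, 0.08, 0.90),
              ncol = 3, byrow = TRUE)

# Entropy of a soft partition; 0 * log(0) is treated as 0.
entropy <- function(tau) {
  -sum(ifelse(tau > 0, tau * log(tau), 0))
}

# Hard assignment: each observation goes to its most probable cluster.
Zp <- factor(max.col(tau, ties.method = "first"))

# Every unordered pair of clusters, one pair per row.
pairs <- t(combn(ncol(tau), 2))

# Entropy after merging each pair; merging adds the two columns of tau.
EN.merged <- apply(pairs, 1, function(ij) {
  merged <- cbind(tau[, -ij, drop = FALSE], tau[, ij[1]] + tau[, ij[2]])
  entropy(merged)
})

# Decrease of entropy for each candidate merge.
ED <- entropy(tau) - EN.merged

# The pair whose merge decreases the entropy the most.
best <- pairs[which.max(ED), ]
```

With this made-up tau, the two overlapping clusters (the second and the third) are the pair selected in best.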
Marko Nagode
J. P. Baudry, A. E. Raftery, G. Celeux, K. Lo and R. Gottardo. Combining mixture components for clustering. Journal of Computational and Graphical Statistics, 19(2):332-353, 2010. https://doi.org/10.1198/jcgs.2010.08111
devAskNewPage(ask = TRUE)

# Generate normal dataset.

n <- c(500, 200, 400)

Theta <- new("RNGMVNORM.Theta", c = 3, d = 2)

a.theta1(Theta, 1) <- c(3, 10)
a.theta1(Theta, 2) <- c(8, 6)
a.theta1(Theta, 3) <- c(12, 11)
a.theta2(Theta, 1) <- c(3, 0.3, 0.3, 2)
a.theta2(Theta, 2) <- c(5.7, -2.3, -2.3, 3.5)
a.theta2(Theta, 3) <- c(2, 1, 1, 2)

normal <- RNGMIX(model = "RNGMVNORM", Dataset.name = "normal_1", n = n, Theta = a.Theta(Theta))

# Estimate number of components, component weights and component parameters.

normalest <- REBMIX(model = "REBMVNORM",
  Dataset = a.Dataset(normal),
  Preprocessing = "histogram",
  cmax = 6,
  Criterion = "BIC",
  pdf = rep("normal", 2))

summary(normalest)

# Plot finite mixture.

plot(normalest)

# Cluster dataset.

normalclu <- RCLRMIX(model = "RCLRMVNORM", x = normalest, Zt = a.Zt(normal))

# Plot clusters.

plot(normalclu)

summary(normalclu)
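The accessor methods listed on this page could be applied to a clustered object along the following lines. This is a sketch only: it repeats the constructor calls from the example above without the plotting, assumes that the rebmix package is installed, and uses accessor signatures exactly as they are given on this page. The variable s and the choice of one cluster fewer than estimated are illustrative assumptions.

```r
library(rebmix)

# Same simulated dataset and fit as in the example above.
n <- c(500, 200, 400)

Theta <- new("RNGMVNORM.Theta", c = 3, d = 2)

a.theta1(Theta, 1) <- c(3, 10)
a.theta1(Theta, 2) <- c(8, 6)
a.theta1(Theta, 3) <- c(12, 11)
a.theta2(Theta, 1) <- c(3, 0.3, 0.3, 2)
a.theta2(Theta, 2) <- c(5.7, -2.3, -2.3, 3.5)
a.theta2(Theta, 3) <- c(2, 1, 1, 2)

normal <- RNGMIX(model = "RNGMVNORM", Dataset.name = "normal_1",
  n = n, Theta = a.Theta(Theta))

normalest <- REBMIX(model = "REBMVNORM", Dataset = a.Dataset(normal),
  Preprocessing = "histogram", cmax = 6, Criterion = "BIC",
  pdf = rep("normal", 2))

normalclu <- RCLRMIX(model = "RCLRMVNORM", x = normalest, Zt = a.Zt(normal))

# Number of clusters and probabilities of correct clustering.
a.c(normalclu)
a.prob(normalclu)

# Merging sequence with the corresponding entropies and entropy decreases.
a.from(normalclu)
a.to(normalclu)
a.EN(normalclu)
a.ED(normalclu)

# Predictive cluster membership for one cluster fewer than estimated
# (illustrative choice; at least one cluster is kept).
s <- max(1, a.c(normalclu) - 1)

Zp <- a.Zp(normalclu, s = s)

table(Zp)
```

Omitting s leaves it at its default, expression(c), which is taken here to mean the number of clusters stored in slot c.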