boscoclust {ordinalClust}    R Documentation

Function to perform a co-clustering

Description

This function performs co-clustering on ordinal data using the latent block model (see the references for further details). A BOS distribution is used, and parameter inference is carried out with an SEM-Gibbs algorithm.

Usage

boscoclust(x=matrix(0,nrow=1,ncol=1), idx_list=c(1), kr, kc, init, nbSEM, nbSEMburn, 
          nbRepeat=1, nbindmini, m=0, percentRandomB=0)

Arguments

x

Matrix of ordinal data, of dimension N*Jtot. Features with the same number of levels must be placed side by side. Missing values should be coded as NA.

idx_list

Vector of length D. This argument is useful when variables have different numbers of levels. Element d should indicate where the variables with number of levels m[d] begin in matrix x.

kr

Number of row classes.

kc

Vector of length D. The d^th element indicates the number of column clusters for the d^th group of variables. Set to 0 to choose a classical multivariate BOS model.

m

Vector of length D. The d^th element defines the number of levels of the ordinal data in the d^th group of variables.

nbSEM

Number of SEM-Gibbs iterations performed to estimate the parameters.

nbSEMburn

Number of SEM-Gibbs burn-in iterations for estimating the parameters. This parameter must be less than nbSEM.

nbRepeat

Number of times the sampling on rows and on columns is done at each SEM-Gibbs iteration.

nbindmini

Minimum number of cells belonging to a block.

init

String that indicates the kind of initialisation. Must be one of the following: "kmeans", "random" or "randomBurnin".

percentRandomB

Vector of length 2. Indicates the percentage of resampling when init is equal to "randomBurnin".
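
When the columns of x do not all share the same number of levels, idx_list and m have to describe the same D groups of columns. The following is a minimal sketch in base R of how the two vectors could be derived; levels_per_col is a hypothetical input, and the 1-based start index is an assumption suggested by the default idx_list=c(1).

```r
# Hypothetical number of levels for each column of x, already ordered so that
# columns sharing the same number of levels sit side by side.
levels_per_col <- c(3, 3, 3, 5, 5, 7)

# One entry per group of variables (D = 3 groups here).
m <- unique(levels_per_col)

# 1-based index of the first column of each group (assumed convention).
idx_list <- match(m, levels_per_col)

# Sanity check: fails if columns of one group are not contiguous.
stopifnot(!is.unsorted(match(levels_per_col, m)))
```

With this layout, kc would also need one entry per group, for example a vector of length 3.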

Value

@V

Matrix of dimension N*kr such that V[i,g]=1 if i belongs to cluster g.

@icl

ICL value for co-clustering.

@name
@paramschain

List of length nbSEMburn. For each iteration of the SEM-Gibbs algorithm, the parameters of the blocks are stored.

@pichain

List of length nbSEM. Item i is a vector of length kr which contains the row mixing proportions at iteration i.

@rhochain

List of length nbSEM. Item i is a list of length D whose d^th item contains the column mixing proportions of the d^th group of variables at iteration i.

@zc

List of length D. The d^th item is a vector of length J[d] representing the column partition for the d^th group of variables.

@zr

Vector of length N with resulting row partitions.

@W

List of length D. Item d is a matrix of dimension J[d]*kc[d] such that W[j,h]=1 if column j belongs to cluster h.

@m

Vector of length D. The d^th element represents the number of levels of the d^th group of variables.

@params

List of length D. The d^th item represents the block parameters for the d^th group of variables.

@pi

Vector of length kr. Row mixing proportions.

@rho

List of length D. The d^th item represents the column mixing proportions for the d^th group of variables.

@xhat

List of length D. The d^th item is the dataset of the d^th group of variables, with missing values completed.

@zrchain

Matrix of dimension nbSEM*N. Row i represents the row cluster partitions at iteration i.

@zcchain

List of length D. Item d is a matrix of dimension nbSEM*J[d]. Row i represents the column cluster partitions at iteration i.
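
The partition vectors and the indicator matrices carry the same information: V[i,g]=1 exactly when zr[i] equals g. The sketch below shows this relationship in base R on a hypothetical partition; it assumes cluster labels run from 1 to kr and does not call the package.

```r
# Hypothetical row partition for N = 5 rows and kr = 3 row clusters,
# with labels assumed to run from 1 to kr.
zr <- c(1, 3, 2, 3, 1)
kr <- 3

# Build the N x kr indicator matrix: V[i, g] = 1 if row i is in cluster g.
V <- matrix(0, nrow = length(zr), ncol = kr)
V[cbind(seq_along(zr), zr)] <- 1

# Each row belongs to exactly one cluster.
stopifnot(all(rowSums(V) == 1))

# The partition can be recovered as the column holding the 1 in each row.
zr_back <- max.col(V)
```

The same construction applies to each group of columns, with zc[[d]] and kc[d] in place of zr and kr.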

Author(s)

Margot Selosse, Julien Jacques, Christophe Biernacki.

Examples

  
  
    
  library(ordinalClust)

  # loading the real dataset
  data("dataqol")
  set.seed(5)

  # loading the ordinal data
  M <- as.matrix(dataqol[,2:29])


  # defining the number of categories (a single value here,
  # since all selected columns share the same number of levels):
  m=4


  # defining number of row and column clusters
  krow = 5
  kcol = 4

  # configuration for the inference
  nbSEM=50
  nbSEMburn=40
  nbindmini=2
  init = "kmeans"


  # Co-clustering execution
  object <- boscoclust(x=M,kr=krow,kc=kcol,m=m,nbSEM=nbSEM,
            nbSEMburn=nbSEMburn, nbindmini=nbindmini, init=init)
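
Once a partition is available, it can be useful to check how many cells each block contains, for instance to compare against nbindmini. The helper below is a hypothetical sketch in base R, not part of ordinalClust; it only assumes that the row and column partitions are plain label vectors such as those described for @zr and @zc.

```r
# Hypothetical helper: number of cells in each (row cluster, column cluster)
# block, computed as (rows in the row cluster) x (columns in the column cluster).
block_sizes <- function(zr, zc) {
  n_rows <- table(zr)
  n_cols <- table(zc)
  sizes <- as.vector(n_rows) %o% as.vector(n_cols)
  dimnames(sizes) <- list(row_cluster = names(n_rows),
                          col_cluster = names(n_cols))
  sizes
}

# Toy partitions standing in for a fitted result: 5 rows, 4 columns.
toy_zr <- c(1, 1, 2, 2, 2)
toy_zc <- c(1, 2, 2, 2)
sizes <- block_sizes(toy_zr, toy_zc)

# With a fitted object and a single group of variables, the call would be:
# block_sizes(object@zr, object@zc[[1]])
```

Clusters that ended up empty do not appear in the table, so the result can have fewer rows or columns than kr or kc.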

  
  

[Package ordinalClust version 1.3.4 Index]