bosclust {ordinalClust}    R Documentation

Function to perform a clustering

Description

This function performs a clustering of ordinal data using the multiple latent block model (cf. references for further details). It allows the user to define D groups of variables that have different numbers of levels. A BOS distribution is used, and parameter inference is carried out with an SEM-Gibbs algorithm.

Usage

bosclust(x, idx_list=c(1), kr, init, nbSEM, nbSEMburn, 
        nbindmini, m=0, percentRandomB=0)

Arguments

x

Matrix of ordinal data, of dimension N*Jtot. Features with the same number of levels must be placed side by side. Missing values should be coded as NA.

idx_list

Vector of length D. This argument is useful when variables have different numbers of levels. Element d should indicate the column of matrix x at which the variables with m[d] levels begin.

kr

Number of row clusters.

m

Vector of length D. The d^th element defines the number of levels of the d^th group of variables.

nbSEM

Number of SEM-Gibbs iterations realized to estimate parameters.

nbSEMburn

Number of SEM-Gibbs burn-in iterations for estimating parameters. This parameter must be less than nbSEM.

nbindmini

Minimum number of cells belonging to a block.

init

String that indicates the kind of initialisation. Must be one of the following words: "kmeans", "random" or "randomBurnin".

percentRandomB

Vector of length 1. Indicates the percentage of resampling when init is equal to "randomBurnin".
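
As an illustration of how x, idx_list and m fit together, the base R sketch below reorders a toy matrix so that variables with the same number of levels sit side by side, then derives the two vectors. The toy data, the column names and the helper variable levels_per_col are hypothetical. The sketch also assumes that the number of levels of a variable equals its largest observed value, which only holds when the top category appears at least once.

```r
# Toy data: 3 variables with 3 levels and 2 variables with 5 levels,
# deliberately interleaved (hypothetical example, not from the package).
set.seed(1)
N <- 20
x <- cbind(a = sample(1:3, N, replace = TRUE),
           b = sample(1:5, N, replace = TRUE),
           c = sample(1:3, N, replace = TRUE),
           d = sample(1:5, N, replace = TRUE),
           e = sample(1:3, N, replace = TRUE))
x[1, ] <- c(3, 5, 3, 5, 3)  # make sure the top level of each variable is observed
x[2, "b"] <- NA             # missing values are coded as NA

# Assumption: number of levels = largest observed value in the column.
levels_per_col <- apply(x, 2, max, na.rm = TRUE)

# Place variables with the same number of levels side by side.
ord <- order(levels_per_col)
x <- x[, ord]
levels_per_col <- unname(levels_per_col[ord])

# m: one entry per group; idx_list: first column of each group in x.
m <- unique(levels_per_col)
idx_list <- match(m, levels_per_col)
```

With these toy values the columns end up in the order a, c, e, b, d, so the 3-level group starts at column 1 and the 5-level group at column 4. The resulting x, idx_list and m are in the shape the argument descriptions above ask for.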

Value

@V

Matrix of dimension N*kr such that V[i,g]=1 if i belongs to cluster g.

@zr

Vector of length N with resulting row partitions.

@pi

Vector of length kr. Row mixing proportions.

@m

Vector of length D. The d^th element is the number of levels of the d^th group of variables.

@icl

ICL value for clustering.

@name

Name of the result.

@params

List of length D. The d^th item stores the resulting position and precision parameters mu and pi.

@paramschain

List of length nbSEMburn. For each iteration of the SEM-Gibbs algorithm, the parameters of the blocks are stored.

@xhat

List of length D. The d^th item is the dataset for the d^th group of variables, with missing values completed.

@zrchain

Matrix of dimension nbSEM*N. Row i represents the row cluster partitions at iteration i.

@pichain

List of length nbSEM. Item i is a vector of length kr which contains the row mixing proportions at iteration i.
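
To make the shapes of these slots concrete, the base R sketch below shows how an indicator matrix like @V relates to a partition vector like @zr, and one possible way to summarise a chain like @zrchain after burn-in by taking the most frequent label per observation. All values are toy data, and the per-observation mode is an assumption made for illustration; this page does not state how the package derives @zr from the chain.

```r
# Toy partition of N = 6 rows into kr = 3 clusters (hypothetical values).
kr <- 3
zr <- c(1, 3, 2, 2, 1, 3)

# Indicator matrix: V[i, g] = 1 if row i belongs to cluster g.
V <- diag(kr)[zr, ]

# Recover the partition from the indicator matrix.
zr_back <- max.col(V, ties.method = "first")

# Toy chain: nbSEM = 5 iterations (rows) by N = 6 observations (columns).
nbSEM <- 5
nbSEMburn <- 2
zrchain <- rbind(c(2, 1, 1, 3, 1, 2),
                 c(1, 3, 1, 2, 1, 3),
                 c(1, 3, 2, 2, 1, 3),
                 c(1, 3, 2, 2, 3, 3),
                 c(1, 3, 2, 2, 1, 3))

# Keep the iterations after burn-in and take the most frequent label
# for each observation (illustrative summary, not the package's method).
kept <- zrchain[(nbSEMburn + 1):nbSEM, , drop = FALSE]
zr_mode <- apply(kept, 2, function(z) which.max(tabulate(z, nbins = kr)))
```

A per-observation mode is only meaningful if cluster labels keep the same meaning across iterations; if labels switch during sampling, the chain would need relabelling first.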

Author(s)

Margot Selosse, Julien Jacques, Christophe Biernacki.

Examples



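  library(ordinalClust)
  data("dataqol")
  set.seed(5)

  # Ordinal data: columns 2 to 29 of dataqol
  M <- as.matrix(dataqol[, 2:29])

  # Number of levels of the variables
  m <- 4

  # Number of row clusters
  krow <- 4

  # SEM-Gibbs settings (nbSEMburn must be less than nbSEM)
  nbSEM <- 50
  nbSEMburn <- 40
  nbindmini <- 2
  init <- "random"

  # Run the clustering
  object <- bosclust(x = M, kr = krow, m = m, nbSEM = nbSEM,
      nbSEMburn = nbSEMburn, nbindmini = nbindmini, init = init)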
    
  

[Package ordinalClust version 1.3.4 Index]