Open Access Te Herenga Waka-Victoria University of Wellington

Clustering Categorical Data based on Finite Mixtures

thesis
posted on 2022-05-15, 21:19 authored by Kien Tran

Clustering techniques are often used to reduce the dimension of very large datasets, whose direct analysis using techniques such as regression can be computationally infeasible. Clustering non-independent categorical variables poses distinct difficulties because such data lack a well-defined distance metric. At the same time, existing techniques tend to model variable correlations through the latent group membership, which requires strong assumptions of conditional independence and low-to-moderate variable correlations.

This thesis proposes a joint-model clustering approach for this data type based on finite mixture models, with a heavy focus on clustering based on correlation. For row clustering of data with non-independent columns, we model the dependency structure of the columns using either pairwise joint models with a misspecified likelihood or full joint models. For column clustering of ordinal datasets with non-independent columns, this thesis proposes an anchor model approach in which each column in a group is modelled by only one “anchor” column within that group. The result is a proper likelihood model that allows straightforward parameter estimation without being overly restrictive on the modelling space.

History

Copyright Date

2022-05-13

Date of Award

2022-05-13

Publisher

Te Herenga Waka—Victoria University of Wellington

Rights License

CC BY-NC 4.0

Degree Discipline

Statistics and Operations Research

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level

Doctoral

Degree Name

Doctor of Philosophy

ANZSRC Socio-Economic Outcome code

280118 Expanding knowledge in the mathematical sciences

ANZSRC Type Of Activity code

1 Pure basic research

Victoria University of Wellington Item Type

Awarded Doctoral Thesis

Language

en_NZ

Victoria University of Wellington School

School of Mathematics and Statistics

Advisors

Liu, Ivy; Arnold, Richard