Open Access Te Herenga Waka-Victoria University of Wellington

Exploring a Bioinformatics Clustering Algorithm

Posted on 2021-11-01, 22:18; authored by Matti, Mukhlis

This thesis explores and evaluates MAXCCLUS, a bioinformatics clustering algorithm designed to cluster genes from microarray experimental data. MAXCCLUS clusters genes according to the textual data that describe them. It attempts to create clusters and keeps only those that a significance test finds statistically significant. It then attempts to generalise these clusters using a simple greedy generalisation algorithm. We explore the behaviour of MAXCCLUS by running several clustering experiments that investigate various modifications to MAXCCLUS and its data. The thesis shows that (a) the simple generalisation algorithm of MAXCCLUS gives better results than an exhaustive search algorithm for generalisation, (b) the significance test that MAXCCLUS uses needs to be modified to take into account the functional dependency of some genes on other genes, (c) it is advantageous to delete textual data that are not domain-relevant but disadvantageous to add more textual data to describe the genes, and (d) MAXCCLUS performs poorly when it attempts to cluster genes that have adjacent categories rather than only two distinct categories.


Copyright Date


Date of Award



Te Herenga Waka—Victoria University of Wellington

Rights License

Author Retains Copyright

Degree Discipline

Computer Science

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level


Degree Name

Master of Science

Victoria University of Wellington Item Type

Awarded Research Masters Thesis



Victoria University of Wellington School

School of Mathematics, Statistics and Computer Science


Andreae, Peter