posted on 2021-11-01, 22:18 authored by Matti, Mukhlis
<p>This thesis explores and evaluates MAXCCLUS, a bioinformatics clustering algorithm designed to cluster genes from microarray experimental data. MAXCCLUS clusters genes according to the textual data that describe them. It attempts to create clusters, keeps only those that a significance test finds statistically significant, and then attempts to generalise the retained clusters using a simple greedy generalisation algorithm. We explore the behaviour of MAXCCLUS by running several clustering experiments that investigate various modifications to MAXCCLUS and its data. The thesis shows that (a) the simple generalisation algorithm of MAXCCLUS gives better results than an exhaustive search algorithm for generalisation, (b) the significance test that MAXCCLUS uses needs to be modified to take into account the functional dependency of some genes on other genes, (c) it is advantageous to delete the non-domain-relevant textual data that describe the genes but disadvantageous to add more textual data to describe the genes, and (d) MAXCCLUS behaves poorly when it attempts to cluster genes that have adjacent categories rather than only two distinct categories.</p>
History
Copyright Date
2004-01-01
Date of Award
2004-01-01
Publisher
Te Herenga Waka—Victoria University of Wellington
Rights License
Author Retains Copyright
Degree Discipline
Computer Science
Degree Grantor
Te Herenga Waka—Victoria University of Wellington
Degree Level
Masters
Degree Name
Master of Science
Victoria University of Wellington Item Type
Awarded Research Masters Thesis
Language
en_NZ
Victoria University of Wellington School
School of Mathematics, Statistics and Computer Science