Optimisation of Likelihood for Bernoulli Mixture Models
The use of mixture models in statistical analysis is increasing for datasets with heterogeneity and/or redundancy in the data. They are likelihood based models, and maximum likelihood estimates of parameters are obtained by the use of the expectation maximization (EM) algorithm. Multi-modality of the likelihood surface means that the EM algorithm is highly dependent on starting points and poorly chosen initial points for the optimization may lead to only a local maximum, not the global maximum. In this thesis, different methods of choosing initialising points in the EM algorithm will be evaluated and two procedures which make intelligent choices of possible starting points and fast evaluations of their usefulness will be presented. Furthermore, several approaches to measure the best model to fit from a set of models for a given dataset, will be investigated and some lemmas and theorems are presented to illustrate the information criterion. This work introduces two novel and heuristic methods to choose the best starting points for the EM algorithm that are named Combined method and Hybrid PSO (Particle Swarm Optimisation). Combined method is based on a combination of two clustering methods that leads to finding the best starting points in the EM algorithm in comparison with the different initialisation point methods. Hybrid PSO is a hybrid method of Particle Swarm Optimization (PSO) as a global optimization approach and the EM algorithm as a local search to overcome the EM algorithm’s problem that makes it independent to starting points. Finally it will be compared with different methods of choosing starting points in the EM algorithm.