Posted on 2021-11-23, 10:45. Authored by Butler-Yeoman, Tony
<p>The ability to extract and model the meaning in data has been key to the success of modern machine learning. Typically, data reflects a mixture of multiple sources. For example, a photograph of a person’s face reflects the subject, the lighting conditions, the camera angle, and the background scene. It is therefore natural to wish to extract these multiple, largely independent sources, a task known in the literature as disentangling. Disentangling has a further benefit: the resulting representation is simpler, with fewer free parameters, which mitigates the curse of dimensionality and aids learning. Although disentangled representations have been widely studied, finding them remains an open problem. This thesis considers several approaches to a particularly difficult version of this task: disentangling the complex causes of data in an entirely unsupervised setting. That is, given access only to unlabeled, entangled data, we search for algorithms that can identify the generative factors of that data, which we call causes. Further, we assume that causes can themselves be complex and require a high-dimensional representation. We consider three approaches to this challenge: as an inference problem, as an extension of independent component analysis, and as a learning problem. Each method is motivated, described, and tested on datasets built from entangled combinations of images, most commonly MNIST digits. Where the results fall short of disentangling, the reasons are dissected and analysed. The last method we describe, which is based on combinations of autoencoders that learn to predict each other’s output, shows some promise on this extremely challenging problem.</p>
Copyright Date
2017-01-01
Date of Award
2017-01-01
Publisher
Te Herenga Waka—Victoria University of Wellington
Rights License
Author Retains Copyright
Degree Discipline
Computer Science
Degree Grantor
Te Herenga Waka—Victoria University of Wellington
Degree Level
Masters
Degree Name
Master of Science
ANZSRC Type Of Activity code
970108 Expanding Knowledge in the Information and Computing Sciences