Deep Learning-based Image Analysis for High-content Screening

Zeng, Dylon

doi:10.26686/wgtn.16892482.v1

thesis_access.pdf (3.01 MB)

thesis

posted on 2021-10-27, 23:42 authored by Zeng, Dylon

High-content screening is an empirical strategy in drug discovery to identify substances capable of altering cellular phenotype, the set of observable characteristics of a cell, in a desired way. Over the past two decades, high-content screening has attracted significant attention from academia and the pharmaceutical industry. However, image analysis remains a considerable hindrance to the widespread application of high-content screening. Standard image analysis relies on feature engineering and suffers from inherent drawbacks such as the dependence on annotated inputs. There is an urgent need for more reliable and efficient methods to cope with the increasingly large amounts of data produced.

This thesis centres around the design and implementation of a deep learning-based image analysis pipeline for high-content screening. The end goal is to identify and cluster hit compounds that significantly alter the phenotype of a cell. The proposed pipeline replaces feature engineering with a k-nearest neighbour-based similarity analysis. In addition, feature extraction using convolutional autoencoders is applied to reduce the negative effects of noise on hit selection. As a result, the feature engineering process is circumvented. A novel similarity measure is developed to facilitate similarity analysis. Moreover, we combine deep learning with statistical modelling to achieve optimal results. Preliminary explorations suggest that the choice of hyperparameters has a direct impact on neural network performance. Generalised estimating equation models are used to predict the most suitable neural network architecture for the input data.
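To make the idea of a k-nearest neighbour-based similarity analysis concrete, the following is a minimal sketch only. It is not the novel similarity measure developed in the thesis, which the abstract does not specify. It assumes that each well has already been reduced to a short feature vector (for example, by an autoencoder), and it scores a treated well by the fraction of its k nearest neighbours that are untreated controls. The function names, the Euclidean distance, the value of k, and the toy data are all illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def control_neighbour_fraction(query, controls, treated, k=2):
    """Fraction of the k nearest neighbours of `query` that are controls.

    Hypothetical score for illustration: a value near 1 suggests the
    treated well resembles controls; a value near 0 suggests a distinct
    phenotype, i.e. a candidate hit.
    """
    # Label 0 marks a control profile, label 1 a treated profile.
    labelled = [(euclidean(query, c), 0) for c in controls]
    # Exclude the query itself from its own neighbourhood.
    labelled += [(euclidean(query, t), 1) for t in treated if t is not query]
    labelled.sort(key=lambda pair: pair[0])
    nearest = labelled[:k]
    return sum(1 for _, label in nearest if label == 0) / len(nearest)


# Toy 2-D feature vectors (invented for illustration, not thesis data).
controls = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (0.0, 0.0)]
treated = [(0.02, 0.03), (2.0, 2.1), (2.1, 1.9), (1.9, 2.0)]

scores = [control_neighbour_fraction(t, controls, treated, k=2) for t in treated]
print(scores)
```

In this toy example the first treated profile sits among the controls and scores high, while the other three sit far from the controls and close to each other, so they score low. In a real screen the threshold separating hits from non-hits would have to be chosen and validated on the data.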

Using the proposed pipeline, we analyse an extensive set of images acquired from a series of cell-based assays examining the effect of 282 FDA-approved drugs. The analysis of this data set produces a shortlist of drugs that can significantly alter a cell’s phenotype, then further identifies five clusters of the shortlisted drugs. The clustering results present groups of existing drugs that have the potential to be repurposed for new therapeutic uses. Furthermore, our findings align with published studies. Compared with other neural networks, the image analysis pipeline proposed in this thesis provides reliable and better results in a shorter time frame.

History

Copyright Date

2021-10-28

Date of Award

2021-10-28

Publisher

Te Herenga Waka—Victoria University of Wellington

Rights License

Author Retains Copyright

Degree Discipline

Statistics and Operations Research

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level

Masters

Degree Name

Master of Science

ANZSRC Type Of Activity code

3 APPLIED RESEARCH

Victoria University of Wellington Item Type

Awarded Research Masters Thesis

Language

en_NZ

Victoria University of Wellington School

School of Mathematics and Statistics

Advisors

Nguyen, Binh

Keywords

machine learning; general estimating equations; convolutional neural network; School: School of Mathematics and Statistics; 080199 Artificial Intelligence and Image Processing not elsewhere classified; 899999 Information and Communication Services not elsewhere classified; Degree Discipline: Computer Science; Degree Discipline: Statistics and Operations Research; Degree Level: Masters; Degree Name: Master of Science

Licence

Author Retains Copyright

