Evolutionary Neural Architecture Search for Image Classification
Image classification is a cornerstone of computer vision and is closely connected to many other vision tasks. Deep convolutional neural networks have surpassed traditional image processing techniques in image classification, making them the primary choice for many researchers. Although numerous effective convolutional neural networks have been proposed, their architectures tend to be highly task-specific. This implies that a change in the data distribution for a specific task often necessitates a new architecture design, a process that is not only labor-intensive but also prone to errors. Furthermore, the architectures need to be designed by experts with knowledge of both deep learning and the specific task or domain.
To address these shortcomings, neural architecture search, which automatically searches for promising network architectures for a given task, has made great progress and attracted considerable attention. Evolutionary computation is well suited to neural architecture search because it can address non-differentiable problems, cope with discontinuous search spaces, and perform promising global search. Accordingly, a number of evolutionary neural architecture search (ENAS) methods have been proposed. However, most existing ENAS methods suffer from inaccurate evaluations (i.e., the evaluated fitness does not represent the candidate's true performance) and inefficiency, limiting their usage. Furthermore, the low explainability of deep neural networks remains a pressing concern, inhibiting their broader application. Enhancing the evaluation reliability and efficiency of ENAS, together with improving network explainability, is therefore a crucial research direction.
The overall goal of this thesis is to advance effective and efficient ENAS methodologies by incorporating new search spaces, proposing novel architecture representations and new fitness evaluation methods, and developing innovative evolutionary operations. Furthermore, a method is proposed to explain network decisions in image classification tasks.
Firstly, this thesis proposes a novel particle swarm optimization-based ENAS method for image classification. An autoencoder transforms the variable-length architecture representation into a fixed-length latent form, making it suitable for particle swarm optimization. Additionally, this thesis develops a novel hierarchical fitness evaluation method to assess candidates' performance efficiently. The experimental results on three benchmark datasets confirm that the new method achieves promising results, improving both classification accuracy and search efficiency.
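To illustrate why a fixed-length representation matters, the following is a minimal sketch of the standard particle swarm update applied to fixed-length vectors. It is not the method proposed in the thesis: the function name `pso_minimize`, the coefficients `w`, `c1`, and `c2`, and the toy fitness used in the usage note are assumptions made for illustration, and the autoencoder and hierarchical fitness evaluation are not reproduced. In an ENAS setting, `fitness` would stand in for decoding a latent vector into an architecture and measuring its classification error.

```python
import random

def pso_minimize(fitness, dim, n_particles=10, n_iters=30,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over fixed-length vectors with a basic PSO.

    Assumption: w (inertia) and c1, c2 (acceleration coefficients) are
    common textbook defaults, not values taken from the thesis.
    """
    rng = random.Random(seed)
    # Every particle has the same length `dim`, which the update rule requires.
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

A call such as `pso_minimize(lambda z: sum(x * x for x in z), dim=4)` runs the search on a toy quadratic. The element-wise subtraction of positions in the velocity update is only defined when all particles share one length, which is the property the latent encoding provides.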
Secondly, this thesis proposes a new ENAS method that incorporates a performance predictor to facilitate the convergence of the search and a new weight inheritance mechanism to accelerate fitness evaluations. Specifically, rather than directly predicting a candidate's fitness, the new performance predictor aids in the generation of high-performing offspring, so that even an incorrect prediction will not harm the search. The weight inheritance mechanism, in turn, improves the efficiency of fitness evaluations. The empirical results show that the proposed method achieves promising classification performance while reducing the computational cost.
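The general idea of weight inheritance can be sketched as follows. This is a generic illustration under stated assumptions, not the mechanism introduced in the thesis: the helper names `inherit_weights` and `zeros`, the dictionary layout, and the rule of matching layers by name and shape are all hypothetical. An offspring copies the parent's weights for every layer it shares with the parent and initializes only the remaining layers, so less training is needed before its fitness can be evaluated.

```python
import math

def zeros(shape):
    """Hypothetical initializer: a flat list of zeros with prod(shape) entries."""
    return [0.0] * math.prod(shape)

def inherit_weights(parent_weights, child_shapes, init_fn=zeros):
    """Build child weights, reusing parent weights where name and shape match.

    parent_weights: {layer_name: (shape, flat_values)}
    child_shapes:   {layer_name: shape}
    Returns (child_weights, names_of_inherited_layers).
    """
    child_weights, inherited = {}, []
    for name, shape in child_shapes.items():
        parent_entry = parent_weights.get(name)
        if parent_entry is not None and parent_entry[0] == shape:
            # Same layer in both networks: copy the trained values.
            child_weights[name] = (shape, list(parent_entry[1]))
            inherited.append(name)
        else:
            # New or resized layer: start from a fresh initialization.
            child_weights[name] = (shape, init_fn(shape))
    return child_weights, inherited
```

For a parent with layers `conv1` and `conv2` of shape `(3, 3)` and a child that keeps `conv1` but enlarges `conv2` to `(5, 5)`, only `conv1` is inherited and `conv2` is re-initialized.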
Thirdly, this thesis proposes an improved one-shot ENAS methodology characterized by its reliable fitness evaluations and high efficiency. In particular, an innovative supernet fine-tuning strategy is proposed to improve the fitness evaluation reliability, a new training strategy is adopted to improve the efficiency of supernet training, and a population initialization strategy is presented to enhance the evolutionary progress. Extensive experimental results confirm the method's superior performance and validate each introduced strategy's efficacy.
Fourthly, this thesis proposes a novel evolutionary, post-hoc, model-agnostic explanation method for the decisions of image classifiers. The proposed method leverages a stable diffusion model to help generate counterfactual images that aid in explaining those decisions. Additionally, a multi-objective evolutionary method is proposed to identify minimal but pivotal regions whose associated features are interpretable by humans. The experimental results show the method's efficacy across a diverse range of similar classes and across different classification architectures.
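The trade-off between "minimal" and "pivotal" can be made concrete with a small Pareto-dominance sketch. The two objectives used here, the fraction of the image covered by a region and the classifier's remaining confidence in the original class after that region is altered, are assumptions chosen for illustration, as are the function names and the numbers; the thesis's actual objectives and evolutionary operators are not reproduced. Both objectives are minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates: (region_size_fraction, remaining_confidence).
candidates = [(0.10, 0.40), (0.25, 0.15), (0.30, 0.35),
              (0.60, 0.05), (0.60, 0.20)]
front = non_dominated(candidates)
# front == [(0.10, 0.40), (0.25, 0.15), (0.60, 0.05)]
```

Here `(0.30, 0.35)` and `(0.60, 0.20)` are discarded because `(0.25, 0.15)` is both smaller and more decisive. The surviving candidates represent different compromises between region size and effect on the decision, which is what a multi-objective search returns instead of a single solution.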