Evolutionary Computation for the Optimisation of Skip-Connection Structures on Dense Convolutional Neural Networks
Skip-connections are a core architectural feature that has allowed deep neural networks to attain state-of-the-art results on a number of tasks. Despite this, what makes particular skip-connection structures within deep networks successful is not well understood. The overall goal of this thesis is to explore methods for the automatic discovery of novel high-performance skip-connection structures, aiming to improve the performance of DenseNet and DenseNet-BC style convolutional neural networks on the task of image classification.
This thesis performs the first in-depth investigation into different lower-fidelity performance estimation techniques on DenseNet style networks with varied skip-connection structures. The results provide guidance to machine learning researchers and practitioners on which performance estimation configurations generalise best to final network performance, demonstrating that training on a relatively small amount of training data and using cross-entropy on holdout data as the evaluation criterion correlates well with test performance. The results also establish how elitism can be tuned to overcome the stochasticity associated with lower-fidelity performance estimation.
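The role elitism plays under a noisy fitness signal can be illustrated with a minimal sketch. All names, parameters, and the toy objective below are illustrative, not taken from the thesis; the Gaussian noise term stands in for the stochasticity of training each candidate on a small data subset.

```python
import random

random.seed(0)

def noisy_fitness(x):
    # True objective is -(x - 3)^2; the noise models lower-fidelity
    # evaluation, where repeated estimates of one candidate differ.
    return -(x - 3.0) ** 2 + random.gauss(0.0, 0.5)

def evolve(pop, generations=60, elite_count=2):
    for _ in range(generations):
        scored = sorted(pop, key=noisy_fitness, reverse=True)
        elites = scored[:elite_count]            # elites survive unchanged
        children = [x + random.gauss(0.0, 0.5)   # mutated offspring of top-5
                    for x in random.choices(scored[:5], k=len(pop) - elite_count)]
        pop = elites + children
    # Report the best member under the noiseless objective.
    return max(pop, key=lambda x: -(x - 3.0) ** 2)

best = evolve([random.uniform(-10.0, 10.0) for _ in range(20)])
print(best)
```

Carrying a few elites forward unmutated means a genuinely good candidate is not discarded because of a single pessimistic noisy estimate.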
This thesis develops a novel Evolutionary Neural Architecture Search algorithm for the discovery of skip-connection structures on DenseNet style networks. This is the first algorithm shown to improve DenseNet style network performance on competitive image classification datasets through the refinement of skip-connection structures alone, leading to networks which are approximately 8% smaller in terms of the number of trainable parameters while providing a 0.32% better classification error rate than the baseline DenseNet on the CIFAR100 dataset. Further, the discovered networks are analysed, and insights relating to prevalent skip-connection structures are presented, including the discovery that skip-connections running directly from the input of the last dense block to the output layer appear detrimental to network performance on the CIFAR10 dataset.
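One common way to make skip-connection structures searchable is to encode the connectivity inside a dense block as a set of binary genes, one per (source, target) layer pair. The sketch below assumes this encoding; the function names are hypothetical and a standard DenseNet block corresponds to every gene set to 1 (all-to-all connectivity), which the search then mutates.

```python
def dense_block_genome(num_layers):
    # One binary gene per (source, target) pair with source < target;
    # 1 means layer `target` receives the feature maps of layer `source`.
    return {(i, j): 1 for j in range(num_layers) for i in range(j)}

def inputs_to_layer(genome, j):
    # The concatenation inputs of layer j under this genome.
    return [i for i in range(j) if genome[(i, j)] == 1]

genome = dense_block_genome(4)
genome[(0, 3)] = 0  # example mutation: drop the skip from layer 0 to layer 3
print(inputs_to_layer(genome, 3))  # → [1, 2]
```

Flipping individual bits gives the evolutionary operators a direct, fixed-length representation of which skip-connections exist.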
This thesis develops the first Evolutionary Neural Architecture Search algorithm for the discovery of skip-connection structures on DenseNet-BC style networks. The results show that the proposed algorithm is effective in discovering skip-connection structures which improve the classification error rate of DenseNet-BC style networks on CIFAR100 by an average of 0.72%, with relatively minor increases in network size. Further, the discovered networks are analysed, and insights relating to prevalent skip-connection structures are presented. These insights include the discovery that a higher number of skip-connections in earlier dense blocks appears beneficial to DenseNet-BC style networks, and that the final dense block appears more sensitive to variation in the number of skip-connections utilised by the network, utilising a smaller number of skip-connections on average while also exhibiting reduced variance in the number of skip-connections used.
This thesis proposes a novel algorithm for the discovery of skip-connection structures on DenseNet-BC style networks utilising a novel form of weight inheritance. The proposed weight inheritance operator is the first to combine the learnt weights from parents under crossover with the learnt weights from the wider population under mutation. This operator is shown to reduce the computational cost of Evolutionary Neural Architecture Search for skip-connection structures on DenseNet-BC style networks by a factor of 10, and analysis shows that it avoids key issues typically associated with weight inheritance. Specifically, it is demonstrated that, under the proposed method, weights introduced to a network eventually conform, on average, to the feature maps already present within it, showing that introducing weights in this fashion does not strongly disrupt the network's learnt feature maps.
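The two inheritance paths described above can be sketched schematically. This is a minimal illustration with assumed names and with flat lists standing in for layer weight tensors, not the thesis's implementation: under crossover a child layer takes its weights from whichever parent contributes that layer, while a layer introduced by mutation initialises its weights from a donor in the wider population rather than from scratch.

```python
import random

random.seed(1)

def crossover_weights(parent_a, parent_b, mask):
    # mask[k] == "a" -> child inherits layer k's weights from parent_a,
    # otherwise from parent_b.
    return {k: (parent_a[k] if mask[k] == "a" else parent_b[k]) for k in mask}

def mutation_weights(population, new_layer):
    # A layer added by mutation inherits weights from a random population
    # member that already trains a compatible layer; fall back to zeros.
    donors = [net for net in population if new_layer in net]
    return list(random.choice(donors)[new_layer]) if donors else [0.0, 0.0]

a = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4]}
b = {"conv1": [0.9, 0.8], "conv2": [0.7, 0.6]}
child = crossover_weights(a, b, {"conv1": "a", "conv2": "b"})
child["conv3"] = mutation_weights([a, b, {"conv3": [0.5, 0.5]}], "conv3")
print(child)
```

Inheriting trained weights in both cases is what lets each offspring resume training rather than restart it, which is where the reported 10x cost reduction comes from.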
Extending the above, this thesis develops the first multi-objective algorithms for Neural Architecture Search for skip-connection structures on DenseNet-BC style networks, seeking to discover Pareto fronts of high-performance network architectures across varying network complexities. The results show that the discovered Pareto fronts are competitive in terms of both performance and spread. The networks forming the Pareto fronts are also assessed, and insights are presented into how skip-connection structures and compression rates relate to network size in high-performance networks. These insights include the discoveries that smaller high-performance networks tend to avoid skip-connections directly from the inputs of dense blocks, and that relatively low compression earlier in DenseNet-BC style networks appears to offer better performance trade-offs against network size for most networks.
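The core of any such multi-objective search is extracting the non-dominated set over the competing objectives. The sketch below shows this for two minimised objectives, classification error and parameter count; the candidate values are made up for illustration.

```python
def dominates(a, b):
    # a dominates b if it is no worse on every objective (here both are
    # minimised) and strictly better on at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    # Keep every point that no other point dominates.
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# (error rate %, trainable parameter count) for four hypothetical networks
candidates = [(5.2, 1.0e6), (4.8, 1.5e6), (5.0, 0.8e6), (6.0, 2.0e6)]
print(sorted(pareto_front(candidates)))  # → [(4.8, 1500000.0), (5.0, 800000.0)]
```

The surviving points are exactly the accuracy-versus-size trade-offs a practitioner would choose between, which is what "spread" of a Pareto front measures.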