Evolutionary Transfer Learning for Multi-objective Feature Selection in Supervised Learning
Evolutionary multi-objective optimisation (EMO) has been successfully used for feature selection, which usually has two conflicting objectives: minimising the number of selected features and maximising the supervised learning performance. Furthermore, EMO algorithms leverage their sophisticated search abilities to explore the huge solution space effectively and efficiently, which is essential given the NP-hard nature of the feature selection problem. One limitation of most existing EMO-based feature selection methods is that they tend to address feature selection tasks independently. However, many real-world feature selection tasks are related and share common knowledge. Evolutionary transfer learning, which aims to capture and transfer knowledge across related learning tasks, has been used in many areas. However, it is challenging to use existing evolutionary transfer learning methods to address feature selection tasks. Firstly, many existing methods rely on analysing the relationships between tasks to achieve positive transfer. This is challenging in feature selection since feature selection tasks can be very high-dimensional and complex. Furthermore, many existing methods rely on sharing solutions to transfer knowledge across tasks. However, in the solution evaluation of evolutionary feature selection, informative features can also be shared across the classifiers built for the related tasks, which has not been systematically investigated.
The overall goal of this thesis is to investigate and improve the capability of evolutionary transfer learning techniques, mainly evolutionary sequential transfer learning-based methods and evolutionary multitasking-based methods, to search for the best feature subsets for each task by transferring knowledge across tasks.
Different aspects of evolutionary multi-objective feature selection are considered in this thesis, such as the encoding schemes, the evaluation operators, and the searching operators. New encoding schemes are proposed for transforming related feature selection tasks into the same search space, where knowledge can be shared across tasks. New evaluation operators are proposed to evaluate the solutions of related tasks simultaneously and share knowledge across them. In this process, informative features are shared across the classifiers built for the related tasks, which can improve learning performance. Finally, new searching operators are developed to select and share useful solutions across related tasks, which can improve feature selection performance.
This thesis introduces a new evolutionary sequential transfer learning method for evolutionary multi-objective feature selection, which aims to address a target task by using the probabilistic models describing the high-order information of the solutions of the source tasks. In this study, the high-order information represents the probability of whether a feature is selected or not for a task.
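The idea of describing source solutions by per-feature selection probabilities can be illustrated with a minimal sketch. It is not the method proposed in the thesis: it assumes binary-encoded solutions, an independent (univariate) probability per feature, and Laplace smoothing, and the names `selection_probabilities` and `sample_population` are hypothetical.

```python
import random


def selection_probabilities(source_solutions, smoothing=1.0):
    """Estimate, for each feature, the probability that it is selected.

    source_solutions: list of equal-length 0/1 lists from a solved source task.
    smoothing: Laplace smoothing constant (an assumption of this sketch) that
    keeps every probability strictly between 0 and 1.
    """
    n_solutions = len(source_solutions)
    n_features = len(source_solutions[0])
    counts = [sum(sol[j] for sol in source_solutions) for j in range(n_features)]
    return [(c + smoothing) / (n_solutions + 2.0 * smoothing) for c in counts]


def sample_population(probs, pop_size, rng):
    """Sample binary candidate solutions for a target task from the model."""
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]


# Three illustrative source solutions over four features (made-up data).
source = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
probs = selection_probabilities(source)
rng = random.Random(0)
population = sample_population(probs, pop_size=5, rng=rng)
print(probs)
```

In this sketch the sampled population would seed the search on the target task, so features that were frequently selected in the source task are more likely to appear in the initial candidates.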
Since many feature selection tasks are related and share common knowledge, capturing and transferring knowledge from the solved tasks to new tasks can improve the performance of the new tasks.
The experimental results show that the proposed method successfully improves feature selection performance on the target task by effectively and efficiently using the knowledge from the source tasks.
This thesis develops a novel evolutionary multitasking method for evolutionary multi-objective feature selection to address multiple related feature selection tasks simultaneously and share knowledge across them. In multitasking, the connections between tasks can be complex, which makes it challenging to conduct knowledge transfer between every pair of tasks. For example, if the number of tasks is $K$, the number of pairs of tasks is $\binom{K}{2} = K(K-1)/2$. Furthermore, the relationships between tasks can be complex, making them challenging to analyse. To address these challenges, a searching method that selects and shares useful solutions across tasks is developed.
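The growth in the number of task pairs can be checked with a short worked example. The value of `K` and the task names below are illustrative only.

```python
from itertools import combinations
from math import comb

K = 5  # illustrative number of tasks
tasks = [f"task_{i}" for i in range(K)]

# Enumerate every unordered pair of distinct tasks.
pairs = list(combinations(tasks, 2))

# The enumeration agrees with the closed form K * (K - 1) / 2.
n_pairs = len(pairs)
print(n_pairs)  # prints 10
```

The pair count grows quadratically with the number of tasks, which is why transferring knowledge separately between every pair becomes expensive as more tasks are added.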
To achieve positive transfer across tasks, the source solutions that can improve the performance of their target task are shared across tasks.
The experimental results demonstrate the effectiveness of the proposed method in improving the performance of feature selection on multiple tasks.
This thesis proposes a novel evolutionary multitasking method based on feature-sharing for evolutionary multi-objective feature selection. The relationships between classifiers and features can be very complex, which makes it challenging to select and share commonly informative features across the related classifiers of related feature selection tasks. To address this, a dirty-model joint feature selection method is developed to select and share useful features across the classifiers of related tasks, which can improve supervised learning performance.
To improve the effectiveness and efficiency of supervised learning, the commonly informative features are selected and shared across tasks in the proposed method. The experimental results show that the proposed method achieves significantly better supervised learning performance than currently popular evolutionary feature selection methods.
This thesis introduces an evolutionary multitasking multi-objective feature selection framework for marine chemical analysis, which selects and shares commonly informative features across the related tasks of predicting the chemical properties of fish.
Experimental results show that, by sharing knowledge between related tasks, the proposed framework achieves better performance on marine chemical analysis than currently popular machine learning methods.
In conclusion, this thesis investigates evolutionary sequential transfer learning and multitasking for feature selection. Novel methods of knowledge transfer are proposed to address the challenges of evolutionary transfer learning for feature selection. Experimental results verify the effectiveness of the proposed evolutionary transfer learning methods in addressing related feature selection tasks.