Machine Learning For Tree Image Segmentation
Semantic segmentation is an important task in computer vision and image processing, especially in many real-world applications, such as medical image analysis, video surveillance, and remote sensing image processing. Remote sensing image processing is widely applied in geological surveys, forest resource management, and urban planning to improve efficiency and accuracy. This work is to perform tree semantic segmentation on remote sensing images in New Zealand, which aims to segment the trees from a remote sensing image to make it easier for future processing and analysis. The proposed method and findings from this work can be used to segment the trees in Wellington to facilitate the calculation of the city's green area and monitor changes in tree distribution over time, also, the proposed method aims for precise semantic segmentation of trees, which can be used as a reliable verification tool to ensure all trees are accurately captured by alternative instance segmentation methods. For instance, there is a method to identify individual trees within a region and classify their respective species and the proposed method of this work can be used to verify whether all trees are captured, or the proposed method can be integrated into a tree identification system to segment the tree region before identification.
Recently, Convolutional Neural Networks (CNNs) become a very popular method for image segmentation tasks, and there are many different methods have been proposed and achieved great performance. However, while solving a new real-world application task, we usually need to adjust the existing method or design a new method to achieve good performance. Hence, this thesis evaluated the common image segmentation method for the tree image segmentation task in the Wellington region, and analysed each method with qualitative and quantitative results to guide the next step of research. Furthermore, we proposed an ensemble learning method to select and create an ensemble model from promising CNN models. Finally, we attempted to manually design a CNN for the task. The overall outcome is that LinkNet achieved the best performance during evaluation with an average dice coefficient of 84.24 %, and the ensemble model achieved a dice coefficient of 86.132 %.