Machine Learning for Tarakihi Fish Length Estimation in Aotearoa
Current practices for monitoring the catch of deep-sea fishing vessels are labour intensive, requiring a person on the vessel to measure individual fish lengths manually. Capturing video of fish on-vessel instead allows machine learning algorithms to treat this as a computer vision problem and automate the collection of morphological data for the observed fish.
In this thesis we investigate the essential methods required, and develop a system that uses machine learning algorithms and computer vision techniques to calculate centimetre-accurate lengths of singulated fish from video footage. The first stage in this process is data acquisition, where we explore the use of both a fixed camera and a free (hand-held) camera for gathering the video data from which we extract lengths in millimetres. Lengths were measured on-site for comparison with the lengths obtained from images taken at different orientations and translations. An analysis of the different camera positions and rotations found that a camera positioned above the object of interest and the calibration pattern achieved the most accurate lengths. Increasing rotation, measured in radians, was found to have an increasingly detrimental effect on predicted lengths.
Secondly, we use binary masks, created both manually and by an automated approach based on edge detection, to train a segmentation model to identify fish in images. We leverage shape features and an interpretable ML classifier to analyse the features of contours from both inferred masks and those derived from edge detection. In our analysis of these shape features, we identify a range of values for the feature “circle deviation” that may be used to identify potential fish contours that did not pass the classification, and to flag such contours for further training. Contours derived from the edge detection approach that do pass the classification are used to create a dataset of cropped images, on which we train a GAN to generate a synthetic image set.
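The idea of screening contours by a shape feature range can be sketched as follows. This abstract does not define "circle deviation", so the sketch assumes one plausible definition: the standard deviation of the centroid-to-point distances divided by their mean, which is zero for a perfect circle and grows as the shape elongates. The function names, the definition, and the range bounds are all hypothetical and are not taken from the thesis.

```python
import math

def circle_deviation(contour):
    """ASSUMED definition: relative spread of centroid-to-point distances.

    `contour` is a list of (x, y) points. Returns 0 for a perfect circle.
    """
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    radii = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0:
        raise ValueError("contour is degenerate (all points coincide)")
    variance = sum((r - mean_r) ** 2 for r in radii) / len(radii)
    return math.sqrt(variance) / mean_r

def in_accepted_range(value, low, high):
    """Keep a contour only if its feature value lies in [low, high]."""
    return low <= value <= high

# Synthetic contours for illustration: a unit circle and an elongated ellipse.
angles = [2 * math.pi * k / 64 for k in range(64)]
circle = [(math.cos(a), math.sin(a)) for a in angles]
ellipse = [(4 * math.cos(a), math.sin(a)) for a in angles]

# Placeholder bounds; the actual range comes from the analysis in the thesis.
LOW, HIGH = 0.2, 0.6
keep_circle = in_accepted_range(circle_deviation(circle), LOW, HIGH)
keep_ellipse = in_accepted_range(circle_deviation(ellipse), LOW, HIGH)
```

Under this assumed definition, the near-circular contour falls outside the placeholder range while the elongated, more fish-like outline falls inside it.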
Thirdly, we develop a method for extracting the lengths of fish from images by using a checkerboard pattern as a point of reference to relate pixel lengths to millimetres. Only inferred contours with shape features within the range identified by our analysis are used in calculating lengths. This approach reduced the number of partially visible fish and false positive inferences affecting our recorded lengths.
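A minimal sketch of relating pixel lengths to millimetres with a checkerboard reference is given here. It assumes the camera looks straight down, the fish and the checkerboard lie in the same plane, and lens distortion is negligible. The function names, corner coordinates, and the 25 mm square size are illustrative assumptions, not values from the thesis.

```python
import math

def mm_per_pixel(corner_a, corner_b, square_size_mm):
    """Scale factor from two adjacent checkerboard corners in pixel coordinates."""
    pixel_dist = math.dist(corner_a, corner_b)
    if pixel_dist == 0:
        raise ValueError("corners must be distinct")
    return square_size_mm / pixel_dist

def length_mm(start_px, end_px, scale):
    """Convert a pixel-space segment (e.g. snout to tail) to millimetres."""
    return math.dist(start_px, end_px) * scale

# Two adjacent corners 50 px apart on a board with (assumed) 25 mm squares.
scale = mm_per_pixel((100.0, 100.0), (150.0, 100.0), square_size_mm=25.0)

# A fish spanning 600 px along the same plane.
fish_length = length_mm((200.0, 300.0), (800.0, 300.0), scale)
```

In practice the corner coordinates would come from a checkerboard detector, and averaging the spacing over many corner pairs would make the scale less sensitive to detection noise.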
Finally, the performance of models trained on real images, synthetic images, and a combination of the two is compared. A model trained on both real and synthetic images achieved a mean absolute difference between true and predicted lengths of below one centimetre over 128 samples. Our results suggest that using synthetic data to assist in creating a robust training dataset is viable. However, synthetic data works best when real data is also available in the training set.
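The evaluation metric described above, the average of the absolute differences between true and predicted lengths, can be sketched as below. The function name and the sample values are made up for illustration and are not results from the thesis.

```python
def mean_absolute_error(true_mm, predicted_mm):
    """Average absolute difference between paired true and predicted lengths."""
    if not true_mm or len(true_mm) != len(predicted_mm):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_mm, predicted_mm))
    return total / len(true_mm)

# Illustrative lengths in millimetres (not measured data).
true_lengths = [300.0, 280.0, 315.0, 295.0]
predicted_lengths = [304.0, 275.0, 318.0, 289.0]

mae_mm = mean_absolute_error(true_lengths, predicted_lengths)
below_one_cm = mae_mm < 10.0  # one centimetre expressed in millimetres
```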