Bioacoustics through the Time-Frequency plane
The natural world is full of sounds produced by plants, animals and even landscape elements. Animals can be incredibly creative in their sound production, and the study of their vocalisations is an essential part of behavioural and conservation ecology. Counting vocalisations is often used as a survey method to monitor population abundance, especially for cryptic or nocturnal species. Call counts were once performed only by human observers, but over the last decades Automatic Recording Units (ARUs) have increasingly taken over this task. Acoustic surveys now collect terabytes of data, which require long hours of tedious work to analyse.
Consequently, recent years have seen growing interest in using signal processing and artificial intelligence methods to speed up and facilitate this process.
Studying sounds, including animal sounds, entails studying how their frequency and intensity change over time. This information is not easily read from the type of data collected by ARUs: the waveform. A waveform describes only how amplitude changes with time, so other analysis tools are needed to reveal the instantaneous changes in frequency and intensity.
In 1946, Koenig, Dunn and Lacy introduced a 3D representation of sounds in which time, frequency and intensity can be read simultaneously: the spectrogram. The spectrogram then became one of the main tools used in the broad field of signal processing and, more specifically, in Bioacoustics, the study of animal and natural sounds. Because the spectrogram combines the representation of a signal in both the time and frequency domains, it is called a Time-Frequency Representation (TFR) of sound.
Over the seven decades since, however, further TFRs have been introduced to improve on the spectrogram, whose performance is hindered by its limited time-frequency resolution. Every TFR is defined by a set of parameters that influence its ability to depict important sound features effectively. Choosing the best TFR and the best parameters is therefore a challenging task, and one that depends heavily on the characteristics of the sound being studied. In this thesis, we explore the differences between the main TFRs in the literature and propose methods to test their performance on real-world bioacoustics problems involving echolocation data from Aotearoa/New Zealand bats and call data from North Island Brown kiwi. We also demonstrate that the choice of a TFR and its parameters is crucial for obtaining optimal results from data analysis, and we discuss the importance of TFRs in Bioacoustics.
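As a rough illustration of the resolution limit mentioned above, the following minimal sketch computes a naive Hann-windowed spectrogram with two window lengths. It is not the method used in this thesis: the function names `hann` and `spectrogram`, the 1000 Hz sample rate, the 100 Hz test tone and the window lengths are all assumptions chosen for illustration, and the transform is a plain short-time discrete Fourier transform written with the Python standard library only.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (hypothetical helper)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, fs, win_len, hop):
    """Naive spectrogram: squared magnitude of a Hann-windowed short-time DFT.

    Returns (times, freqs, power), where power[frame][bin] is the squared
    magnitude at that frame centre time and frequency bin.
    """
    window = hann(win_len)
    n_bins = win_len // 2 + 1  # non-negative frequencies only
    freqs = [k * fs / win_len for k in range(n_bins)]
    times, power = [], []
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(win_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                for i in range(win_len)
            )
            row.append(abs(acc) ** 2)
        power.append(row)
        times.append((start + win_len / 2) / fs)  # centre of the frame, in seconds
    return times, freqs, power


# Assumed test signal: one second of a 100 Hz tone sampled at 1000 Hz.
fs = 1000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]

for win_len in (32, 256):
    times, freqs, power = spectrogram(tone, fs, win_len, hop=win_len // 2)
    peak_bin = max(range(len(freqs)), key=lambda k: power[0][k])
    print(
        f"window={win_len} samples: "
        f"bin spacing={freqs[1]:.2f} Hz, "
        f"window duration={win_len / fs:.3f} s, "
        f"frames={len(times)}, "
        f"peak at {freqs[peak_bin]:.2f} Hz"
    )
```

With the short window the frequency bins are coarse but there are many frames per second, while the long window gives finer frequency bins at the cost of fewer, longer frames. This is the trade-off that makes the choice of window length depend on the sound under study.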