Marine conservation has benefited from passive acoustic monitoring (PAM) systems, though processing the recorded signals remains time-consuming and resource-intensive. This paper evaluates the potential of You Only Look Once v8 (YOLOv8) to automatically detect dolphin whistles. Although YOLOv8 was designed primarily for object detection in natural images, it effectively identifies dolphin whistles in spectrograms. The paper details the conversion of acoustic signals into spectrograms for YOLOv8 analysis and outlines an experimental methodology in three phases. First, it compares model performance with and without additional labels on a three-hour dataset. Second, cross-dataset transferability is assessed by testing the model on historical and critical datasets. Finally, a general model is trained on all datasets to evaluate overall performance. Introducing additional labels improves the model's average precision by 12% and recall by 19%, enabling it to better distinguish whistles from non-whistles. However, temporal transferability presents challenges: recall drops by 3–36% while precision improves by 3–17%. In the final phase, the model achieves a precision of 84% and a recall of 70%, demonstrating YOLOv8's potential for automated detection and labelling of dolphin whistles in spectrograms, even with small datasets.
Passive acoustic monitoring, dolphin whistles, You Only Look Once v8, deep learning, bioacoustics, automatic detection
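As a rough illustration of the pipeline the abstract describes, the sketch below converts a recording into a dB-scaled spectrogram image and runs a YOLOv8 detector over it. This is a minimal sketch under stated assumptions, not the paper's implementation: the STFT parameters, colormap, and the fine-tuned weights file `whistle_yolov8.pt` are illustrative placeholders, and the paper's own preprocessing settings are not reproduced here.

```python
# Minimal audio-to-spectrogram-to-YOLOv8 sketch. Assumes librosa, numpy,
# matplotlib, and ultralytics are installed. All parameter values and the
# weights filename are illustrative assumptions, not the paper's settings.
import librosa
import numpy as np
import matplotlib.pyplot as plt
from ultralytics import YOLO

def audio_to_spectrogram(wav_path: str, png_path: str,
                         n_fft: int = 2048, hop_length: int = 512) -> None:
    """Convert a recording into a dB-scaled spectrogram image."""
    y, sr = librosa.load(wav_path, sr=None)   # keep the native sample rate
    stft = librosa.stft(y, n_fft=n_fft, hop_length=hop_length)
    spec_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)
    # Save as a bare image (no axes or margins) so the detector sees
    # only the spectrogram pixels.
    plt.imsave(png_path, spec_db, origin="lower", cmap="viridis")

# Detect whistle contours with a YOLOv8 model fine-tuned on spectrograms.
audio_to_spectrogram("recording.wav", "recording.png")
model = YOLO("whistle_yolov8.pt")             # hypothetical fine-tuned weights
results = model("recording.png")
for box in results[0].boxes:
    print(box.xyxy, box.conf)                 # bounding box and confidence
```

In this framing, each predicted bounding box localises one whistle contour in time (horizontal axis) and frequency (vertical axis), which is what allows an object detector built for natural images to be repurposed for bioacoustic annotation.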