Machine learning tools are widely used to support bioacoustics studies, and numerous publications address the applicability of convolutional neural networks (CNNs) to the automated presence-absence detection of species. However, the relationship between acoustic background modelling and recognition performance needs to be better understood. In this study, we investigated the influence of acoustic background modelling on the performance of an acoustic detector for the White-lored Spinetail (Synallaxis albilora). Two detector designs were evaluated: a 152-layer ResNet with transfer learning and a purpose-built CNN. We experimented with acoustic background representations trained on season-specific (dry and wet season) and all-season data, as well as with a design without explicit background modelling, to evaluate their influence on detection performance. The detector permits monitoring of the diel behaviour and breeding time of the White-lored Spinetail based solely on changes in vocal activity patterns. We report improved performance when background modelling is used, particularly when it is trained on all-season data. The highest classification accuracy (84.5%) was obtained with the purpose-built CNN model. Our findings contribute to an improved understanding of the importance of acoustic background modelling, which is essential for increasing the performance of CNN-based species detectors.
Acoustic activity detection, bird sound recognition, computational bioacoustics, convolutional neural networks, Pantanal, transfer learning
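To make the transfer-learning setup mentioned in the abstract concrete, the following PyTorch sketch shows one common way to adapt an ImageNet-pretrained 152-layer ResNet to a two-class presence/absence task. It is an illustrative sketch only, not the authors' implementation: the torchvision calls, the frozen-backbone choice, the three-channel 224 x 224 spectrogram input, and the two-class head are all assumptions introduced here.

```python
# Illustrative sketch (not the authors' code): adapting a pretrained
# 152-layer ResNet to a two-class presence/absence detector, assuming
# spectrogram inputs replicated to three channels at 224 x 224.
import torch
import torch.nn as nn
from torchvision import models

def build_detector(freeze_backbone: bool = True) -> nn.Module:
    # Load ImageNet-pretrained weights (transfer learning).
    model = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)
    if freeze_backbone:
        # Keep the convolutional features fixed; train only the new head.
        for param in model.parameters():
            param.requires_grad = False
    # Replace the 1000-class ImageNet head with a presence/absence head.
    model.fc = nn.Linear(model.fc.in_features, 2)
    return model

if __name__ == "__main__":
    detector = build_detector()
    # Dummy batch: 4 spectrogram "images" of size 3 x 224 x 224.
    dummy = torch.randn(4, 3, 224, 224)
    logits = detector(dummy)
    print(logits.shape)  # torch.Size([4, 2])
```

In this kind of setup, only the replaced final layer is trained on the target recordings, while the pretrained convolutional backbone supplies generic image features extracted from the spectrograms.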