Acoustic individual discrimination has been demonstrated for a wide range of animal taxa. However, far less effort has gone into demonstrating the effectiveness of automatic individual identification, which could greatly facilitate research, especially when data are collected via an acoustic localization system (ALS). In this study, we examine the accuracy of acoustic caller recognition in long calls (LCs) emitted by male Bornean orangutans (Pongo pygmaeus wurmbii), using two data sets: the first consists of high-quality recordings taken during individual focal follows (N = 224 LCs by 14 males), and the second consists of LC recordings with variable microphone-caller distances obtained from an ALS (N = 123 LCs by 10 males). Because the LC is a long-distance vocalization, we expected that even the low-quality test set would yield caller recognition results significantly better than chance. Automatic individual identification was accomplished using software originally developed for human speaker recognition (the MSR Identity Toolbox). We obtained a 93.3% correct identification rate with the high-quality recordings, and 72.23% with the ALS recordings made at variable microphone-caller distances (20–420 m). These results show that automatic individual identification is possible, although accuracy declines relative to the high-quality recordings because signal degradation (e.g. sound attenuation, environmental noise contamination, and echo interference) becomes more severe with increasing distance. We therefore suggest that acoustic individual identification with speaker recognition software can be a valuable tool for data obtained through an ALS, thereby facilitating field research on vocal communication.
Caller recognition, mel-frequency cepstral coefficients, Gaussian mixture model, acoustic localization system (ALS), long call, orangutan
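As an illustration of what "significantly better than chance" can mean for a caller recognition rate, the following minimal sketch computes a one-sided exact binomial tail probability. It is not the statistical procedure used in the study: the function name `binomial_tail`, the chance level of 1 / (number of callers), which assumes every caller is equally likely a priori, and the counts in the example are all hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_tail(n_correct: int, n_total: int, p_chance: float) -> float:
    """Return P(X >= n_correct) for X ~ Binomial(n_total, p_chance).

    This is the one-sided probability of obtaining at least n_correct
    correct identifications if the classifier were guessing at random.
    """
    return sum(
        comb(n_total, k) * p_chance**k * (1.0 - p_chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Hypothetical example (NOT the study's data): 100 calls from 10 callers,
# of which 70 are attributed to the correct individual.
n_callers = 10
n_calls = 100
n_correct = 70

# Assumption: chance level is 1 / n_callers (equal prior for every caller).
p_chance = 1.0 / n_callers

p_value = binomial_tail(n_correct, n_calls, p_chance)
print(f"chance level = {p_chance:.2f}, one-sided p-value = {p_value:.3g}")
```

If the number of calls per individual is unbalanced, a chance level based on the relative frequency of the most common caller would be a more conservative choice than 1 / (number of callers).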