Soundscape classification with convolutional neural networks reveals temporal and geographic patterns in ecoacoustic data

Colin A. Quinn, Patrick Burns, Gurman Gill, Shrishail Baligar, Rose L. Snyder, Leonardo Salas, Scott J. Goetz, Matthew L. Clark

Research output: Contribution to journal › Article › peer-review

12 Scopus citations


Interest in ecoacoustics has resulted in an influx of acoustic data and novel methodologies to classify and relate landscape sound activity to biodiversity and ecosystem health. However, indicators used to summarize sound and quantify the effects of disturbances on biodiversity can be inconsistent when applied across ecological gradients. This study used an acoustic dataset of 487,148 min from 746 sites, collected over 4 years across Sonoma County, California, USA, by citizen scientists. We built a custom labeled dataset of soundscape components and applied a deep learning framework to test how well we could predict them: human noise (Anthropophony), wildlife vocalizations (Biophony), weather phenomena (Geophony), Quiet periods, and microphone Interference. Using these broad components let us balance capturing the variation in environmental recordings against the time needed to build a custom labeled dataset. We used these data to quantify soundscape patterns across space and time that could be useful for environmental planning, ecosystem conservation and restoration, and biodiversity monitoring. We describe a pre-trained convolutional neural network, fine-tuned with our sound reference data, that achieved an overall F0.75-score of 0.88, precision of 0.94, and recall of 0.80 across the five target soundscape components. We deployed the model to predict soundscape components for all acoustic data and assessed their hourly patterns. Biophony increased in the early morning and evening, coinciding with peak animal community vocalization (e.g., dawn chorus). Anthropophony increased during morning/daylight hours and was lowest in the evenings, coinciding with diurnal patterns in human activity. Further, we examined soundscape patterns related to geographic properties at recording sites. Anthropophony decreased with increasing distance from major roads, while Quiet increased. Biophony and Quiet were comparable to Anthropophony at more urban/developed and agriculture/barren sites, while Biophony and Quiet were significantly higher than Anthropophony at less-developed shrubland, oak woodland, and conifer forest sites. These results demonstrate that acoustic classification of broad soundscape components is possible with small datasets, and that the classifications can be applied to a large acoustic dataset to gain ecological knowledge.
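The F0.75-score quoted in the abstract is presumably the standard F-beta measure with beta = 0.75, which weights precision somewhat more heavily than recall. The sketch below is not the authors' evaluation code; the function name `f_beta` is hypothetical, and it assumes the overall score can be approximated directly from the overall precision and recall (the paper may instead aggregate per-class scores). Under that assumption, the reported precision and recall reproduce the reported score to two decimals.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 favors precision, beta > 1 favors recall, beta = 1 gives F1.
    """
    if precision == 0.0 and recall == 0.0:
        # Avoid division by zero when the classifier gets nothing right.
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract; beta = 0.75 is inferred from "F0.75".
score = f_beta(precision=0.94, recall=0.80, beta=0.75)
print(round(score, 2))  # 0.88
```

Choosing beta below 1 suits a deployment like this one, where a confidently wrong label applied across hundreds of thousands of minutes is costlier than a missed detection.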

Original language: English (US)
Article number: 108831
Journal: Ecological Indicators
State: Published - May 2022
Externally published: Yes

Keywords


  • Anthropophony
  • Biophony
  • Convolutional neural network (CNN)
  • Ecoacoustics
  • Machine learning
  • Naturally quiet landscapes
  • Soundscape ecology

ASJC Scopus subject areas

  • General Decision Sciences
  • Ecology, Evolution, Behavior and Systematics
  • Ecology

