Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales

by Jacob Andreas et al.

The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically and accurately recovering various aspects of syntax and semantics, including sentence structure and grounded word meaning, from large data collections. Recent research has shown the promise of such tools for analyzing acoustic communication in nonhuman species. We posit that machine learning will be the cornerstone of future collection, processing, and analysis of multimodal streams of data in animal communication studies, including bioacoustic, behavioral, biological, and environmental data. Cetaceans are unique nonhuman model species: they possess sophisticated acoustic communication but use a very different encoding system, one that evolved in an aquatic rather than a terrestrial medium. Sperm whales in particular, with their highly developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding, make an excellent starting point for advanced machine learning tools that can later be applied to other animals. This paper details a roadmap toward this goal based on currently existing technology and a multidisciplinary scientific community effort. We outline the key elements required for collecting and processing massive bioacoustic data from sperm whales, detecting their basic communication units and language-like higher-level structures, and validating these models through interactive playback experiments. The technological capabilities developed by such an undertaking are likely to yield cross-applications and advances in the broader communities investigating nonhuman communication and animal behavior.




