A dataset of real-world nonverbal vocalizations is teaching machines to understand the people that traditional speech technology left behind.
For non-speaking individuals — many of them on the autism spectrum — communication does not stop at the absence of words. A hum can mean contentment. A sharp vocalization can signal pain. A rhythmic sound can express excitement. These vocalizations carry rich communicative and emotional meaning, understood by familiar caregivers and family members but invisible to every speech recognition system ever built. Jaya Narain and Kristina Teresa Johnson set out to change that.
ReCANVo is not a laboratory dataset. The researchers captured 7,077 vocalizations in the places where communication actually happens: homes, schools, and community settings. Participants wore recording devices during their daily routines, and each vocalization was later labeled by a communication partner who knew the individual — someone who could distinguish a sound of frustration from a sound of delight with the fluency that years of relationship provide. This labeling approach is what makes ReCANVo unprecedented: it encodes a familiar listener's understanding of nonverbal communication as machine-readable ground truth.
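To make the idea of "machine-readable ground truth" concrete, here is a minimal sketch of what working with such labels might look like. The file layout, column names, and label set below are illustrative assumptions, not the actual ReCANVo schema: it imagines a CSV mapping each audio clip to the category a communication partner assigned, then tallies the distribution of labels.

```python
# Hypothetical sketch: communication-partner labels as ground truth.
# The CSV layout and label names are assumptions for illustration,
# not the actual ReCANVo schema.
import csv
import io
from collections import Counter

# Stand-in for a labels file: one row per vocalization clip,
# each labeled by a caregiver who knows the speaker.
labels_csv = io.StringIO(
    "clip_id,label\n"
    "p01_0001.wav,delighted\n"
    "p01_0002.wav,frustrated\n"
    "p01_0003.wav,delighted\n"
    "p02_0001.wav,request\n"
)

def label_distribution(label_file):
    """Count vocalizations per affective/communicative category."""
    reader = csv.DictReader(label_file)
    return Counter(row["label"] for row in reader)

dist = label_distribution(labels_csv)
print(dist.most_common())  # → [('delighted', 2), ('frustrated', 1), ('request', 1)]
```

A table like this is the supervised-learning starting point: paired audio and caregiver labels are exactly what a classifier needs to learn the mapping from sound to meaning.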
The dataset has been downloaded 1.7 million times, a number that speaks to the enormous unmet need in assistive technology. Traditional augmentative and alternative communication (AAC) devices require deliberate input — pressing buttons, selecting symbols. ReCANVo opens the door to passive systems that listen to natural vocalizations and translate them in real time, bridging the gap between what non-speaking individuals express and what the world around them can understand. For the families and caregivers who already speak this language, the technology would be a confirmation. For everyone else, it would be a revelation.
[Figure: Distribution of labeled vocalizations across affective and communicative categories]
dataset · 2021 · CC BY 3.0 US
ReCANVo enables a new class of AAC devices that listen passively to natural vocalizations rather than requiring deliberate input. For individuals who cannot operate traditional communication boards, this represents a fundamentally different — and more accessible — path to being understood.
The dataset challenges the assumption that non-speaking means non-communicating. By systematically cataloging the communicative richness of nonverbal vocalizations, ReCANVo provides empirical evidence that these sounds are structured, intentional, and meaningful — not random noise.
Caregivers and family members have always understood these vocalizations intuitively. Technology that validates and extends this understanding could reduce caregiver burden, improve response times in care settings, and give non-speaking individuals greater autonomy in everyday interactions.