Poster+Demo Session in Workshop: Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound Generation

Articulatory Synthesis of Speech and Diverse Vocal Sounds via Optimization

Luke Mo · Manuel Cherep · Nikhil Singh · Quinn Langford · Patricia Maes

Sat 14 Dec 10:30 a.m. PST — noon PST

Abstract:

Articulatory synthesis seeks to replicate the human voice by modeling the physics of the vocal apparatus, offering interpretable and controllable speech production. However, such methods often require careful hand-tuning to invert acoustic signals to their articulatory parameters. We present VocalTrax, a method that performs this inversion automatically by optimizing an accelerated implementation of a vocal tract model. Experiments on diverse vocal datasets show significant improvements over existing methods in out-of-domain speech reconstruction, while also revealing persistent challenges in matching natural voice quality.
