Poster in Affinity Workshop: Black in AI Workshop
Assisted Learning of New Languages via User Defined and Phoneme Parameterized Pronunciations
Siddha Ganju · Steven Dalton
The ability to notice mispronunciations is a key skill for second language learners. Unfortunately, it is usually difficult for learners to obtain consistent feedback on their current level of speaking and listening skill. This issue is exacerbated by the fact that standard systems typically accept only a single correct pronunciation, whereas humans can understand a wider range of pronunciations, including mispronunciations. Current ASR systems can recognize speech almost perfectly, but they often fail when the speaker is not a native speaker of the language, i.e., when their pronunciation is imperfect, and, more importantly, when a mispronunciation is detected they provide no feedback on how to move from it toward a more understandable pronunciation. We propose an approach to detecting mispronunciations using a Siamese network trained to recognize not a single correct pronunciation but a user-defined range of pronunciations. The user controls the range of tolerance within which a word is considered understandable, defined in terms of phoneme pronunciations, and can adjust this tolerance as they interact with the system to match their current needs.
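To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how a Siamese encoder with a user-adjustable tolerance might look in PyTorch. The encoder architecture, feature dimensions, contrastive-loss margin, and helper names such as `PronunciationEncoder` and `is_acceptable` are illustrative assumptions; the key point is that a learner's attempt is embedded, compared to an accepted pronunciation, and judged acceptable only if the distance falls within a tolerance the user can change.

```python
# Illustrative sketch, assuming phoneme/acoustic frame features as input;
# not the authors' code. A shared encoder embeds two pronunciations and a
# user-defined tolerance decides whether the attempt counts as understandable.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PronunciationEncoder(nn.Module):
    """Shared branch of the Siamese network: phoneme features -> embedding."""

    def __init__(self, n_features: int = 40, hidden: int = 128, embed_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features) per-frame features of one utterance
        _, h = self.rnn(x)                       # h: (1, batch, hidden)
        return F.normalize(self.proj(h[-1]), dim=-1)


def contrastive_loss(z1, z2, same_word, margin: float = 1.0):
    """Pull matching pronunciations together, push mismatches apart."""
    dist = F.pairwise_distance(z1, z2)
    loss_same = same_word * dist.pow(2)
    loss_diff = (1 - same_word) * F.relu(margin - dist).pow(2)
    return (loss_same + loss_diff).mean()


def is_acceptable(encoder, attempt, reference, tolerance: float) -> torch.Tensor:
    """User-adjustable check: attempt is within `tolerance` of the reference."""
    with torch.no_grad():
        d = F.pairwise_distance(encoder(attempt), encoder(reference))
    return d <= tolerance


if __name__ == "__main__":
    enc = PronunciationEncoder()
    attempt = torch.randn(1, 50, 40)    # learner's utterance (dummy features)
    reference = torch.randn(1, 50, 40)  # accepted pronunciation (dummy features)
    print(is_acceptable(enc, attempt, reference, tolerance=0.8))
```

In this sketch, tightening or loosening `tolerance` plays the role of the user-controlled range of acceptable pronunciations described in the abstract: a beginner might start with a generous tolerance and narrow it as their pronunciation improves.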