Accent conversion

Learners of a second language practice their pronunciation by listening to and imitating utterances from native speakers. Recent research has shown that choosing a well-matched native speaker to imitate can have a positive impact on pronunciation training. Towards this goal, we are developing speech-modification techniques that can generate utterances with the vocal properties of the learner and the accent of a native speaker. This is accomplished by altering both prosodic and segmental characteristics of speech.

Our articulatory-based accent conversion is a two-step process. In the first step, we build an articulatory synthesizer for the nonnative learner. In the second step, we drive that synthesizer with articulatory gestures recorded from a native speaker.
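The two steps can be illustrated with a toy sketch. It is not the method used in the publications below, which rely on statistical, neural, or concatenative synthesizers. As a stand-in, a nearest-neighbor lookup maps articulatory frames to acoustic frames. The class name `NearestNeighborSynthesizer`, the feature dimensions, and all data values are hypothetical, and the sketch assumes the native speaker's gestures have already been normalized to the learner's articulatory space.

```python
import math


class NearestNeighborSynthesizer:
    """Toy articulatory synthesizer: a lookup table from articulatory to acoustic frames.

    Hypothetical stand-in for a trained articulatory-to-acoustic model.
    """

    def __init__(self):
        self._pairs = []

    def fit(self, articulatory_frames, acoustic_frames):
        # Step 1: store the learner's own (articulatory, acoustic) frame pairs.
        if len(articulatory_frames) != len(acoustic_frames):
            raise ValueError("articulatory and acoustic frames must align one-to-one")
        self._pairs = list(zip(articulatory_frames, acoustic_frames))
        return self

    def synthesize(self, articulatory_frames):
        # Step 2: for each incoming gesture, emit the learner's acoustic frame
        # whose articulatory configuration is closest in Euclidean distance.
        if not self._pairs:
            raise RuntimeError("call fit() before synthesize()")
        output = []
        for query in articulatory_frames:
            _, acoustic = min(self._pairs, key=lambda pair: math.dist(query, pair[0]))
            output.append(acoustic)
        return output


# Made-up learner data: 2-D articulatory positions and 1-D acoustic features.
learner_articulatory = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
learner_acoustic = [[10.0], [20.0], [30.0]]
synthesizer = NearestNeighborSynthesizer().fit(learner_articulatory, learner_acoustic)

# Made-up native-speaker gestures (assumed already normalized to the learner's space).
native_articulatory = [(0.9, 0.1), (0.1, 0.8)]
converted = synthesizer.synthesize(native_articulatory)
print(converted)  # [[20.0], [30.0]]
```

Because every output frame comes from the learner's own recordings, the result keeps the learner's vocal properties, while the sequence of gestures comes from the native speaker.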

We have also developed an accent conversion method that relies exclusively on acoustic information. The technique is based on the standard voice conversion model but pairs source and target frames differently. In conventional voice conversion, the source-target mapping is trained on time-aligned source and target spectral vectors from parallel utterances. In our approach, the mapping is trained on pairs selected by their acoustic similarity after vocal-tract-length normalization (VTLN).

Figure: (a) Conventional approach to voice conversion: source and target frames are paired based on their ordering in a forced-aligned parallel corpus. (b) Our approach to accent conversion: source and target frames are paired based on their acoustic similarity following vocal-tract-length normalization (VTLN). MCD: mel-cepstral distortion.
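The difference between the two pairing strategies can be sketched in a few lines. This is a minimal illustration, not the published implementation. It assumes each frame is a mel-cepstral vector whose first coefficient is energy, that the frames passed to `pair_by_similarity` have already been VTLN-normalized, and that similarity is measured with the standard mel-cepstral distortion formula. The function names and the toy vectors are hypothetical.

```python
import math


def mel_cepstral_distortion(c1, c2):
    """MCD in dB between two mel-cepstral vectors, excluding coefficient 0 (energy)."""
    squared_diff = sum((a - b) ** 2 for a, b in zip(c1[1:], c2[1:]))
    return (10.0 / math.log(10)) * math.sqrt(2.0 * squared_diff)


def pair_by_order(source_frames, target_frames):
    """Conventional pairing: frame i with frame i of a time-aligned parallel utterance.

    If the sequences differ in length, the extra frames are dropped.
    """
    return list(zip(range(len(source_frames)), range(len(target_frames))))


def pair_by_similarity(source_frames, target_frames):
    """Similarity pairing: each source frame with its lowest-MCD target frame.

    Both sequences are assumed to be VTLN-normalized beforehand.
    """
    pairs = []
    for i, src in enumerate(source_frames):
        j = min(
            range(len(target_frames)),
            key=lambda k: mel_cepstral_distortion(src, target_frames[k]),
        )
        pairs.append((i, j))
    return pairs


# Made-up 3-coefficient frames; index 0 is the energy term and is ignored by MCD.
source = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [[0.0, 0.1, 0.9], [0.0, 0.9, 0.1]]

print(pair_by_order(source, target))       # [(0, 0), (1, 1)]
print(pair_by_similarity(source, target))  # [(0, 1), (1, 0)]
```

In this toy case the two strategies disagree: ordering pairs each source frame with the target frame at the same time index, whereas similarity pairs it with the target frame that sounds closest, regardless of position.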

2018

Levis, J; Chukharev-Hudilainen, E; Gutierrez-Osuna, R; Lucic, I; Silpachai, A; Sonsaat, S

Golden Speaker: Learner Experience with Computer-assisted Pronunciation Practice Inproceedings Forthcoming

Proc. Pronunciation in Second Language Learning and Teaching Conference, Forthcoming.

Ding, S; Liberatore, C; Gutierrez-Osuna, R

Learning Structured Dictionaries for Exemplar-based Voice Conversion Inproceedings

Proc. Interspeech, 2018.

Zhao, G; Sonsaat, S; Silpachai, A; Lucic, I; Chukharev-Hudilainen, E; Levis, J; Gutierrez-Osuna, R

L2-ARCTIC: A Non-Native English Speech Corpus Inproceedings Forthcoming

Proc. Interspeech, Forthcoming.

Ding, S; Zhao, G; Liberatore, C; Gutierrez-Osuna, R

Improving Sparse Representations in Exemplar-Based Voice Conversion with a Phoneme-Selective Objective Function Inproceedings

Proc. Interspeech, 2018.

Zhao, G; Sonsaat, S; Levis, J; Chukharev-Hudilainen, E; Gutierrez-Osuna, R

Accent conversion using phonetic posteriorgrams Inproceedings

Proc. ICASSP, 2018.

Liberatore, C; Zhao, G; Gutierrez-Osuna, R

Voice Conversion through Residual Warping in a Sparse, Anchor-Based Representation of Speech Inproceedings

Proc. ICASSP, 2018.

2016

Aryal, S; Gutierrez-Osuna, R

Comparing Articulatory and Acoustic Strategies for Reducing Non-Native Accents Inproceedings

Proc. Interspeech, 2016.

Aryal, S; Gutierrez-Osuna, R

Data driven articulatory synthesis with deep neural networks Journal Article

Computer Speech and Language, 36 , pp. 260-273, 2016.

2015

Aryal, S; Gutierrez-Osuna, R

Articulatory-based conversion of foreign accents with deep neural networks Inproceedings

Proc. Interspeech, pp. 3385-3389, 2015.

Liberatore, C; Aryal, S; Wang, Z; Polsley, S; Gutierrez-Osuna, R

SABR: Sparse, Anchor-Based Representation of the Speech Signal Inproceedings

Proc. Interspeech 2015, pp. 608-612, 2015.

Liberatore, C; Gutierrez-Osuna, R

Joint Optimization of Anatomical and Gestural Parameters in a Physical Vocal Tract Model Inproceedings

ICASSP, IEEE 2015.

Aryal, S; Gutierrez-Osuna, R

Reduction of non-native accents through statistical parametric articulatory synthesis Journal Article

Journal of the Acoustical Society of America, 137 (1), pp. 433-446, 2015.

2014

Aryal, S; Gutierrez-Osuna, R

Accent conversion through cross-speaker articulatory synthesis Inproceedings

Proc. 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7744-7748, 2014.

Aryal, S; Gutierrez-Osuna, R

Can voice conversion be used to reduce non-native accents Inproceedings

Proc. 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7929-7933, 2014.

Felps, D; Aryal, S; Gutierrez-Osuna, R

Normalization of articulatory data through Procrustes transformations and analysis-by-synthesis Inproceedings

Proc. 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 3051-3055, 2014.

2013

Aryal, S; Felps, D; Gutierrez-Osuna, R

Foreign Accent Conversion through Voice Morphing Inproceedings

Interspeech, pp. 3077-3081, 2013.

2012

Felps, D; Geng, C; Gutierrez-Osuna, R

Foreign accent conversion through concatenative synthesis in the articulatory domain Journal Article

IEEE Transactions on Audio, Speech and Language Processing, 2012.

2010

Gutierrez-Osuna, R; Felps, D

Foreign Accent Conversion through Voice Morphing Technical Report

2010.

Felps, D; Gutierrez-Osuna, R

Developing objective measures of foreign-accent conversion Journal Article

Audio, Speech, and Language Processing, IEEE Transactions on, 18 (5), pp. 1030–1040, 2010.

Felps, D; Geng, C; Berger, M; Richmond, K; Gutierrez-Osuna, R

Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database Conference

Interspeech, 2010.

2009

Felps, D; Bortfeld, H; Gutierrez-Osuna, R

Foreign accent conversion in computer assisted pronunciation training Journal Article

Speech communication, 51 (10), pp. 920–932, 2009.

2008

Felps, D; Bortfeld, H; Gutierrez-Osuna, R

Prosodic and segmental factors in foreign-accent conversion Technical Report

2008.
