Preparation of sound base for a text-to-speech synthesis system
29 April 2005
Vladimir M. Degtyarev, Mikhail N. Gusev
Proceedings Volume 5831, Eighth International Workshop on Nondestructive Testing and Computer Simulations in Science and Engineering; (2005) https://doi.org/10.1117/12.619703
Event: Eighth International Workshop on Nondestructive Testing and Computer Simulations in Science and Engineering, 2004, St. Petersburg, Russian Federation
Abstract
In this report we give several recommendations for choosing the parameters of sound fragments. These fragments are the components of the sound base used in a Russian text-to-speech synthesis system. It is no secret that the quality of concatenative synthesis is largely determined at the stage of choosing a speaker and preparing the base of that speaker's voice samples. The recommendations are derived from a statistical analysis of a large number of texts of various types, and they concern both individual sound fragments and groups of fragments. The sound parameters were obtained with an automatic linguistic processor that includes phonetic and prosodic transcribers. The duration, intensity, and fundamental (pitch) frequency of sounds in various contexts and intonational contours were analyzed. A sound base produced according to these recommendations improves the intelligibility and naturalness of synthetic speech by minimizing the modifications applied to the speaker's voice samples.
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vladimir M. Degtyarev and Mikhail N. Gusev "Preparation of sound base for a text-to-speech synthesis system", Proc. SPIE 5831, Eighth International Workshop on Nondestructive Testing and Computer Simulations in Science and Engineering, (29 April 2005); https://doi.org/10.1117/12.619703
KEYWORDS
Radon, Explosives, Acoustics, Content addressable memory, Statistical analysis, Astatine, Computer graphics