TTS Synthesis with the Autoregressive (LPC) model
Together with three other TTS systems based on the same diphone database, the autoregressive (LPC) concatenative synthesizer demonstrated here was used for a general comparison of speech models in the context of TTS synthesis, in:
"High Quality Text-To-Speech Synthesis: A Comparison of Four Candidate Algorithms", T. Dutoit, Proc. ICASSP'94, Adelaide, Australia, 19-22 April 1994, vol. 1, pp. 565-568. (PostScript file of a draft version: 36 Kb).
Its general organization and parameter settings are detailed in this paper.
Demo files (16 kHz/16 bits - SUN AU format)
This French LPC TTS synthesizer is based on Schur's analysis algorithm with order 18.
IMPORTANT: To test the segmental quality of this concatenation-based synthesizer independently of suprasegmental effects, we have provided it with prosodic information stylized directly from a natural pronunciation of the text.
For example, "bonjour.raw" was obtained from the following input file (bonjour.pho):
_ 51 25 114
on 127 48 170
j 110 53 116
r 150 50 91
Each line contains a phoneme name, a duration (in ms), and a series (possibly empty) of pitch pattern points. Each point consists of two integers: its position within the phoneme (in % of the phoneme's total duration) and the pitch value (in Hz) at that position. Hence, the first line of bonjour.pho:
_ 51 25 114
tells the synthesizer to produce a silence of 51 ms and to place a pitch pattern point of 114 Hz at 25% of 51 ms, that is, 12.75 ms into the silence. Pitch pattern points define a piecewise linear pitch curve.
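The input format described above can be illustrated with a minimal Python sketch. It is not the synthesizer's own code: the names Phoneme, parse_pho_line, absolute_pitch_points and pitch_at are hypothetical, and holding the pitch constant before the first point and after the last one is an assumption, since the page does not say how the curve is extended there.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phoneme:
    name: str
    duration_ms: int
    # (position in % of the phoneme duration, pitch in Hz)
    pitch_points: List[Tuple[int, int]]


def parse_pho_line(line: str) -> Phoneme:
    """Parse one line: name, duration, then zero or more (position, pitch) pairs."""
    fields = line.split()
    # name + duration + an even number of pitch values gives an even field count
    if len(fields) < 2 or len(fields) % 2 != 0:
        raise ValueError(f"malformed line: {line!r}")
    values = [int(v) for v in fields[2:]]
    points = list(zip(values[0::2], values[1::2]))
    return Phoneme(fields[0], int(fields[1]), points)


def absolute_pitch_points(phonemes: List[Phoneme]) -> List[Tuple[float, int]]:
    """Convert per-phoneme percentages into (time in ms, pitch in Hz) pairs."""
    points = []
    start_ms = 0.0
    for p in phonemes:
        for percent, hz in p.pitch_points:
            points.append((start_ms + p.duration_ms * percent / 100.0, hz))
        start_ms += p.duration_ms
    return points


def pitch_at(points: List[Tuple[float, int]], t_ms: float) -> float:
    """Evaluate the piecewise linear pitch curve at time t_ms."""
    if not points:
        raise ValueError("no pitch pattern points")
    # Assumption: the curve is held constant outside the defined points.
    if t_ms <= points[0][0]:
        return float(points[0][1])
    if t_ms >= points[-1][0]:
        return float(points[-1][1])
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t_ms <= t1:
            if t1 == t0:
                return float(f1)
            return f0 + (f1 - f0) * (t_ms - t0) / (t1 - t0)
    raise ValueError("pitch points are not sorted by time")


# The four lines shown above for bonjour.pho
lines = ["_ 51 25 114", "on 127 48 170", "j 110 53 116", "r 150 50 91"]
phonemes = [parse_pho_line(l) for l in lines]
curve = absolute_pitch_points(phonemes)
```

Here curve holds one (time, pitch) pair per pitch pattern point, with times measured from the start of the utterance, and pitch_at(curve, t) interpolates linearly between neighbouring points.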
Last updated December 17, 1999. Send comments to email@example.com