The NU-MBROLA Project

The NUMBROLA Project was started in September 2001 as a Non-uniform Unit MBROLA concatenator project. It is a re-implementation of the MBROLA algorithm for concatenating units of various lengths without phonetic definitions. Below, you will find synthesis examples for this conventional NUMBROLA concatenator. Because the resulting synthetic speech still had a diphone-like quality, further research was conducted and produced better results (listen to examples here). In December 2002, the project was redefined as a larger research project that includes voice quality modifications (The NUMBROLA-VQUAL Project). The resulting binaries will be shared on this web page at the end of the project; at the moment, only the first version is available.

GETTING NUMBROLA BINARY

CONVENTIONAL NU-MBROLA synthesis demos

A comparison between NU-MBROLA, Diphone-based (MBROLA) and copy-paste synthesis.

Speech to speech examples (LeSoir97-French Male)

The speech-to-speech synthesis examples are obtained by treating a complete sentence as a single segment to be processed by NUMBROLA. The resulting synthetic speech shows the relatively small degradation introduced by the underlying MBROLA synthesis when prosody is not modified, or when overall prosodic modifications are applied (as opposed to dynamic prosody modifications).
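To make the idea of an overall prosodic modification concrete, here is a minimal sketch in Python. It assumes that the values 0.6 and 1.5 used in the demos below are multiplicative factors applied uniformly to the pitch (F0) contour; the page does not state this explicitly. The function name `scale_pitch` and the sample contour are hypothetical and are not part of NUMBROLA.

```python
def scale_pitch(f0_hz, factor):
    """Apply one global factor to every voiced frame of an F0 contour.

    Assumption: unvoiced frames are marked with 0.0 and are left untouched.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [f * factor if f > 0 else 0.0 for f in f0_hz]


# Hypothetical F0 contour in Hz (0.0 = unvoiced frame).
contour = [0.0, 110.0, 120.0, 115.0, 0.0]

low = scale_pitch(contour, 0.6)   # whole sentence lowered by the same ratio
high = scale_pitch(contour, 1.5)  # whole sentence raised by the same ratio
```

Because every voiced frame is multiplied by the same factor, the shape of the intonation contour is preserved. A dynamic modification would instead impose a different target value on each unit.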

Original speech file / diphone-based synthesis (MBROLA) / copy synthesis (NU-MBROLA) / copy synthesis (NU-MBROLA) with overall prosody modification.

Original    Diphone-based   NU-MBROLA       NU-MBROLA low pitch (0.6)   NU-MBROLA high pitch (1.5)
lnb01.wav   lnb01.pho.wav   lnb01.nuu.wav   lnb01.low.wav               lnb01.high.wav
lnb02.wav   lnb02.pho.wav   lnb02.nuu.wav   lnb02.low.wav               lnb02.high.wav
lnb03.wav   lnb03.pho.wav   lnb03.nuu.wav   lnb03.low.wav               lnb03.high.wav
lnb04.wav   lnb04.pho.wav   lnb04.nuu.wav   lnb04.low.wav               lnb04.high.wav
lnb05.wav   lnb05.pho.wav   lnb05.nuu.wav   lnb05.low.wav               lnb05.high.wav

Non-uniform unit selection examples (LeSoir97-French Male)

These are examples of speech synthesis based on the concatenation of non-uniform units. Minimal units are half-phonemes taken from the LeSoir97 corpus. Selection is based on the joint minimization of a target cost (taking phonetic and prosodic context into account) and a concatenation cost (taking only pitch mismatches into account). Copy-paste synthesis is obtained by raw concatenation of units. NU-MBROLA synthesis applies pitch and duration modifications to the units (applying the target prosody computed by NLP), as well as smoothing in the time domain.
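The joint minimization described above is commonly solved with a Viterbi-style dynamic programming search. The sketch below illustrates that general technique; it is not the project's actual implementation. The unit fields (`id`, `f0_start`, `f0_end`), the two cost functions, and the sample data are all hypothetical. In particular, the target cost here looks only at pitch, whereas the real one also takes phonetic context into account.

```python
def select_units(targets, candidates, target_cost, concat_cost):
    """Return (best unit sequence, total cost) by dynamic programming.

    targets:    one target specification per position
    candidates: for each position, a non-empty list of candidate units
    """
    if not targets or len(targets) != len(candidates):
        raise ValueError("need one candidate list per target")
    if any(not c for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[i][j] = (cumulative cost, index of best predecessor) for
    # candidate j at position i.
    best = [[(target_cost(targets[0], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            # Cheapest way to arrive at u from any candidate at i - 1.
            cost, prev = min(
                (best[i - 1][k][0] + concat_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(targets[i], u), prev))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


def target_cost(target, unit):
    # Simplified assumption: distance between target F0 and the unit's mean F0.
    return abs(target["f0"] - (unit["f0_start"] + unit["f0_end"]) / 2)


def concat_cost(left, right):
    # Pitch mismatch at the join, the only term the page mentions.
    return abs(left["f0_end"] - right["f0_start"])


# Hypothetical data: two target positions, two candidate units each.
targets = [{"f0": 120.0}, {"f0": 130.0}]
candidates = [
    [{"id": "a1", "f0_start": 118.0, "f0_end": 122.0},
     {"id": "a2", "f0_start": 100.0, "f0_end": 105.0}],
    [{"id": "b1", "f0_start": 140.0, "f0_end": 135.0},
     {"id": "b2", "f0_start": 123.0, "f0_end": 131.0}],
]

path, total = select_units(targets, candidates, target_cost, concat_cost)
print([u["id"] for u in path], total)  # ['a1', 'b2'] 4.0
```

In this toy example, `b2` wins over `b1` mainly because it joins to `a1` with a pitch mismatch of only 1 Hz, which shows how the concatenation cost steers the search toward smooth joins.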

Copy-paste-ola synthesis / diphone-based synthesis (MBROLA) / corpus based synthesis (NU-MBROLA) / corpus based synthesis (NU-MBROLA) with overall prosody modification.

Copy-paste-ola   NU-MBROLA        NU-MBROLA low pitch (0.6)   NU-MBROLA high pitch (1.5)
lnb01e.cpy.wav   lnb01e.nuu.wav   lnb01e.low.wav              lnb01e.high.wav
lnb02e.cpy.wav   lnb02e.nuu.wav   lnb02e.low.wav              lnb02e.high.wav
lnb03e.cpy.wav   lnb03e.nuu.wav   lnb03e.low.wav              lnb03e.high.wav
lnb04e.cpy.wav   lnb04e.nuu.wav   lnb04e.low.wav              lnb04e.high.wav
lnb05e.cpy.wav   lnb05e.nuu.wav   lnb05e.low.wav              lnb05e.high.wav

Speech to speech examples (Marie-French Female: Courtesy of Babel Technologies)

Original speech file / copy synthesis (NU-MBROLA) / copy synthesis (NU-MBROLA) with overall prosody modification.

Original   NU-MBROLA    NU-MBROLA low pitch (0.6)   NU-MBROLA high pitch (1.5)
1a.wav     1a.nuu.wav   1a.low.wav                  1a.high.wav
1b.wav     1b.nuu.wav   1b.low.wav                  1b.high.wav
1c.wav     1c.nuu.wav   1c.low.wav                  1c.high.wav
1d.wav     1d.nuu.wav   1d.low.wav                  1d.high.wav

Non-uniform unit selection examples (Marie-French Female: Courtesy of Babel Technologies)

Copy-paste-ola synthesis / corpus based synthesis (NU-MBROLA) / corpus based synthesis (NU-MBROLA) with overall prosody modification.

Copy-paste-ola   NU-MBROLA       NU-MBROLA low pitch (0.6)   NU-MBROLA high pitch (1.5)
sent1.cpy.wav    sent1.nuu.wav   sent1.low.wav               sent1.high.wav
sent2.cpy.wav    sent2.nuu.wav   sent2.low.wav               sent2.high.wav
sent3.cpy.wav    sent3.nuu.wav   sent3.low.wav               sent3.high.wav
sent4.cpy.wav    sent4.nuu.wav   sent4.low.wav               sent4.high.wav

Last updated July 11, 2001. Send comments to dutoit@tcts.fpms.ac.be