Ongoing R&D projects on TTS synthesis
The goal of the MBROLA project is to obtain a set of high-quality speech synthesizers for as many languages as possible, free for use in non-commercial applications. The ultimate goal is to boost academic research on speech synthesis, and particularly on prosody generation, known as one of the biggest challenges in Text-to-Speech Synthesis for the years to come. Brazilian Portuguese, Breton, British English, Dutch, French, German, Romanian, Spanish, and Swedish are already available as full software multilingual speech synthesizers (i.e., the DSP part of a TTS system). Many other languages are in preparation.
For years, uncoordinated research efforts on the design of text-to-speech (TTS) systems have led to unavoidable cross-system and cross-language incompatibility. In contrast, the EULER project aims at producing a unified, extensible, and publicly available research, development and production environment for multilingual TTS synthesis. This project follows in the footsteps of recent projects such as:
- MBROLA, a state-of-the-art, multilingual speech synthesis technique
- MULTEXT, multilingual texts tools and linguistic databases
- FESTIVAL, a generic, modular, portable and extensible TTS system
MBROLIGN is a fast MBROLA-based text-to-speech aligner, and is provided free for use in non-commercial applications. The ultimate goal of this project is to create large phonetically and prosodically labeled speech databases for as many languages as possible, thereby drastically expanding the reach of speech technology. MBROLIGN is also targeted at ultra-low bit-rate speech transmission.
MBRDICO is a talking dictionary using MBROLA as a back-end speech synthesizer. Text processing is performed using a complete GNU GPL package for automatic phonetization training (letter/phoneme alignment, decision tree building, stress assignment) and duration/intonation generation. French, US English, and Arabic are available. This work is the result of a collaboration between:
- Faculté Polytechnique de Mons
- Carnegie Mellon University
- University of Edinburgh
Contributors are welcome to enlarge this set.
The W project aims at creating a fast computer keyboard driver for people with speech disabilities. The related software is based on the grade II Braille languages developed by associations of blind people all over the world, and it minimizes the number of keystrokes needed to utter a word (the name of the project is the grade II abbreviation for "word" in English). In combination with the MBROLA project, W could put words in the mouth of thousands of users, in tens of languages.
(This project benefits from the kind collaboration of O. Platteau)
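The keystroke-saving idea behind W can be illustrated with a minimal sketch. It assumes a simple lookup table from typed abbreviations to full words; only the "w" entry comes from the description above, and the other entries, the table name, and the function names are hypothetical and do not reproduce the actual W driver or a complete grade II table.

```python
# Illustrative abbreviation table. Only "w" -> "word" is taken from the
# project description; the other entries are placeholders for this sketch.
ABBREVIATIONS = {
    "w": "word",
    "k": "knowledge",
    "p": "people",
}


def expand(typed, table=ABBREVIATIONS):
    """Expand each whitespace-separated token that is a known abbreviation."""
    return " ".join(table.get(token, token) for token in typed.split())


def keystrokes_saved(typed, table=ABBREVIATIONS):
    """Characters in the expanded text minus characters actually typed."""
    return len(expand(typed, table)) - len(typed)


if __name__ == "__main__":
    text = "p like a w"
    print(expand(text))            # text that would be sent to the synthesizer
    print(keystrokes_saved(text))  # number of keystrokes avoided
```

In a real driver the expanded text would be handed to a synthesizer such as MBROLA; here it is only printed.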
LIPSS was the first TTS system built in the lab. It has recently been turned into a special instance of the EULER TTS system. Part-of-speech tagging and prosodic phrasing are based on a rule-based syntactic-prosodic parser, Vertex.
LIPSS is now being assessed and compared to other systems in the framework of the Action de Recherche Concertée "Linguistique informatique et corpus oraux : Synthèse vocale" organized by AUPELF-UREF, in collaboration with:
- Department of Linguistics, K-U-Leuven, B
- LAIP, Université de Lausanne, CH
- Département de linguistique, Université de Genève, CH
- INRS - Telecom, Quebec, CA
- Institut de la Communication parlée, Grenoble, FR
- LIMSI, Paris, FR
The IWERF project aims at creating an information server integrating fax and vocal technologies. Users call a telephone-based server and ask for information by pronouncing a sequence of keywords. A phoneme-based, speaker-independent speech recognition engine decodes keywords, echoes them via a speech synthesizer (MBROLA) and gives a vocal summary of the information it can provide for the list of keywords provided. Users select documents, which are automatically faxed to them.
OOBP, which stands for Object Oriented Block Programming, is a programming paradigm developed at TCTS Lab since 1994. It is defined as Object Oriented Programming around processes and combines OOP and block descriptions. Plug and Play Software extends OOBP by defining input and output data as abstract streams. This research is done in collaboration with AT&T Labs - Research.
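As a rough illustration of programming around processing blocks connected by abstract streams, here is a minimal sketch. It is not the OOBP or Plug and Play Software API: the class names `Block`, `Gain`, `MovingAverage`, and `Chain` are hypothetical, and Python iterables stand in for the abstract input and output streams.

```python
from collections import deque
from typing import Iterable, Iterator


class Block:
    """A processing block: consumes an input stream, yields an output stream."""

    def process(self, stream: Iterable) -> Iterator:
        raise NotImplementedError


class Gain(Block):
    """Multiplies every sample by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def process(self, stream):
        for sample in stream:
            yield sample * self.factor


class MovingAverage(Block):
    """Averages the last `size` samples seen so far."""

    def __init__(self, size):
        self.size = size

    def process(self, stream):
        window = deque(maxlen=self.size)
        for sample in stream:
            window.append(sample)
            yield sum(window) / len(window)


class Chain(Block):
    """Composite block: the output stream of each block feeds the next one."""

    def __init__(self, *blocks):
        self.blocks = blocks

    def process(self, stream):
        for block in self.blocks:
            stream = block.process(stream)
        return iter(stream)


if __name__ == "__main__":
    pipeline = Chain(Gain(2), MovingAverage(2))
    print(list(pipeline.process([1, 2, 3])))
```

Because each block only sees an iterable, the same chain works unchanged on a list, a file reader, or a live sample generator, which is the property the stream abstraction is meant to provide.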
The goal of this program is to transcribe a symbolic input, i.e. a string of symbols belonging to some alphabet, into a symbolic output according to a regular grammar described in terms of a system of rewriting rules. "Symbols" and "alphabet" have to be understood here as generic terms: they can be characters, phonemes, syllables, words, phrases, etc.
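The following is a minimal sketch of context-sensitive rewriting over generic symbols, under stated assumptions: each rule has the form "target becomes output when preceded by left and followed by right", the input is scanned left to right, the first matching rule in the list wins, and contexts are checked against the input rather than the output. The `Rule` type, the `rewrite` function, and the toy rules are hypothetical and do not reproduce the program's actual rule syntax.

```python
from typing import NamedTuple, Sequence, Tuple


class Rule(NamedTuple):
    left: Tuple      # required left context (empty tuple = any)
    target: Tuple    # symbols to be rewritten (must be non-empty)
    right: Tuple     # required right context (empty tuple = any)
    output: Tuple    # replacement symbols


def rewrite(symbols: Sequence, rules: Sequence[Rule]) -> list:
    """Apply the first matching rule at each position, scanning left to right."""
    symbols = tuple(symbols)
    out = []
    i = 0
    while i < len(symbols):
        for rule in rules:
            if not rule.target:
                continue  # skip degenerate rules that would never advance
            end = i + len(rule.target)
            if (symbols[i:end] == rule.target
                    and symbols[max(0, i - len(rule.left)):i] == rule.left
                    and symbols[end:end + len(rule.right)] == rule.right):
                out.extend(rule.output)
                i = end
                break
        else:
            out.append(symbols[i])  # no rule applies: copy the symbol through
            i += 1
    return out


if __name__ == "__main__":
    # Toy letter-to-sound rules over characters (illustrative only).
    letter_rules = [
        Rule((), ("p", "h"), (), ("f",)),
        Rule((), ("c",), ("e",), ("s",)),
        Rule((), ("c",), (), ("k",)),
    ]
    print(rewrite("place", letter_rules))
    # The same engine over word-level symbols.
    word_rules = [Rule((), ("Dr.",), (), ("doctor",))]
    print(rewrite(["Dr.", "Smith"], word_rules))
```

Since the engine only compares tuples of symbols, the same function handles characters, phonemes, or words without modification.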
Last updated December 17, 1999. Send comments to firstname.lastname@example.org