The EULER Project
"It would be a considerable invention indeed, that of
a machine able to mimic speech, with its sounds and articulations.
I think it is not impossible."


Leonhard Euler (1761)

NEW !!! Euler 2.0 beta ...

EULER is a collaborative R&D project set up by the speech synthesis research group of the Faculté Polytechnique de Mons. Its objective is to provide a freely available, easy-to-use, easy-to-extend, generic multilingual text-to-speech (TTS) system for Windows 95/NT, Mac OS, and UNIX, which will progressively integrate the results of existing multilingual language and speech processing projects.

Why do we need EULER?

Private and public research laboratories (from universities to telecommunication operators) have invested considerable resources in trying to design multilingual synthesizers. In most cases, however, this uncoordinated research effort has led to unavoidable cross-system incompatibility: for lack of unified, extensible, widely accepted, and publicly available tools and databases for TTS system development, each and every synthesizer is a laboratory-specific implementation of very similar basic principles. This, in turn, has resulted in cross-language incompatibility: most multilingual TTS systems are merely collections of monolingual ones, independently developed in native research labs. Not only has this situation had a negative impact on the extensibility of available TTS systems to new languages, dialects, accents, voices, and speaking styles, but it has also hampered their integration into real products (by SMEs or large telecom enterprises): instead of incrementally refining a common, general-purpose synthesizer and providing it with high-quality interfaces for real-world applications (for handling complex text documents, for instance), research labs waste time reinventing the wheel. Last but not least, the lack of a common backbone for TTS systems has made it very difficult to compare their quality on a module-by-module basis, thereby strongly restraining the spread of improvements.

In contrast with this situation, state-of-the-art tools and databases for multilingual TTS system development have recently and independently been made available by several public research labs in Europe, for instance:

  • The Faculté Polytechnique de Mons (FPMs), BE, has recently taken an important step towards developing high-quality, multilingual phonetics-to-speech synthesizers, in the form of the MBROLA project. The aim of this internet project is to foster international collaborations so as to obtain a set of MBROLA speech synthesizers for as many languages (including dialects) and voices as possible, free for use in non-commercial applications. Many languages are already available.
  • The University of Edinburgh (UED), GB, has recently made a major contribution to the development of freely available, high-quality TTS synthesis for non-commercial purposes. Their freely available FESTIVAL Speech Synthesis system is nothing less than a generic, modular, portable, and extensible TTS system which lays the foundations of truly global TTS research and development. FESTIVAL has clearly been designed from a multilingual perspective.
  • The Université de Provence (UP), FR, is coordinator of the MULTEXT series of projects (LRE-MULTEXT, MULTEXT-EAST, MULTEXT-SW, MULTEXT-CATALOC, ALAF, etc.), the aim of which is to develop tools, corpora, and linguistic resources for a wide variety of languages. All these tools and resources are made freely and publicly available for non-commercial, non-military purposes.

As a result of the availability of such tools, many developers have produced various TTS modules, lexica, and databases, but few have made them publicly available. EULER is intended as a means of making such resources available, for non-commercial applications only.

DISTRIBUTION

In order to reach its goals, EULER comes as an auto-installable package composed of a series of modules compiled as dynamically linked libraries (DLLs). Each module can be adapted very easily to a specific language by using ad-hoc language-dependent databases, the format of which has been chosen to be maximally compatible with existing standards. Modules are connected to a central multi-level container (MLC) and are called sequentially, as specified in a simple run-time script. Sample EULER-based TTS applications are also provided.
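The way modules share one container and are called in script order can be pictured with a minimal C++ sketch. Everything in it is an assumption made for illustration: the Container type, the Module signature, and the module names "Tokenizer" and "Tagger" are hypothetical and are not the EULER API, and plain functions stand in for the DLLs.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the MLC: named levels, each a sequence of items.
struct Container {
    std::map<std::string, std::vector<std::string>> levels;
};

// Hypothetical module interface: a module reads and writes container levels.
using Module = std::function<void(Container&)>;

// Registry mapping module names (as a script would list them) to code.
// EULER loads modules from DLLs; plain lambdas are used here for brevity.
std::map<std::string, Module> make_registry() {
    std::map<std::string, Module> reg;
    reg["Tokenizer"] = [](Container& c) {
        // Split every entry of the "text" level on spaces into "word".
        for (const std::string& chunk : c.levels["text"]) {
            std::string word;
            for (char ch : chunk) {
                if (ch == ' ') {
                    if (!word.empty()) c.levels["word"].push_back(word);
                    word.clear();
                } else {
                    word += ch;
                }
            }
            if (!word.empty()) c.levels["word"].push_back(word);
        }
    };
    reg["Tagger"] = [](Container& c) {
        // Toy tagger: writes one placeholder tag per word on a "pos" level.
        const std::size_t n = c.levels["word"].size();
        c.levels["pos"].assign(n, std::string("UNK"));
    };
    return reg;
}

// Call the modules one after the other, in the order given by the script.
void run_script(const std::vector<std::string>& script,
                const std::map<std::string, Module>& registry,
                Container& c) {
    for (const std::string& name : script) {
        auto it = registry.find(name);
        if (it == registry.end())
            throw std::runtime_error("unknown module: " + name);
        it->second(c);
    }
}
```

Calling run_script with the list {"Tokenizer", "Tagger"} on a container whose "text" level holds one sentence fills the "word" level first and the "pos" level second. Swapping a module then amounts to changing one name in the list.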

The MLC is the backbone of the TTS. It has been designed with top software engineering methodologies, and can be seen as an extension of the C++ Standard Template Library (STL) for handling multi-level data structures. It is provided as GNU C++ source code, with a complete EULER programmer's guide. This makes it possible for third-party developers to create their own TTS modules for EULER (or to create an interface between their modules and the MLC) while retaining total control over their work.

EULER modules are also provided as GNU C++ source code when possible. If not, they are provided as DLLs.

EULER databases are made available in a compact format, in order to protect the rights of database providers. All EULER modules, however, can read both compacted and non-compacted versions of databases, so that EULER users can easily develop (or port) their own databases and test them inside EULER, before making them publicly available in a compact format.

NB: EULER 2.0 beta executables and modules are currently available for MS Windows 95/98/NT and Linux (PC Mandrake 7.0). A Mac port is in the works.

THE EULER MLC

TTS systems are very large software applications that combine various sources of knowledge in a single code base and therefore typically require the collaborative work of specialists in many areas. As a result, their lifetime tends to be very sensitive to their implementation methodology and internal representation formalism. In that respect, given the various connections between the many levels of description of a language (phonetics, phonology, morphology, syntax, semantics, and pragmatics), and since the way these levels relate to each other is seldom known in advance, it is common practice to organize the text and speech data handled by a TTS system as a multilevel data structure (MLDS), in which each level appears as an independent description of the sentence, synchronized with the other ones.

Advantages of MLDSs over a linear organization of information (in which everything is stored sequentially) are numerous:

  • They naturally increase readability, regarding both data and rules. Rules, for instance, can be underspecified: they refer only to the partial information they require and the level(s) on which they operate.
  • Extensibility is greatly enhanced. Since MLDSs also intrinsically admit underspecification, one can always add layers to an MLDS to account for new analysis modules, in a way that remains transparent to previously developed ones.
  • Debugging is made easier, since information provided by distinct modules is stored at different levels. Tracing and understanding the history of data accesses and modifications is therefore facilitated.
  • Linear data structures make it hard to exploit additional linguistic knowledge that might be available at the input, as is the case when speech synthesis is performed from machine-generated concepts (as in dialogue systems) rather than from plain text (as in full TTS systems). TTS systems based on MLDSs offer natural interfaces in all cases: synthesizing speech from concepts simply implies that the MLDS is initially filled with information on levels other than the graphemic one (such as the part-of-speech or accent levels).
  • Since data is made independent of rule formalisms, inter-language portability is much better ensured. No wonder, then, that the introduction of MLDSs in TTS systems has been initiated by research laboratories with multilingual concerns.

The EULER MLC is nothing other than a C++ template-based implementation of an MLDS. It can be seen as an extension of the C++ Standard Template Library (STL) for handling MLDSs.
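To make the idea of synchronized levels concrete, here is a minimal template-based sketch of an MLDS. It is an illustration under stated assumptions, not the actual MLC: the class name MultiLevel, its member functions, the level names, and the choice of character positions as the shared synchronization axis are all hypothetical, and the phoneme symbols are only illustrative.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One unit on one level, anchored to a half-open span [begin, end) of a
// shared axis (assumed here: character positions in the input sentence).
template <typename T>
struct Unit {
    T value;
    std::size_t begin;
    std::size_t end;
};

// Hypothetical multilevel structure: levels are independent of each other
// and synchronized only through the spans of their units.
template <typename T>
class MultiLevel {
public:
    // Append a unit to a level. A level is created on first use, so a new
    // layer can be added without touching code that uses existing ones.
    void add(const std::string& level, T value,
             std::size_t begin, std::size_t end) {
        levels_[level].push_back(Unit<T>{std::move(value), begin, end});
    }

    // True if the level holds at least one unit; levels may stay unfilled.
    bool has(const std::string& level) const {
        auto it = levels_.find(level);
        return it != levels_.end() && !it->second.empty();
    }

    // Values on `target` whose spans overlap unit `index` of `source`.
    std::vector<T> aligned(const std::string& source, std::size_t index,
                           const std::string& target) const {
        std::vector<T> out;
        auto s = levels_.find(source);
        auto t = levels_.find(target);
        if (s == levels_.end() || t == levels_.end() ||
            index >= s->second.size())
            return out;
        const Unit<T>& ref = s->second[index];
        for (const Unit<T>& u : t->second)
            if (u.begin < ref.end && ref.begin < u.end)
                out.push_back(u.value);
        return out;
    }

private:
    std::map<std::string, std::vector<Unit<T>>> levels_;
};

// Example sentence "le chat" described on three levels.
MultiLevel<std::string> build_example() {
    MultiLevel<std::string> m;
    m.add("word", "le", 0, 2);
    m.add("word", "chat", 3, 7);
    m.add("pos", "DET", 0, 2);
    m.add("pos", "NOUN", 3, 7);
    m.add("phoneme", "l", 0, 1);
    m.add("phoneme", "@", 1, 2);
    m.add("phoneme", "S", 3, 5);   // "ch"
    m.add("phoneme", "a", 5, 6);   // final "t" is silent: no unit
    return m;
}
```

With this example, asking for the "phoneme" units aligned with the second "word" unit yields the two phonemes of "chat". A rule written against the "word" and "pos" levels never needs to know that a "phoneme" level exists. Synthesis from concepts would correspond to filling the "pos" level before any tagging module has run.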

EULER MODULES AND DATABASES

Available modules are listed below, for each TTS task. Existing language-dependent databases are mentioned in each case. Whenever possible, EULER modules use corpus-based technology, which can be easily adapted to other languages.

Task                 Sub-task                                   Modules [Languages]
-------------------  -----------------------------------------  -------------------
Text-to-phonetics    Pre-processing                             RulesPreprocessorFr [fr, be, ch]
                     Lexical access and morphological analysis  RulesLemmatizer [fr]
                     Part-of-speech tagging                     NgramTagger [fr]
                     Phonetization                              ID3Phonetizer [fr, en, ar]
                                                                PostPhonetizerFr [fr]
                     Prosody Generation                         FMProsodyGenerator (corpus-based) [fr, es]
Phonetics-to-speech  Diphone-based synthesis                    MBROLASynthesizer [us, ar, br, bz, cr, nl, ee, fr, de, gr, ro, sp, sw]

CmdEULER and WinEULER

CmdEULER and WinEULER, the sample TTS applications, are in fact two complete, multilingual, and extensible TTS systems based on EULER modules and databases. Modules and databases are defined in a simple text script and dynamically linked with the application. This makes it easy to upgrade WinEULER, by simply downloading new modules/databases and registering them in the script.

CmdEULER is a command-line application, while WinEULER is an interactive, Windows-based application.

The EULER script makes it possible to define several TTS systems (for using several languages, several voices, or several speaking styles), by simply assigning a name to each TTS configuration and declaring which modules and databases it uses. At run-time, WinEULER users can easily switch between TTS configurations.
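As an illustration of named TTS configurations, the sketch below parses a small script into a map from configuration name to its modules and databases. The script syntax (the keywords "tts", "module", and "database"), the configuration names, and the database names are invented for this example and are not the real EULER script format; only the module names come from the table above.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One named TTS configuration: ordered modules plus the databases it uses.
struct TtsConfig {
    std::vector<std::string> modules;
    std::vector<std::string> databases;
};

// Parse a script written in a made-up syntax (NOT the EULER format):
//   tts <name>        starts a new configuration
//   module <name>     adds a module to the current configuration
//   database <name>   adds a database to the current configuration
std::map<std::string, TtsConfig> parse_script(const std::string& text) {
    std::map<std::string, TtsConfig> configs;
    std::istringstream in(text);
    std::string line;
    std::string current;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key, value;
        if (!(fields >> key >> value)) continue;  // blank or incomplete line
        if (key == "tts") {
            current = value;
            configs.emplace(current, TtsConfig{});
        } else if (current.empty()) {
            continue;  // entry before the first "tts" line: ignored
        } else if (key == "module") {
            configs[current].modules.push_back(value);
        } else if (key == "database") {
            configs[current].databases.push_back(value);
        }
    }
    return configs;
}

// Two hypothetical configurations sharing the same synthesizer module.
std::map<std::string, TtsConfig> example_configs() {
    return parse_script(
        "tts FrenchVoice\n"
        "module RulesPreprocessorFr\n"
        "module NgramTagger\n"
        "module MBROLASynthesizer\n"
        "database fr_voice\n"
        "tts SpanishVoice\n"
        "module FMProsodyGenerator\n"
        "module MBROLASynthesizer\n"
        "database es_voice\n");
}
```

Switching configuration at run-time then reduces to looking up another name in the map and running the modules it lists.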

CmdEULER and WinEULER are provided as executables, with GNU C++ source code. This makes it possible for EULER users to quickly build their own TTS application, using either CmdEULER or WinEULER as a template.


Last updated November 10, 2000. Send comments to dutoit@tcts.fpms.ac.be