
MULTI-LINGUAL PROSODIC TRANSPLANTATION TOOL
Contents
How to listen to a speech signal (original or synthetic) ?
How to zoom in a specific region ?
This program is being provided to "you", the licensee, by Fabrice Malfrère, the "author", under the following license, which applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this license. The "program", below, refers to any such program or work.
By obtaining, using and/or copying this program, you agree that you have read, understood, and will comply with these terms and conditions.
Terms and conditions for the distribution of the program
This program may not be sold or incorporated into any product which is sold without prior permission from the author.
When no charge is made, this program may be copied and distributed freely, provided that this notice is copied and distributed with it. Each time you redistribute the program (or any work based on the program), the recipient automatically receives a license from the original licenser to copy or distribute the program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this license. If you wish to incorporate the program into other free programs whose distribution conditions are different, write to the author to ask for permission.
If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this license, they do not excuse you from the conditions of this license. If you cannot distribute so as to satisfy simultaneously your obligations under this license and any other pertinent obligations, then as a consequence you may not distribute the program at all. For example, if a patent license would not permit royalty free redistribution of the program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this license would be to refrain entirely from distribution of the program.
Terms and conditions on the use of the program
Permission is granted to use this software for non-commercial, non-military purposes. In return, the author asks you to mention the MBROLIGN reference paper:
THIS SOFTWARE CARRIES NO WARRANTY, EXPRESSED OR IMPLIED. THE USER ASSUMES ALL RISKS, KNOWN OR UNKNOWN, DIRECT OR INDIRECT, WHICH INVOLVE THIS SOFTWARE IN ANY WAY. IN PARTICULAR, THE AUTHOR DOES NOT TAKE ANY COMMITMENT IN VIEW OF ANY POSSIBLE THIRD PARTY RIGHTS.
MBROLIGN v.1.0. is a prosody transplantation tool based on the use of the MBROLA speech synthesizer (http://tcts.fpms.ac.be/synthesis/mbrola.html).
It takes a .wav (Windows format) sound file sampled at 16 kHz (16 bits) and its phonetic transcription (SAMPA alphabet) as inputs. The system performs a temporal alignment of the phonetic transcription on the speech signal (the .wav file) and generates a .pho file which can be used with MBROLA to produce natural sounding synthetic speech. The alignment approach is described in "An Alignment System for Prosodic Parameter Extraction of a French Text" (Malfrère & Dutoit, 1997) (see References).
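Before loading a recording, it can be useful to confirm that it matches the input format stated above (16 kHz, 16 bits). The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` is hypothetical and is not part of MBROLIGN.

```python
import wave


def check_wav(path):
    """Return a list of format problems; an empty list means the file
    matches the 16 kHz / 16-bit input format described above."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # sample width in bytes
        if rate != 16000:
            problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
        if width != 2:
            problems.append(f"sample width is {8 * width} bits, expected 16 bits")
    return problems
```

A file that fails either check would have to be resampled or converted with an external audio tool before alignment.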
A more detailed description of MBROLIGN v.1.0 is given in "How to use MBROLIGN ?"
The distribution of MBROLIGN v.1.0. contains the following files:
NEW : UPDATED version MBROLIGN v.1.1. is designed for Windows95, NT, 2000 and XP.
First, select the speech file you want to align (File → Open), then select the Mbrolign option in the main menu to perform the alignment. If there is no phonetic transcription (a .txt or .pho file associated with the .wav file), the system asks you to enter one (see "How to create or modify a phonetic transcription ?"). Once the transcription is entered, the system first determines the fundamental frequency curve with an autocorrelation method (a more precise algorithm, MBE, is available in the commercial version). This curve is then stylized with a piecewise linear function (other methods are available in the commercial version). The segmentation step is then performed following the approach described in "High Quality Speech Synthesis for Phonetic Speech Segmentation". The results of the alignment then appear on the screen.
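To give an idea of what an autocorrelation method does, here is a minimal sketch of a fundamental frequency estimate for a single frame. It is not MBROLIGN's own pitch extractor: the function name `autocorr_f0`, the 60-400 Hz search range and the 0.3 voicing threshold are all assumptions chosen for illustration.

```python
import math


def autocorr_f0(frame, sample_rate=16000, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Estimate F0 (Hz) of one frame from its autocorrelation peak.

    Returns 0.0 when the frame is judged unvoiced. The search range
    and the threshold are illustrative assumptions.
    """
    n = len(frame)
    if n == 0:
        return 0.0
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove the DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return 0.0                         # silent frame
    lag_min = int(sample_rate / f0_max)    # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_r < voicing_threshold:
        return 0.0
    return sample_rate / best_lag


# Usage: a 40 ms frame of a synthetic 200 Hz tone sampled at 16 kHz.
tone = [math.sin(2 * math.pi * 200 * i / 16000) for i in range(640)]
print(autocorr_f0(tone))
```

Running this over successive frames of the signal gives a raw pitch curve comparable in spirit to the red curve in the interface; the stylization step is not shown.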
To create the synthetic speech, choose Synthesize in the main menu.
Finally, the main frame of MBROLIGN will look like this:

From top to bottom: the original speech signal, the phonetic transcription time-aligned with the speech, the corresponding synthetic speech signal, and the pitch curve (in red, the curve at the output of the pitch extractor; in blue, the stylized curve).
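The aligned labels and the stylized pitch curve are the information a .pho file carries for MBROLA: one line per phoneme with its duration in milliseconds, optionally followed by pairs of (position in percent of the duration, pitch in Hz). The sketch below shows how such lines could be built from an alignment. The function name `to_pho_lines` and the input layout are hypothetical, and this is not necessarily how MBROLIGN itself writes its output.

```python
def to_pho_lines(segments, pitch_points):
    """Build .pho-style lines from an alignment.

    segments:     list of (label, start_ms, end_ms), in time order
    pitch_points: list of (time_ms, pitch_hz) breakpoints of the
                  stylized curve
    Both input layouts are assumptions made for this example.
    """
    lines = []
    for label, start, end in segments:
        duration = end - start
        fields = [label, str(duration)]
        if duration > 0:
            for t, hz in pitch_points:
                if start <= t < end:
                    # Position of the pitch target inside the phoneme, in %.
                    position = round(100 * (t - start) / duration)
                    fields += [str(position), str(round(hz))]
        lines.append(" ".join(fields))
    return lines


# Usage: three aligned labels and two pitch breakpoints.
alignment = [("_", 0, 100), ("b", 100, 160), ("o~", 160, 300)]
curve = [(160, 120.0), (230, 140.0)]
print("\n".join(to_pho_lines(alignment, curve)))
```

Each breakpoint is attached to the phoneme whose interval contains it, so a boundary moved by hand also changes the relative position of the pitch targets around it.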
How to modify the phonetic alignment and the pitch curve ?
When your alignment is done, you can correct any mistakes with the mouse.
Modify the phonetic alignment:
To modify the phonetic alignment, drag and drop the boundary you want to move (right button). To insert a new phonetic label, double click with the right button between two existing labels; a dialog box appears (with the default phoneme _) in which you enter the new label. To rename a label, double click with the right button near that label; the dialog box then appears with the label to change. Finally, to delete a label, click Delete in the dialog box.
Insertion of a new label (default _ )
Modification or deletion of a label (d in this case)
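The editing operations above can be pictured as operations on a list of time-aligned labels. The sketch below is an illustration only: the function names are hypothetical, and the way time is redistributed when a label is inserted or deleted is an assumption, since this documentation does not describe how the tool handles it internally.

```python
def move_boundary(segs, i, t):
    """Move the boundary between segs[i-1] and segs[i] to time t (ms),
    clamped so that it cannot cross the neighbouring boundaries."""
    if not 0 < i < len(segs):
        raise IndexError("no boundary at this index")
    t = max(segs[i - 1][1], min(segs[i][2], t))
    segs[i - 1][2] = t
    segs[i][1] = t


def insert_label(segs, t, label="_"):
    """Split the segment containing time t; the new label takes the
    right-hand part (assumption). Returns the index of the new label."""
    for i, (_, start, end) in enumerate(segs):
        if start < t < end:
            segs[i][2] = t
            segs.insert(i + 1, [label, t, end])
            return i + 1
    raise ValueError("t does not fall strictly inside any segment")


def rename_label(segs, i, label):
    """Change the name of an existing label."""
    segs[i][0] = label


def delete_label(segs, i):
    """Remove segs[i]; its time span is given to the previous label,
    or to the next one if it was the first (assumption)."""
    _, start, end = segs.pop(i)
    if i > 0:
        segs[i - 1][2] = end
    elif segs:
        segs[0][1] = start


# Usage: each segment is [label, start_ms, end_ms].
labels = [["_", 0, 100], ["b", 100, 160], ["o~", 160, 300]]
move_boundary(labels, 1, 120)      # shift the start of "b"
new_index = insert_label(labels, 200)  # insert the default "_" at 200 ms
rename_label(labels, new_index, "d")
print(labels)
```

Keeping every segment's end equal to the next segment's start is the invariant all four operations preserve.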
Modify the stylized pitch curve:
Only the stylized pitch curve (in blue) can be modified. To modify it, place the mouse pointer in the pitch curve frame at the point you want to change, then press and hold the left button to select and move the curve. Release the button to fix the curve in place.
How to create or modify a phonetic transcription ?
To create or modify the phonetic transcription of the speech signal, select Transcription in the main menu; the phonetic transcription dialog box will appear.
Phonetic transcription dialog box (no initial transcription)
Phonetic transcription dialog box (with initial transcription)
To edit, modify or create a phonetic transcription, a copy/paste approach can also be used.
How to listen to a speech signal (original or synthetic) ?
To listen to the original or the synthetic speech, place the mouse pointer in the corresponding speech signal frame and select a region with the right mouse button: press and hold the right button at the beginning of the region, move the mouse across the window to extend the selection, and release the button to play the selected region. Double click with the right button to play the whole portion of speech currently displayed.
How to zoom in a specific region ?
The main part of the interface of MBROLIGN is composed of four windows:
To zoom out to the full signal, double click with the left mouse button in any of the windows (original speech, phonetic transcription, synthetic speech or pitch curve).
How to change the diphone database ?
To select another diphone database, use the Database option in the main menu. The Database Dialog box will appear and let you make a choice among the databases listed in the mbrolign.ini file.

Acknowledgments
I would like to thank Thierry Dutoit for his support and his interest during the development of this Text-to-Speech alignment system.
I would also like to thank Vincent Pagel, Michel Bagein and Alain Ruelle for their comments about the MBROLIGN interface.
I am also grateful to all the members of the TCTS Lab of Faculté Polytechnique de Mons who tried and evaluated the MBROLIGN. Special thanks to Olivier Deroo and Vincent Pagel for their intensive testing.
Finally, I thank the FRIA (Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture) for its financial support.
To subscribe to the mbrolign mailing list, send an email to mbrolign-request@tcts.fpms.ac.be with "subscribe" as the subject.
To send a message to the list: mbrolign@tcts.fpms.ac.be
Fabrice Malfrère
Faculté Polytechnique de Mons,
Circuit Theory and Signal Processing Lab,
31, Boulevard Dolez, B-7000 Mons, Belgium.
Tel: + 32.65.37.41.33
Fax: + 32.65.37.41.29
WWW: http://tcts.fpms.ac.be/~malfrere
E-mail: malfrere@tcts.fpms.ac.be, for general information, questions on the installation of the software.