Current research on speech recognition covers speaker-independent
isolated word recognition, large vocabulary continuous speech
recognition based on keyword spotting, speaker adaptation, and robust
speech recognition using multi-band models (patent pending), in the
framework of the THISL, RESPITE, SPRACH, and LKIT research projects.
Our twenty years of experience in this area are embodied in the STRUT
speech recognition software toolkit, which includes hybrid HMM/ANN
(hidden Markov model / artificial neural network) speech modeling, a
field in which we are recognized as one of the major research units worldwide.
Applications: speech recognition systems for embedded applications (car
navigation, consumer products), computer-assisted language learning, multimodal interfaces.
See "A short introduction to speech recognition" for further details about the
ASR technology.
Check the speech recognition part of comp.speech for interesting links to ASR sites.
Research areas
Through several European and national projects, but also in
collaboration with the MULTITEL ASBL
research center, the ASR team addresses many aspects of the speech recognition
problem:
-
Hybrid HMM/ANN:
combining hidden Markov models with artificial neural networks is a powerful
alternative to classical stochastic models. This technique has been
extensively studied and is now used as the baseline approach of our ASR
systems.
-
Software engineering:
over the last few years we have built the STRUT
toolkit for the fast development of ASR applications. Based on a
plug-and-play programming philosophy, this software library implements many
ASR-related algorithms: signal processing, feature extraction, GMM, ANN,
Viterbi decoding, state alignment, ...
-
Robust speech recognition:
through several projects (RESPITE, MODIVOC,
AURORA)
we have investigated the problem of robust speech recognition in depth:
spectral subtraction, Wiener filtering, noise estimation, multi-band,
mixture of experts, missing data, microphone arrays, ...
-
Keyword spotting:
keyword spotting is crucial in real-life applications in order
to partly handle spontaneous speech in man-machine dialogues. Alongside
keyword spotting, the rejection of out-of-vocabulary words and
poorly recognized words has also been studied through the estimation of
relevant confidence levels.
-
Model adaptation:
fast model adaptation makes it possible to update the statistical models with
very little data. Noise adaptation and speaker adaptation have been studied,
particularly in the framework of hybrid HMM/ANN systems.
-
Automatic phonetization
-
Microphone arrays
-
Dialogue management
-
Distributed speech recognition:
adapting voice interfaces to mobile environments imposes many constraints on
hardware capabilities. Distributed speech recognition performs part of the
processing (as light as possible) on the mobile device, transmits an internal
data representation, and performs the rest of the recognition on fixed
servers. This approach makes it possible to control the CPU load on mobile
devices and the bandwidth used while preserving ASR performance. This problem
has been addressed through the MODIVOC and AURORA
projects.
-
Embedded speech recognition.
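The device/server split described under "Distributed speech recognition" can be illustrated with a minimal sketch. It is not the STRUT, MODIVOC, or AURORA implementation: the function names, the 8 kHz sampling rate, the frame sizes, the clipping range, and the use of a single log-energy value per frame (in place of real acoustic features) are all assumptions chosen to keep the example short.

```python
import math
import struct

FRAME_LEN = 200   # samples per frame (assumed: 25 ms at 8 kHz)
FRAME_STEP = 80   # hop between frames (assumed: 10 ms at 8 kHz)
LOG_MIN, LOG_MAX = -20.0, 20.0  # assumed clipping range for log-energy


def client_extract(samples):
    """Device side: one log-energy value per frame (stand-in for real features)."""
    feats = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_STEP):
        frame = samples[start:start + FRAME_LEN]
        energy = sum(x * x for x in frame) / FRAME_LEN
        feats.append(math.log(energy + 1e-9))
    return feats


def client_encode(feats):
    """Device side: quantize each feature to one byte to limit bandwidth."""
    scale = 255.0 / (LOG_MAX - LOG_MIN)
    codes = [round((min(max(f, LOG_MIN), LOG_MAX) - LOG_MIN) * scale)
             for f in feats]
    # Payload layout (hypothetical): 16-bit frame count, then one byte per frame.
    return struct.pack(f"<H{len(codes)}B", len(codes), *codes)


def server_decode(payload):
    """Server side: recover approximate features before recognition proper."""
    (n,) = struct.unpack_from("<H", payload, 0)
    codes = struct.unpack_from(f"<{n}B", payload, 2)
    step = (LOG_MAX - LOG_MIN) / 255.0
    return [LOG_MIN + c * step for c in codes]


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone stands in for recorded speech.
    samples = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    payload = client_encode(client_extract(samples))
    print(len(payload), "bytes sent;", len(server_decode(payload)), "frames received")
```

Under these assumptions, one second of audio yields 98 frames and a 100-byte payload, compared with 16000 bytes for the same second as raw 16-bit samples. The server would then run the acoustic model and decoder on the decoded features; that part is omitted here.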
Ongoing projects
MAGE / pHTS (2010 - 2013): PhD Thesis Maria Astrinaki
DiYSE (2009 - 2011): Do-it-Yourself Smart Experiences
MediaTIC (2008 - 2015)
COST 2102 (2007 - 2011)
Edutain (2004 - 2008)
MAIS (2004 - 2007): Mobile Access Information System
STRUT (1996 - 2000): Speech Training and Recognition Unified Tool
Former projects
KWS Predict (2007 - 2008): KWS Prediction
IRMA (2005 - 2008): Multimodal Search Interface for Audiovisual content
IC&C (2004 - 2006): Interface Créative & Conception
DOMINI (2004 - 2006)
MODIVOC (2002 - 2004): Systèmes MObiles et DIstribués à interface VOCale
COST 278 (2001 - 2008): Spoken Language Interaction in Telecommunication
ARTHUR (2000 - 2003): ARchitecture de Télécommunication Hospitalière pour les services d'Urgence
DIALOGUE (2000 - 2004): PhD Thesis Olivier Pietquin
CONFIDENCE (2000 - 2004): PhD Thesis Erhan Mengusoglu
RESPITE (1999 - 2002): REcognition of Speech by Partial Information Techniques
DEMOSTHENES (1998 - 1999)
THISL (1997 - 2000): THematic Indexing of Spoken Language
SPRACH (1995 - 1998): SPeech Recognition Algorithms for Connectionist Hybrids
COST 250 (1995 - 2000): Automatic Speaker Recognition over the Telephone Network
COST 249 (1994 - 2000): Continuous Speech Recognition Over the Telephone Network
OOBP (1994 - 2005): Object-Oriented Block Processing
HIMARNNET (1993 - 1995): HIdden MARkov models and Neural NETworks