
PROJECT WORK PLAN

 

Introduction to the Work Plan

Overleaf we summarize, in tabular form, the work packages broken down into their component tasks. In short, in addition to WP0 on Project Management (see also Section 4 for the management structure), eight work packages have been defined:

  1. WP1: Database gathering from different sources and setup of baseline systems. In this framework, the large vocabulary, continuous speech recognizer resulting from WERNICKE will be extended to French and Portuguese.
  2. WP2: Development of (and development tools for) lexica for multiple languages, including baseline dictionaries for new languages and automatic learning of new dictionaries.
  3. WP3: Development of tools for, and research on, different approaches to representing and adapting language models (LMs), with particular focus on generality and ease of use.
  4. WP4: Development of tools and research on application domain independence and adaptation, including task independence of acoustic models, and unsupervised adaptation and training of speaker and acoustic models.
  5. WP5: More fundamental research into important issues related to speech recognition in general and hybrid systems in particular, including perceptual models, global discrimination, mixture of experts, and others. As with WERNICKE, it is expected that research into these well-defined, promising areas will lead to further enhancement of our existing systems.
  6. WP6: Development of the software and hardware tools necessary to carry out the proposed work. As already shown with WERNICKE, this is particularly important for (1) reducing the research cycle and (2) ensuring that all the partners work on the same software and hardware basis.
  7. WP7: Evaluations and prototype development to regularly assess the progress of SPRACH. Building upon the WERNICKE software, it is expected that some of these demonstration systems will actually be close to real ``products''.
  8. WP8: Results dissemination and exploitation.
As shown in Section 3.4, the last 6-12 months of this project will put most of the emphasis on prototype development of WP7 (demonstrations and evaluations).

As with WERNICKE, strong interaction between all the partners and all work packages will be guaranteed through the use of the same hardware, software and (research, i.e., US English) databases. Only language specific developments (UK English, French and Portuguese) will be carried out by the respective sites.

Detailed Work Plan

Work Package WP0: Management

Work Package Manager: FPMs
Executing Partners: FPMs

See Section 4 for a description of the management structure and follow-up of the different work packages.

Although our research and development work is well defined in the proposal, a kick-off meeting (with the Industrial Advisory Board) will be organized at a very early stage of the project to guarantee synchronization and supervision, and to initiate working contacts with our industrial advisors so as to understand their concerns and expectations.

WP0: Milestones

M0.1
(T0+1): Kick-off meeting (with all partners and Industrial Advisory Board).
M0.x
Regular meetings as planned in Section 4.

WP0: Deliverables

D0.x
Short management report every 6 months

Work Package WP1: Databases and Baseline Systems

Work Package Manager: INESC
Executing Partners: CUED, FPMs, INESC, SU

Two essential objectives of the SPRACH project are (1) to stay in the forefront of research in large vocabulary continuous speech recognition, and (2) to address the multilingual aspects of recognition with hybrid HMM/ANN systems.

In the scope of their current research, performed mainly within the WERNICKE project, all partners have developed their own systems for the Wall Street Journal database, which is the standard international database used for development and evaluation of large vocabulary, speaker independent, continuous speech recognizers.

This work package has three essential objectives:

  1. To follow the constant evolution of the internationally accepted databases for evaluation of continuous speech recognizers with very large vocabularies (addressed in Task T1.1).
  2. To provide the databases that are needed for the multilingual aspects of the project, namely for French and Portuguese (also addressed in Task T1.1).
  3. To develop baseline recognizers for the new languages in the project: French (Task T1.2) and Portuguese (Task T1.3).

Task T1.1: Databases

Task Coordinator: INESC
Executing Partners: CUED, FPMs, INESC, SU, ICSI

Use of a common speech database is of the greatest importance when trying to compare the performance of different recognition systems, or to assess the advantages of modifications to some basic system. Therefore, this project will put a large effort into using speech data that are widely used for evaluating continuous speech recognizers all around the world. Since one of the goals of this project is to develop and assess neural network technologies on difficult continuous speech recognition tasks, and since the partners want to avoid, whenever possible, losing time in developing new databases, they will mostly work with established databases (e.g. the ARPA CSR corpora of North American Business News). These corpora have been well studied for conventional speech recognition systems, so there is ample basis for comparison.

Those large speech corpora will be used to:

On top of those large speech corpora, other databases will be used, such as:

  1. Smaller databases, such as those from SAM and the Oregon Graduate Institute (OGI, Portland, USA), to:
  2. Multilingual databases (coming from European and US sources) to develop and perform research on new languages (Tasks T1.2 and T1.3, and Work Package WP4). More specifically, since UK English, French and Portuguese will be the targeted European languages handled in this project, the following databases will also be considered here:
  3. The ``Translanguage English Database'', which consists of the combined recordings of the presentations given at EuroSpeech'93. This provides a challenging task both in terms of the acoustic conditions and the wide variety of accents present.

Most of the research described in this proposal will probably be performed by all the partners on standard US English databases (to allow easy internal and external comparisons). However, although all baseline and prototype systems discussed below and in WP7 will be derived from the same decoder (NOWAY, from WERNICKE):

The adaptation of the state-of-the-art baseline WERNICKE system for UK and US English to French and Portuguese mainly involves the steps briefly described in Tasks T1.2 and T1.3 below.

Task T1.2: Baseline System for French

Task Coordinator: FPMs
Executing Partner: FPMs

Starting from the baseline US English large vocabulary continuous speech recognizer developed in the framework of WERNICKE, the first task of FPMs will be to develop an equivalent system for French. This will involve:

The system resulting from this task will be used:

Task T1.3: Baseline System for Portuguese

Task Coordinator: INESC
Executing Partners: INESC

As noted earlier (Task T1.1), there is no adequate database for speaker independent, large vocabulary, continuous speech recognition in Portuguese. Such a database will be collected within Task T1.1 of this project. As a consequence, the baseline system for Portuguese will have to be based on a much smaller database, so that it is available early enough to be useful. We will use the Portuguese part of the SAM database for this purpose.

The SAM database is not labeled, and there is no adequate labeled database for Portuguese. Therefore, the development of the Portuguese baseline system will consist of the following steps:

  1. Automatic labelling of the SAM database, using a TIMIT net (an MLP trained on the TIMIT database for phoneme classification) and an approximate mapping from the TIMIT phonemes to the Portuguese ones.
  2. Training the acoustic models based on the labeling from 1).
  3. Re-labeling of the SAM database using the acoustic models from 2). Steps 2) and 3) may have to be iterated several times, to improve the acoustic models and the labeling.

This procedure will use phonetic transcriptions adapted from those of SAM (which are sentence-based, instead of word-based), and a language model extracted from the SAM corpus.
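Purely as an illustration of the iteration between steps 2) and 3) above, the following sketch outlines the bootstrapping loop; the helper functions (initial_labeller, train_models, relabel) are hypothetical placeholders rather than SPRACH software, and in practice the number of iterations would be driven by convergence of the labelling rather than fixed in advance.

    # Illustrative sketch of the bootstrap labelling procedure (steps 1-3 above).
    # The helper functions are hypothetical placeholders supplied by the caller.

    def bootstrap_labels(utterances, initial_labeller, train_models, relabel,
                         n_iterations=3):
        """Iteratively label an unlabelled corpus and retrain the acoustic models."""
        # Step 1: initial labelling (e.g. TIMIT-trained MLP + phoneme mapping).
        labels = [initial_labeller(utt) for utt in utterances]
        models = None
        for _ in range(n_iterations):
            # Step 2: train (or retrain) the acoustic models on the current labels.
            models = train_models(utterances, labels)
            # Step 3: re-label the corpus with the improved models.
            labels = [relabel(utt, models) for utt in utterances]
        return models, labels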

Although the above-mentioned procedure has a good chance of succeeding, there is a possibility that it might fail. In that case, we will resort to an alternative path, starting from the CVC part of the SAM database.

The performance of the baseline system will be assessed, but it cannot be expected to equal that of recognizers for other languages, which are based on much larger speech corpora.

WP1: Milestones

M1.1
(T0+3): French database (BREF) available at FPMs
M1.2
(T0+15): Baseline French recognizer available
M1.3
(T0+24): Portuguese database available
M1.4
(T0+24): Baseline Portuguese recognizer available

WP1: Deliverables

D1.1
(T0+15): Baseline French recognizer
D1.2
(T0+24): Baseline Portuguese recognizer

Work Package WP2: Lexica and Learning of Lexica

Work Package Manager: INESC
Executing Partners: FPMs, INESC, ICSI

The goals of large vocabulary, speaker independent recognition, domain adaptation and task independence require the availability of appropriate pronunciation lexica encompassing very large numbers of words. When porting these systems to new languages, the development of such lexica can pose a major problem. In this respect, the two new languages involved in SPRACH are in different situations. For French, a significant amount of speech recognition work has already been done, and consequently relatively large lexica already exist, although they may have to be augmented. For Portuguese, the situation is that of an almost virgin language in terms of speech recognition and pronunciation lexica. Development of dictionaries for the two languages will therefore have to follow different strategies, as outlined in the descriptions of the following tasks.

Task T2.1: Baseline Dictionaries for New Languages

Task Coordinator: INESC
Executing Partners: FPMs, INESC

The two new languages explicitly considered in this project are French and Portuguese (although some of the tools developed here should be applicable to other languages).

Regarding French, the main goal of this task is ``just'' to get access to one of the existing large French dictionaries (with multiple phonetic transcriptions). The obvious candidate is the LIMSI dictionary.

For Portuguese, however, the situation is different. As noted above (Task T1.3), there currently exist no pronunciation dictionaries for Portuguese that can be used for speech recognition. A baseline dictionary will be adapted from the SAM phonetic transcriptions (which are sentence-based rather than word-based). It will encompass the SAM vocabulary of about 1000 words. This dictionary will be used in the baseline system (Task T1.3) and will be expanded by automatic means in Task T2.2, which will be in charge of automatically developing larger dictionaries. A larger dictionary will be made by hand only if insurmountable difficulties are encountered in Task T2.2.

Task T2.2: Automatic Learning of New Dictionaries

Task Coordinator: INESC
Executing Partners: INESC, ICSI

As explained in Task T2.1, the baseline dictionary for Portuguese will be rather small. Instead of creating a large dictionary by hand, we will develop techniques to automatically add new words, with their pronunciations, to an existing dictionary, based on the phoneme sequences obtained from the recognizer. The baseline Portuguese dictionary will be used as the starting point for creating a large dictionary.

We will investigate a number of ways to derive multiple pronunciations for words. We will first consider automatic extraction and representation of phonological rules, which can be used to create many pronunciations from expansions of simple canonical pronunciations [Tajchman et al., 1995a, Tajchman et al., 1995b]. Phonological rules to be considered include those modeling phonological effects due to coarticulation and fast speech, as well as dialect effects. The task will examine extracting and/or training these rules on data. The second source of pronunciation extensions will be the incorporation of network misclassifications (``loose phones'') as alternative pronunciations [Wooters, 1993].

There is a tradeoff between the desire to have a large number of pronunciations in order to capture every possible variation and the need to have a reasonably sized lexicon for timely, efficient training and recognition. We will examine a number of ways to find the right operating point in this tradeoff. These include Bayesian merging of pronunciations in order to build more compressed, redundancy-free HMMs [Wooters and Stolcke, 1993], as well as various kinds of beam-width pruning of the number of pronunciations per word.
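A minimal sketch of the latter kind of pruning is given below, assuming that probabilities for the candidate pronunciations of a word have already been estimated from data; the selection criteria actually used in SPRACH (e.g. the Bayesian merging mentioned above) may of course differ, and the example word and probabilities are invented.

    # Illustrative only: beam-width pruning of pronunciation variants. Keep the
    # variants whose probability is within a fixed factor of the best one, cap
    # their number, and renormalise.

    def prune_pronunciations(variants, beam=0.1, max_variants=4):
        """variants: list of (pronunciation, probability) pairs for one word."""
        variants = sorted(variants, key=lambda v: v[1], reverse=True)
        best = variants[0][1]
        kept = [(p, pr) for p, pr in variants if pr >= beam * best][:max_variants]
        total = sum(pr for _, pr in kept)
        return [(p, pr / total) for p, pr in kept]

    print(prune_pronunciations([("ae n d", 0.55), ("ax n d", 0.30),
                                ("ax n", 0.12), ("n", 0.03)]))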

WP2: Milestones

M2.1
(T0+6): Baseline French lexicon available.
M2.2
(T0+12): Baseline Portuguese lexicon available.
M2.3
(T0+24): Preliminary version of automatic learning of lexica available.
M2.4
(T0+36): Final version of automatic learning of lexica available.

WP2: Deliverables

D2.1
(T0+24): Preliminary report on automatic learning of lexica.
D2.2
(T0+24): Preliminary software for automatic learning of lexica.
D2.3
(T0+36): Report on automatic learning of lexica.
D2.4
(T0+36): Software for automatic learning of lexica.

Work Package WP3: Language Models (LM) and LM Adaptation

Work Package Manager: SU
Executing Partners: CUED, INESC, SU

In WERNICKE we realized that the efficient representation and use of language models (LMs) was of major importance for flexible recognizers but was unfortunately often disregarded. In consequence we plan to mount a substantial investigation of novel language modeling approaches for spoken language.

Of course, although most of the developments here will be carried out on US and UK English (the milestones and deliverables refer to those languages), the outputs will also be exploited for other languages whenever possible.

Task T3.1: Markov Model Based Grammars

Task Coordinator: CUED
Executing Partners: CUED, SU

Markov model based grammars, such as trigram grammars, are the most commonly used statistical language models. They are relatively easy to estimate and currently provide the best performance for very large vocabulary recognition. Hence it is both necessary to incorporate this basic tool and prudent to investigate this method with the aim of producing better statistical language models. The topics that we propose to address are:

Task T3.2: Grammatical Inference

Task Coordinator: SU
Executing Partners: SU

Work on estimating Markov model based grammars may be extended into the general problem of inducing finite state automata (regular grammars) from data. The ``transformation-based'' approach to the induction of rule-based grammars [Brill, 1994] has been successful in the context of part-of-speech tagging. However, we note that the resultant grammars may be represented as a particular class of finite state automaton. We will analyze Brill's rule induction algorithm in these terms, and investigate language models based on automatically derived (stochastic) finite state automata.
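As background on the transformation-based idea mentioned above, the toy sketch below applies a small set of contextual rewrite rules to an initial tagging; the rules, tags and example are invented for illustration and are not Brill's actual rule set. Each such ordered rule sequence can equivalently be compiled into a finite state transducer, which is the connection to finite state automata exploited here.

    # Toy illustration of transformation-based (Brill-style) tagging: start from
    # an initial tag sequence and apply contextual rewrite rules in order.
    # Rules and example are invented for illustration only.

    rules = [
        # (from_tag, to_tag, condition on the previous tag)
        ("NN", "VB", lambda prev: prev == "TO"),     # e.g. "to increase": noun -> verb
        ("VBD", "VBN", lambda prev: prev == "VBZ"),  # e.g. after "has": past -> participle
    ]

    def apply_rules(tags):
        tags = list(tags)
        for frm, to, cond in rules:
            for i in range(1, len(tags)):
                if tags[i] == frm and cond(tags[i - 1]):
                    tags[i] = to
        return tags

    print(apply_rules(["TO", "NN"]))   # -> ['TO', 'VB']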

Many small tasks with closed vocabularies are best represented by a rule-based context-free grammar. We expect that there are gains to be made by assigning probabilities to existing context-free grammars and also through the inference of stochastic context-free grammars purely from available data. The inside-outside algorithm provides a theoretical framework for the estimation of stochastic context-free grammars, but there are many practical problems with its implementation. Thus we will need to develop practical approaches and new theory as necessary.

Task T3.3: Language Model Adaptation

Task Coordinator: CUED
Executing Partners: CUED, SU, INESC

An appropriate language model is an invaluable component of any speech recognition system handling more than a few isolated words. When dealing with a new task, or with a task whose characteristics change slowly in time, it would be very useful to be able to build on previous language models, instead of re-building the language model from scratch. This would have the main advantages of requiring much less data, of possibly being done in an unsupervised way, and of saving processing time.
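As an illustration of what such incremental adaptation can look like (and not necessarily the scheme that will be retained in SPRACH), a simple approach interpolates a previously trained language model with a sparse model estimated on the small amount of new-task data:

    % Illustrative only: linear interpolation of an existing LM with a sparse
    % task-specific LM; \lambda is typically estimated on held-out data (e.g. by EM).
    P_{\mathrm{adapted}}(w \mid h) = \lambda \, P_{\mathrm{task}}(w \mid h)
                                   + (1-\lambda) \, P_{\mathrm{previous}}(w \mid h),
    \qquad 0 \le \lambda \le 1 .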

WP3: Milestones

M3.1
(T0+12): Preliminary tests of different language models (LMs) on baseline system
M3.2
(T0+24): Incorporation of new LM techniques in baseline system
M3.3
(T0+36): Incorporation of LM adaptation in baseline systems

WP3: Deliverables

D3.1
(T0+12): Preliminary report on tested LMs and their performance on baseline system
D3.2
(T0+24): Report on new LM techniques
D3.3
(T0+36): Report on LM adaptation

Work Package WP4: Application Domain Adaptation

Work Package Manager: CUED
Executing Partners: CUED, FPMs, SU, INESC

Since one of the goals of this project is to have a flexible (hybrid HMM/ANN) speech recognizer that is quite independent of the initial training data and easy to adapt to new languages, several topics have to be investigated in that framework.

Task T4.1: Training Independent Tasks

Task Coordinator: FPMs
Executing Partners: FPMs, CUED

For task independence, we want our acoustic models to be independent of both the acoustic environment and the lexicon. Currently our base system is trained for one acoustic condition, that of noise-free read speech with a known microphone. Thus the issues are:

Task T4.2: Unsupervised Adaptation and Training

Task Coordinator: INESC
Executing Partners: CUED, INESC, SU

Many forms of adaptation have been developed as part of standard HMM systems. However, given that hybrid HMM/ANN systems have tended to use context-independent phoneme models (or only very broad forms of context dependence) and have nevertheless been shown to achieve greatly improved recognition performance at the phoneme level, this approach is expected to be better suited to unsupervised adaptation.

In this task, we will consider different types of adaptation, including:

WP4: Milestones

M4.1
(T0+12): Preliminary tests on training independent tasks (as agreed with Industrial Advisory Board)
M4.2
(T0+36): Tests on training independent tasks with unsupervised adaptation of speaker and/or phoneme models.

WP4: Deliverables

D4.1
(T0+12): Report on training-independent recognition
D4.2
(T0+24): Report on unsupervised speaker and phoneme adaptation
D4.3
(T0+36): Final report on application domain adaptation

Note: Software deliverables of related codes are planned in WP7.

Work Package WP5: New Architectures and New Paradigms

Work Package Manager: FPMs
Executing Partners: CUED, FPMs, SU, ICSI

Thus far, hybrid HMM/ANN systems have been shown to lead to good acoustic modeling (with several advantages over other approaches, including design flexibility). WP5 will be devoted to further fundamental research in new conceptual advances in hybrid systems that have been identified by the partners. Much of this work has already been initiated in WERNICKE.

Task T5.1: Perceptual Models

Task Coordinator: FPMs
Executing Partners: FPMs, ICSI

Recently, a new recognition system based on auditory events (or speech transitions) was introduced in WERNICKE.

This system, referred to as SPAM (Stochastic Perceptual Auditory-Event-Based Models), attempts to model speech as a sequence of (non-stationary) transitions, disregarding stationary portions of the signal. An initial theory was set up in WERNICKE and preliminary encouraging results were recently published [Morgan et al., 1995]. It is the goal of this task to pursue research in this new direction.

Task T5.2: Global Discrimination

Task Coordinator: FPMs
Executing Partners: FPMs, ICSI

Recently, a new technique in hybrid HMM/ANN theory was developed in WERNICKE. This technique, referred to as REMAP (Recursive Estimation and Maximization of A Posteriori Probabilities), allows full discriminant training of HMM/ANN systems, leading to globally discriminant models [Bourlard et al., 1995]. Compared to other (hybrid as well as non-hybrid) alternative approaches, such as Maximum Mutual Information or Generalized Probabilistic Descent, this new approach directly maximizes global a posteriori probabilities (without the need to compute the probabilities of rival models) in order to arrive at discriminant solutions.

We believe that REMAP could be of major importance for the further enhancement of hybrid HMM/ANN systems, as well as for pattern classification systems in general.

It is the goal of this task to continue the work in this innovative area.
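For reference, the distinction between likelihood-based and globally discriminant training can be summarized generically as follows; this is a textbook formulation, not the REMAP equations themselves.

    % Generic formulation only, not the REMAP algorithm itself.
    % Globally discriminant training maximises the posterior of the correct
    % model sequence M given the data X:
    \hat{\Theta} = \arg\max_{\Theta} \; P(M \mid X, \Theta)
                 = \arg\max_{\Theta} \;
                   \frac{P(X \mid M, \Theta)\, P(M)}{\sum_{M'} P(X \mid M', \Theta)\, P(M')} ,
    % whereas maximum likelihood training maximises only P(X | M, \Theta).
    % MMI works with the expanded form above (and hence needs the rival models M'),
    % while a hybrid HMM/ANN system can estimate the posterior directly.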

Task T5.3: Mixture of Experts

Task Coordinator: CUED
Executing Partners: CUED, SU

Connectionist model combination for large vocabulary recognition was developed under WERNICKE. This framework promises a principled way of combining models such as those trained under different noise or channel conditions. We therefore propose to continue this work and also to extend it to the related framework of hierarchical mixtures of experts (HME).
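As background, a standard mixture-of-experts combination (a generic formulation, not the specific WERNICKE combination scheme) weights the class posteriors of K expert networks with an input-dependent gating function; in the hierarchical variant (HME), the gating is itself organized as a tree of such combinations.

    % Standard mixture-of-experts combination (background only):
    P(q \mid x) = \sum_{k=1}^{K} g_k(x) \, P_k(q \mid x),
    \qquad \sum_{k=1}^{K} g_k(x) = 1, \quad g_k(x) \ge 0 .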

Task T5.4: Incorporating Auditory Models

Task Coordinator: SU
Executing Partners: SU, FPMs, ICSI

The use of acoustic processors based on auditory modelling has been proposed as a route to robust speech recognition in noisy environments. In this task, we plan to address some specific issues pertaining to the use of auditory models in speech recognition.

Task T5.5: Context Dependent Phone Modelling

Task Coordinator: CUED
Executing Partners: CUED

Preliminary work with context-dependent phone modelling has yielded significant word error rate reductions of 15%--25%. However, conventional HMM systems typically show much larger gains from context dependence, so we expect that there is still much work to be done in this area, namely:

WP5: Deliverables

D5.1
(T0+12): Report on each of the WP5 research issues.
D5.2
(T0+24): Advanced report on WP5 research issues.
D5.3
(T0+36): Final research report.

Note: It is obvious that any research topic that proves promising (i.e., better than our baseline recognizer) on small reference databases will be further tested on the larger databases used in the project and, if successful, included in the reference recognizer. However, no explicit milestones have been defined here, since any such milestones would be largely arbitrary and probably unrealistic.

Work Package WP6: Hardware and Software Support

Work Package Manager: FPMs
Executing Partners: CUED, FPMs, INESC, SU, ICSI

We will be incorporating new hardware and software from the same group at ICSI that provided the successful system infrastructure for WERNICKE with the RAP computers. The new systems, which will be vector-based, will be required in SPRACH to (1) support our research (e.g., fast assessment of new research results on moderate-sized tasks), and (2) provide fast training on large databases. Towards these ends, the ICSI team has developed a vector supercomputer on a chip called the Torrent. Software analogous to the routines currently used on the RAP is being developed for this chip. Also, in the framework of WERNICKE, ICSI is finalizing the construction of a single-processor S-bus board called SPERT that will be made available to the SPRACH partners for their research and development work. During SPRACH, ICSI will also develop multi-node versions, called SPRACHSTATION, that will also be made available to the SPRACH partners. The SPERT boards, which we estimate to be several times faster than the partners' RAP systems while being a small fraction of the price, are expected to be the main workhorses for the partners. The multi-node SPRACHSTATION, however, is likely to be required to keep up with new computationally demanding research directions and for the large-vocabulary systems.

At this point there are a number of fast European systems for neural network training. However, the ICSI systems are unique in several respects:

Task T6.1: Hardware

Task Coordinator: FPMs
Executing Partners: ICSI

Since the SPERT hardware will be available before the end of the WERNICKE project, no funding will be required for this task; it is included here only as a reminder that the partners will purchase SPERT board(s) and one SPRACHSTATION as soon as they are available from their sub-contractor ICSI.

The main effort of this work package will be related to the development of the software tools, as well as the software adaptation, required by this project (Task T6.2).

Task T6.2: Software

Task Coordinator: FPMs
Executing Partners: ICSI

As briefly discussed in Section 2 of this proposal, ICSI will provide the software building blocks necessary for the partners to write research software easily. In particular, they will provide feedforward and recurrent network modules that take IEEE floating point as input and produce IEEE floating point as output, while doing the internal arithmetic in fixed point for the sake of speed.
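The sketch below illustrates the float-in/float-out, fixed-point-inside idea on a single matrix-vector product; it is purely illustrative and does not show the actual ICSI library interfaces (the scale factor, word widths and function name are assumptions).

    import numpy as np

    # Illustrative only: accept and return IEEE floating point, but carry out
    # the matrix-vector product in 16-bit fixed point with a wide accumulator.

    def fixed_point_layer(x, weights, frac_bits=8):
        scale = 1 << frac_bits
        # Quantise inputs and weights to 16-bit fixed point.
        xq = np.clip(np.round(x * scale), -32768, 32767).astype(np.int64)
        wq = np.clip(np.round(weights * scale), -32768, 32767).astype(np.int64)
        acc = wq @ xq                                       # integer arithmetic only
        return acc.astype(np.float64) / (scale * scale)     # back to floating point

    x = np.array([0.5, -0.25, 1.0])
    W = np.array([[0.1, 0.2, -0.3], [0.4, -0.5, 0.6]])
    print(fixed_point_layer(x, W))   # close to W @ x, up to quantisation error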

Task T6.3: Algorithms Porting

Task Coordinator: SU
Executing Partners: CUED, FPMs, INESC, SU

The baseline (WERNICKE) recognizer runs on standard Unix workstations (including SUN, HP, SGI and PC/Linux). However, the training systems require specialized high-performance hardware. In WERNICKE, training was performed using the RAP hardware and software platform. The use of Torrent-based systems in SPRACH will require applications code (particularly neural network training software) to be ported from the RAP to the Torrent. Although this porting effort will be reduced by the similarity of the new matrix-vector calls to the older RAP libraries, an additional effort will be required by the partners to fully exploit the architecture of SPERT and the SPRACHSTATION. This task will also cover any coding effort necessary to implement new algorithms on this hardware.

WP6: Milestones

M6.1
(T0+3): SPERT hardware available on each site.
M6.2
(T0+3): First version of the SPERT software available on each site.
M6.3
(T0+12): First version of the training system on SPERT.

WP6: Deliverables

D6.1
(T0+3): Report on SPERT hardware and software.
D6.2
(T0+12): Training available on SPERT.
D6.3
(T0+36): Final training system on SPERT.

Work Package WP7: Evaluations and Prototypes Development

Work Package Manager: CUED
Executing Partners: CUED, FPMs, INESC, SU, ICSI

In recent years formal evaluations have provided a way of both quantifying the effect of algorithmic developments in speech recognition and disseminating these advances to the research community.

Task T7.1: Decoder Issues

Task Coordinator: SU
Executing Partners: CUED, FPMs, SU, ICSI

The NOWAY decoder developed in WERNICKE offers efficient decoding, together with a simple interface to language models of arbitrary complexity. We plan to further develop NOWAY, both algorithmically and implementationally. Algorithmic developments will include the investigation of fast look-ahead techniques to further prune the search space, and the comparison and optimization of different search strategies. The current algorithm uses a combination of best-first and breadth-first searches. It is possible that a depth-first search, ordered by language model information, may produce a significant speedup. We plan to investigate the performance of this less-time-synchronous style of search.

Implementational work will include improvements in memory and CPU efficiency, with the aim of achieving a single-workstation demonstration system that consistently runs in real time or faster. Other improvements include methods to deal with very large vocabularies (100,000 words and larger). The basic search strategy employed in NOWAY makes it possible to deal with large vocabularies easily, since not all dictionary words need be present in the language model (i.e. the acoustic model vocabulary may be much larger than the language model vocabulary). This means that vocabulary size may be increased merely by adding to the pronunciation dictionary, without recomputing the language model.
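The toy sketch below illustrates this last point: a word present only in the pronunciation dictionary can still be scored by falling back to an assumed out-of-vocabulary floor probability, so the dictionary can grow without the language model being recomputed. The vocabulary, scores and weighting are invented, and this is not the NOWAY code.

    import math

    # Toy illustration: a hypothesis score combines acoustic and LM log
    # probabilities; words in the pronunciation dictionary but not in the LM
    # fall back to an assumed out-of-vocabulary floor.

    LM_LOG_PROB = {"stock": math.log(0.010), "market": math.log(0.008)}  # toy LM
    OOV_LOG_PROB = math.log(1e-6)                                        # assumed floor

    def hypothesis_score(words, acoustic_log_probs, lm_weight=8.0):
        lm_score = sum(LM_LOG_PROB.get(w, OOV_LOG_PROB) for w in words)
        return sum(acoustic_log_probs) + lm_weight * lm_score

    # "nasdaq" is only in the (enlarged) pronunciation dictionary, not the LM,
    # yet the hypothesis can still be scored against alternatives.
    print(hypothesis_score(["stock", "nasdaq"], [-42.0, -55.0]))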

Task T7.2: Evaluations

Task Coordinator: SU
Executing Partners: CUED, FPMs, SU

To evaluate the progress of the project quantitatively, we plan to take part in the ARPA evaluation programme (and any future European evaluation programme), under the assumption that these follow a format similar to that of the last few years. We have participated since 1992 and have benefited greatly from the supply of resources such as acoustic and language model training data.

Task T7.3: Demonstrations

Task Coordinator: CUED
Executing Partners: CUED, FPMs, INESC, SU

Under WERNICKE we developed a near-real-time 20,000-word speech recognition system. This proved to be an effective demonstration of the work we have carried out.

The following prototype systems will be developed and demonstrated during the present project:

  1. Large vocabulary (e.g., WSJ), continuous read speech recognition,
  2. Task independent demo, in which lexicon and grammar can be designed on the spot,
  3. Compact, fast and robust small-vocabulary recognition,
  4. Multilingual recognizers,
  5. New technology demonstration, based on new research outputs (e.g., new acoustic models, such as SPAM-based recognition).

WP7: Milestones

M7.1
(T0+12): Real time large vocabulary recognition of continuous input for US and UK English available.
M7.2
(T0+24): First version of large vocabulary recognition of continuous input for French available.
M7.3
(T0+24): First version of application specific (as listed in Task T7.3) recognition systems available.
M7.4
(T0+36): Large vocabulary recognition of continuous input for Portuguese available.
M7.5
(T0+36): Final version of application specific (as listed in Task T7.3) recognition systems available.

WP7: Deliverables

D7.1
(T0+12): Real time large vocabulary recognition system of continuous input for US and UK English.
D7.2
(T0+24): Large vocabulary recognition system of continuous input for French.
D7.3
(T0+24): First release of application specific recognition systems.
D7.4
(T0+36): Enhanced versions of the US, UK and French recognizers.
D7.5
(T0+36): Large vocabulary recognition system of continuous input for Portuguese.
D7.6
(T0+36): Final release of application specific recognition systems.
D7.7
(T0+36): Report on applications and tools available.

Work Package WP8: Results Dissemination and Exploitation

Work Package Manager: SU
Executing Partners: all partners, including Industrial Advisory Board

Task T8.1: Information Dissemination

Task Coordinator: SU
Executing Partners: all partners, including Industrial Advisory Board

PhD and MSc theses are expected to be produced by the European partners within the scope of this project. Furthermore, since most of the researchers involved in the project teach at universities (Mons, Cambridge, Sheffield and Lisbon), the experience gained in this project will have a direct influence on courses on speech processing and on neural networks at these universities. The same remark applies to ICSI, which also hosts PhD students and where most of the staff members also teach at the University of California at Berkeley.

In order to inform contractors from other ESPRIT and BRA projects on speech recognition and/or neural networks, and researchers in general, of our progress, the partners will do their best to organize at least one international workshop at which results will be presented, discussed, and compared with other state-of-the-art systems. Ideally, this workshop should be multi-disciplinary, since hybrid HMM/ANN systems are of potential interest to many different research areas (as briefly discussed in Section 1.2). In the framework of WERNICKE, and to test the interest in such a workshop, a proposal for organizing a NIPS (Neural Information Processing Systems, Denver, December 1995) post-conference workshop has been submitted.

To ensure information dissemination and visibility of the project, results obtained will be regularly presented at international conferences and published in international journals. Also, through participation in international speech recognition evaluations sponsored by ARPA (and any sponsored by the EC) we shall maintain a high profile in the international speech community in addition to being able to measure our progress against a standard ``yardstick''.

Making contact with researchers working on other ESPRIT projects or other speech recognition and/or neural network research teams will not pose any difficulties since all of the contractors have been or are engaged in European research programs and have very strong links with universities and educational institutes.

Task T8.2: Exploitation of Results

Task Coordinator: SU
Executing Partners: all partners, including Industrial Advisory Board

This task will mainly consist in holding meetings with the SPRACH industrial advisors to regularly inform them of the progress of the project and to get their input regarding possible applications we should focus on. The expectations of our industrial advisors, as well as the application areas they are interested in, are discussed in Appendix B.2.

Whenever possible, we will do our best to ensure that they can easily use the theoretical as well as the software outputs resulting from SPRACH.

In addition, most of the partners are also in contact or collaborating with other companies that will be happy to take advantage of the results.

Mechanisms for the take-up of results are further discussed in Section 5.

In the following milestones, meetings with the industrial advisors have been planned once a year (in addition to the kick-off meeting), although more frequent meetings will be organized if necessary.

WP8: Milestones

M8.1
(T0+1): Same as M0.1: kick-off meeting with the Industrial Advisory Board.
M8.2
(T0+12): Meeting with Industrial Advisory Board to discuss results and applications of interest.
M8.3
(T0+24): Meeting with Industrial Advisory Board to discuss results and applications of interest.
M8.4
(T0+36): Meeting with Industrial Advisory Board to summarize what is available from the project.

WP8: Deliverables

No deliverables per se for WP8, but application software delivered in WP7.

Personnel Requirements

 

Work Package Timeline

 

[Figure: work package timeline bar chart (bar.eps)]

Milestones

 

Deliverables

 





