Work Package Manager: SU
Executing Partners: all partners, including Industrial Advisory Board
The objective of this WP is to effectively disseminate and exploit results arising from the project.
No deliverables per se for WP8, but application software delivered in WP7.
Task Coordinator: SU
Executing Partners: all
The objective of this task is to efficiently disseminate the results arising from the project.
The SPRACHWORKS software release, scheduled for April 1997, will include:
Different alternatives regarding the distribution license agreement were considered:
Solution (2), i.e., no commercial use agreement, was finally accepted by all of the partners.
Bundling this software package, however, requires a major effort in which CUED, SU, FPMs and ICSI will have to be actively involved. It was decided that
The directory structure for the SPRACHWORKS package was agreed upon by comparing the current directory structure of STRUT (FPMs), DRSPEECH (ICSI) and ABBOT (CUED). The resulting directory structure is represented overleaf:
Figure 8.1: Directory structure for SPRACHWORKS software release.
The key developments achieved under the HP-ABBOT development were:
In the following, we list all the publications from the SPRACH consortium related to this project. Some of these publications are attached to this document for information or included as deliverables.
Konig, Y., Bourlard, H., and Morgan, N. (1996), ``REMAP --- Experiments with Speech Recognition,'' IEEE Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing (Atlanta, GA), pp. VI:3350-3353, May 7-10, 1996.
Bourlard, H., Konig, Y., Morgan, N., and Ris, C. (1996), ``A New Training Algorithm for Hybrid HMM/ANN Speech Recognition Systems,'' Proceedings of VIII European Signal Processing Conference (EUSIPCO'96) (Trieste, Italy), pp. 101-104, Sep. 10-13, 1996.
Bourlard, H., Dupont, S., Hermansky, H., and Morgan, N. (1996), ``Towards Subband-Based Speech Recognition,'' Proceedings VIII European Signal Processing Conference (EUSIPCO'96) (Trieste, Italy), pp. 1579-1582, Sep. 10-13, 1996.
Bourlard, H., Hermansky, H., and Morgan, N. (1996), ``Towards Increasing Speech Recognition Error Rates,'' special-interest invited paper, SPEECH COMMUNICATION, vol. 18, no. 3, pp. 205-231, June 1996.
Bourlard, H. (1996), ``Copernicus and the ASR Challenge -- Waiting for Kepler,'' Invited Talk, Proceedings of ARPA Speech Recognition Workshop, Arden House, NY, pp. 157-162, Feb. 18-21, 1996.
Bourlard, H. (1996), ``Reconnaissance Automatique de la Parole: Modélisation ou Description?,'' Actes des XXIèmes Journées d'Etude sur la Parole (JEP), Plenary Talk, pp. 263-272, Avignon (France), June 1996.
Bilmes, J., Morgan, N., Wu, S.-L., and Bourlard, H. (1996), ``Stochastic Perceptual Speech Models with Durational Dependence,'' Proc. of Intl. Conf. on Spoken Language Processing (ICSLP), Philadelphia, Oct. 3-6, 1996.
Bourlard, H., Konig, Y., and Morgan, N. (1996), ``A Training Algorithm for Statistical Sequence Recognition with Applications to Transition-Based Speech Recognition,'' IEEE SIGNAL PROCESSING LETTERS, vol. 3, no. 7, pp. 203-205.
Bourlard, H. and Dupont, S. (1996), ``A New ASR Approach Based on Independent Processing and Recombination of Partial Frequency Bands,'' Proc. of Intl. Conf. on Spoken Language Processing (ICSLP), Philadelphia, Oct. 3-6, 1996.
Bourlard, H. and Morgan, N. (1996), ``Connectionist Techniques,'' to be published in the NSF-EC Survey on the STATE OF THE ART IN SPEECH AND NATURAL LANGUAGE PROCESSING, R. Cole, J. Mariani, H. Uszkoreit, A. Zaenen, and V. Zue (Eds.), Springer Verlag, 1996.
Bourlard, H. and Dupont, S. (1997), ``Subband-Based Speech Recognition,'' accepted to IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, Munich, April 1997.
Dupont, S., Bourlard, H., Deroo, O., Fontaine, V., and Boite, J.-M. (1997), ``Hybrid HMM/ANN Systems for Training Independent Tasks: Experiments on 'Phonebook' and Related Improvements,'' accepted to IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, Munich, April 1997.
Renals, S. (1996), ``Phone deactivation pruning in large vocabulary continuous speech recognition'', IEEE Signal Processing Letters, 3, 4--6.
Renals, S. and Hochberg, M. (1996), ``Efficient evaluation of the LVCSR search space using the NOWAY decoder'', Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Atlanta GA, 1, 149--152.
Neto, J., Martins, C. and Almeida, L. (1996) ``An Incremental Speaker-Adaptation Technique for Hybrid HMM-MLP Recognizer'', Proc. of Intl. Conf. on Spoken Language Processing (ICSLP), Philadelphia, 1289--1292.
Clarkson, P. and Robinson, T. (1997) ``Language Model Adaptation Using Mixtures and an Exponentially Decaying Cache'', accepted to IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, Munich, April 1997.
Waterhouse, S., Kershaw, D. and Robinson, T. (1996) ``Smoothed Local Adaptation of Connectionist Systems'', Proc. of Intl. Conf. on Spoken Language Processing (ICSLP), Philadelphia.
Cook, G., Kershaw, D., Christie, J., Seymour, C. and Waterhouse, S. (1997) ``Transcription of broadcast television and radio news: The 1996 Abbot System'', accepted to IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, Munich, April 1997.
Cook, G. and Robinson, T. (1996) ``Boosting the Performance of Connectionist Large Vocabulary Speech Recognition'', Proc. of Intl. Conf. on Spoken Language Processing (ICSLP), Philadelphia.
As a result of SPRACH and the earlier WERNICKE project, one PhD thesis entitled ``Phonetic Context-Dependency in a Hybrid ANN/HMM Speech Recognition System'' has been submitted and two more are close to completion [31,39], all at Cambridge University. Related issues were explored in Masters' theses, namely:
The SPRACHWORKS software release is scheduled for April 1997, with an alpha-release scheduled in February 1997.
Although the HP-ABBOT project is now complete, maintaining and managing this software tree is an on-going task.
Task Coordinator: SU
Executing Partners: all
The objective of this task is to efficiently exploit results arising from the project, with the Industrial Advisory Board playing a leading role.
Section 6 of the technical annex discussed the possibility of forming a ``spin-off'' company to assist in the exploitation of the results. An agreement has been reached between an existing start-up company, SoftSound, and Cambridge University whereby the university has granted a licence to the company. It is expected that this will be an on-going relationship and it is hoped that it will provide a suitable mechanism for the exploitation of results from this project.
The BBC have experimented with the ABBOTDEMO system using a collection of recordings made for BBC programmes. Although ABBOT can be spectacularly successful when dealing with speech read 'live' to a microphone, its performance is markedly worse on the recorded material which we would typically use.
The initial performance of the ABBOTDEMO system on the BBC data is as follows:
BBC would suggest that the reported variations in performance should be investigated as they are fundamental to the wider application of CSR. (BBC can put these recordings on the SPRACH ftp site if appropriate.) BBC is also planning further investigations to identify the limitations of the recogniser as more recordings are gathered and will report further when more results are available.
A new ESPRIT Long Term Research project (THISL) is scheduled to start in February 1997. The objective of this project is information retrieval from broadcast speech, and it will draw on the results of this project (as well as those of the preceding WERNICKE project).
This WP is progressing very well with a wide selection of publications arising from the project, and dissemination of project advances through follow-on projects and the planned release of the SPRACHWORKS software.