The TCTS (Théorie des Circuits et Traitement du Signal - Circuit Theory and Signal Processing) group of Faculté Polytechnique de Mons, Mons, Belgium, has acquired broad experience over the last 14 years in the field of signal processing in general and speech processing in particular. Speech synthesis has been part of this research since 1980, text-to-speech since 1981 and speech recognition since 1983. More recently, work has also started on automatic printed and handwritten character recognition.
In the framework of the FEDER initiative, the TCTS group was recently awarded a five-year research grant from the European Communities and the Belgian Government (Ministère de la Région Wallonne) to reinforce its activities and to boost industrial activity in the region. In that framework, new research and development activities were initiated in the fields of speech recognition, text-to-speech, speech coding, speaker verification and automatic character recognition, with particular emphasis on their application in multimedia systems (e.g., multimodal user interfaces and multimedia document access). This work should lead to real-time prototype demonstration systems.
The principal areas of interest of the TCTS group are as follows:
The TCTS group also benefits from international collaborations with first-class institutions (e.g., International Computer Science Institute, Berkeley, CA) and involvement in several European projects (e.g., HIMARNNET, ESPRIT Project 6438, October 92-October 95; COST 232, 249, 250) and national projects.
In addition to research projects, the TCTS group also has R&D contracts with various industrial partners.
Cambridge University Engineering Department has approximately 1000 undergraduates and 400 graduate students and forms about 10% of the University. There are about 200 teaching and research staff, supported by a similar number of technicians and secretaries. In addition, in a typical year there are some 30 visitors, from universities and industry in the UK and overseas, working in the Department. The department consists of six divisions; the Information Engineering Division comprises the Control, Communications, Computing, and Speech, Vision and Robotics (SVR) groups.
The SVR Group was founded by the late Professor Frank Fallside in the early 1970s, when the main interests were in speech processing and control applications. In the mid 1980s the group developed a strong interest in the theory and application of neural networks, and this led to a widening of the group's research to include vision and robotics. Today the guiding principle of all research in the group is that a well-designed engineering system must be based on a sound mathematical model. In this regard, neural networks represent just one of a wide range of applicable techniques. Others include stochastic processes such as hidden Markov models, Bayesian inference, invariant transformations in 3D geometry, Wiener and Kalman filtering, classification and regression trees, and genetic algorithms.
The principal areas of interest are as follows:
The University of Sheffield has approximately 13,000 students and 1,000 full-time academic staff. The Department of Computer Science was established in 1982 and has since gained national and international renown for many aspects of its teaching and research. A wide range of undergraduate and taught MSc degrees and courses are offered in the disciplines of computer science and software engineering, with particular strengths in formal methods, natural language processing, parallel processing, speech and hearing, neural networks, graphics and communications networks. The teaching activities reflect the strength and depth of the Department's research profile.
The department has had a research group in speech and hearing since 1986. During this time the group has had active research interests in hybrid approaches to speech recognition and computational modeling of auditory scene analysis. More recently, work has been under way in neural computing, recognition of occluded speech and models of music perception. In 1992, a natural language group was established in the department with research interests including information extraction, machine translation and dialogue modeling.
With the objective of promoting a collaborative research infrastructure in speech and language, the University of Sheffield established ILASH --- the Institute for Language Speech and Hearing --- in 1994. ILASH currently links 40 academics in the university working in various aspects of speech and language research. In addition to Computer Science, university departments represented in ILASH include Electronic Engineering, Information Studies, Speech Science, Psychology and Medical Physics.
In detail, active research expertise at the University of Sheffield includes:
INESC is a private, non-profit institution, dedicated to research, development and training in advanced technological areas. INESC was created in 1980, to become an interface between the Telecommunications and Information Technology sectors and the Portuguese University system. INESC is owned by the national Telecommunications operators and the technical Universities of Lisbon, Oporto, Coimbra and Aveiro.
INESC aims to contribute towards the modernization of the Portuguese university system by providing a modern and stimulating environment for its staff and students to conduct pure and applied research, by participating actively in the urgent task of providing vocational training in advanced technological areas for young professionals, and by linking up the academic push for qualified know-how with the business pull for manpower.
The Neural Networks and Signal Processing Group of INESC was created in 1984. At the time it was mainly dedicated to activities in digital signal processing, especially digital speech coding. As it evolved, the group expanded its activities, first to speech recognition and speech synthesis. Research on neural networks started in 1986 and has produced some important scientific contributions, such as the recurrent backpropagation algorithm, a distributed decorrelation algorithm and different techniques for accelerating network training.
The group has an acknowledged reputation in the neural network field, and also has advanced expertise in hidden Markov model (HMM) speech recognizers. Through its engagement in WERNICKE, the group has acquired considerable knowledge of hybrid HMM/neural network recognizers for large vocabulary, continuous, speaker independent speech, and of speaker adaptation algorithms.
INESC's main role in this project will consist of the development of techniques for automatic porting of the recognizers to new languages, through the automatic learning of new lexica and phonological identities.
ICSI is a non-profit research organization primarily funded by long-term support from European governments (Germany, Italy, and Switzerland), and is also funded by NSF and ARPA. It is also closely affiliated with the Electrical Engineering and Computer Science Department of the University of California at Berkeley, and is located just off the central UC campus in downtown Berkeley. There is an average of about eighty scientists in residence at ICSI including permanent staff, postdoctoral fellows, visitors, and affiliated faculty and students. The Institute occupies a newly designed 28,000 square foot research facility. The facility is completely wired for high-speed communications, and contains its own computer room and hardware laboratory. Custom test boards interface experimental VLSI circuits with test equipment. The main communications link with UC is a high bandwidth microwave system, but the building is also one of a few American research sites that are part of an experimental high-speed fiber network.
ICSI is a 20-minute walk from Soda Hall, home of the Computer Science Division at UCB. This permits strong personal interaction and technical collaboration between researchers at the two institutions. A number of the scientists at ICSI are also on the UCB faculty, including the ICSI investigator most concerned with this proposal, Dr. Nelson Morgan.
ICSI includes groups that specialize in theoretical computer science, wide area networks, and Artificial Intelligence applications such as vision and knowledge representation. The group led by Nelson Morgan develops hardware and software for connectionist research, and additionally focuses on the application of speech recognition. In collaboration with the AI group, Dr. Morgan's group also conducts research into speech understanding systems.
In detail, active research expertise in Dr. Morgan's group at ICSI includes:
The British Broadcasting Corporation is the public service broadcaster in the United Kingdom. It is responsible for the making of programmes, the planning of services and the emission of the programmes for two national television channels, five national radio networks and a countrywide network of local radio stations. The BBC is also responsible for the United Kingdom's overseas broadcasting through the agency of the World Service.
The BBC's Research and Development Department is responsible for innovating and assessing new broadcasting systems and techniques and for providing specialist and operational departments with technical advice based on theoretical studies and experiments. Projects undertaken by the department touch on all aspects of broadcasting, including television cameras, recording, video, audio and optical systems, signal processing, acoustics, transmission systems, service planning, antennas and propagation.
The BBC has taken an important role in several European projects, including the Eureka, RACE and ADTT programmes. It has participated in fourteen RACE projects, leading three of them.
For some time the BBC has recognised that there are several applications in broadcasting which would benefit from speech recognition, machine translation and speech synthesis techniques. Some applications already identified include:
The BBC therefore recognises the importance of further research into speech recognition techniques in order to promote the development of practical systems for commercial use. It is accordingly prepared to contribute as an industrial advisor for the SPRACH project to ensure that the opportunities for this technology in broadcasting are not overlooked.
CSELT is deeply involved in acoustic technology research and development, and in particular in speech recognition and understanding. A group dedicated to speech recognition with neural networks has been in place since 1988. As SPRACH is about speech recognition with neural networks, CSELT has a direct interest in monitoring the technology developed in the current project and comparing it with its own technology.
The expectation for SPRACH is that the project may take a significant step forward in neural network technology for speech recognition, both from the point of view of the theoretical models and in the practical applicability of the methods (efficiency, recognition results).
Daimler-Benz has been active in the field of speech processing for many years and conducts research in many different areas of this topic. The research group provides the basis for advanced applications of speech systems, both within the products of the concern and as speech products sold in their own right. The department of speech understanding systems within Daimler-Benz Research is active in most areas of speech and language processing, including speech coding, speech enhancement, recognition, understanding, dialogue systems and speech synthesis.
The activities include prototype demonstrations of advanced applications to evaluate the opportunities and risks of new technologies and to develop the ergonomic basis for advanced information technologies, especially in the area of new forms of man-machine interaction using speech and language technologies. This includes efficient access to stored information using a flexible speech dialogue.
Some of the relevant activities include:
Daimler-Benz is highly interested in the outputs of SPRACH because speech recognition remains one of its core activities in speech processing. In particular, the robustness of such systems will determine their future applications. We expect substantial results concerning the scope of application of such speech recognition systems, especially addressing specific questions of robustness, ease of vocabulary design and specific multilingual aspects.
The Advanced Human Interface of Thomson-CSF is part of the Corporate Research Laboratories of Thomson. It conducts research in the area of Man-Machine Interface and Language Engineering for the Thomson group (Thomson-CSF and Thomson Consumer Electronics).
Thomson-CSF is active in the areas of professional and defense electronics (detection, Control and Command systems -- including Air Traffic Control, etc.) and Computer Services, while Thomson Consumer Electronics' activities are in consumer electronics and broadcasting systems. Thomson Consumer Electronics is one of the leading companies in TV set manufacturing (under several brands: Thomson, Saba, Telefunken, Ferguson, Brandt, Nordmende, Proscan, RCA, General Electric).
The laboratory has an average staff of 6 to 8 full-time research engineers and 4 to 8 students (graduate or undergraduate) who have experience in one or several areas of Man-Machine Interaction.
The main topics investigated in the laboratory are:
The most recent prototypes developed in the laboratory are in the field of Man-Machine Dialogue and Spoken Command systems. These include: