
Broadcast News system development

Over the last two years CUED has developed a baseline Broadcast News system. This year, we are selecting a number of promising developments from the SPRACH project and looking for ways to integrate them into the baseline system. It was agreed that, to make it easier to integrate ideas from distant partners, we would, at least initially, use n-best lists and word lattices from the baseline system; these would be rescored using the partners' new approaches. Work on the baseline system that generates the lattices will continue at CUED.
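As an illustration of the n-best rescoring arrangement described above, the following is a minimal sketch, not the procedure used by any of the sites. It assumes each hypothesis carries a log-domain score from the baseline system and that a partner's new model can be called as a function returning a log-domain score for a word sequence. The names `Hypothesis`, `rescore_nbest`, `toy_score` and the linear interpolation weight are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    words: List[str]
    baseline_score: float  # assumed: log-domain score from the baseline system


def rescore_nbest(
    nbest: List[Hypothesis],
    new_score: Callable[[List[str]], float],
    weight: float = 0.5,
) -> List[Hypothesis]:
    """Return the hypotheses sorted best-first by an interpolated score.

    The combination rule (linear interpolation of log scores) is an
    assumption chosen for illustration only.
    """
    def combined(h: Hypothesis) -> float:
        return (1.0 - weight) * h.baseline_score + weight * new_score(h.words)

    return sorted(nbest, key=combined, reverse=True)


# Toy usage: a stand-in "new model" that prefers hypotheses containing "news".
def toy_score(words: List[str]) -> float:
    return 0.0 if "news" in words else -5.0


nbest = [
    Hypothesis(["the", "news"], -10.0),
    Hypothesis(["the", "noose"], -9.5),
]
best = rescore_nbest(nbest, toy_score, weight=0.5)[0]
print(" ".join(best.words))
```

With a weight of zero the ranking reduces to the baseline ordering, which gives a simple check that the rescoring step itself does not disturb the baseline result.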

The cross-site collaborative work will primarily consist of the following components:

In addition to these collaborative efforts, the baseline system itself will be further developed at CUED. The major developments planned are noise estimation, cross-word CD modelling and training, and vocal tract length normalisation/adaptation using covariance.

In the next two months, we will agree on common development test sets with known characteristics and a core 64K vocabulary. Sheffield has also promised to provide a new release of the Noway decoder. Sheffield and ICSI will generate a baseline set of static pronunciation models, incorporating confidence measures, and Sheffield will generate an initial list of compound lexical items (e.g. top 200 bigrams, or most frequent named entities). Sheffield and CUED will test and release a segmenter, and CUED will conduct tests on their existing BN system using the new decoder, different decoder parameters as suggested by Sheffield, and the new segmenter. ICSI will also build a preliminary MLP-based system in this period incorporating RASTA-PLP. FPMS will build a preliminary MLP-based system as well.
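One simple way to draw up a candidate list of compound lexical items of the "top 200 bigrams" kind mentioned above is to count adjacent word pairs in a tokenised training text. The sketch below is an assumption about how such a list might be produced, not Sheffield's actual method; the function name `top_bigrams` and the input format (a list of tokenised sentences) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_bigrams(
    sentences: Iterable[List[str]], n: int = 200
) -> List[Tuple[Tuple[str, str], int]]:
    """Return the n most frequent adjacent word pairs with their counts."""
    counts: Counter = Counter()
    for words in sentences:
        # zip pairs each word with its successor within the same sentence,
        # so bigrams never span a sentence boundary.
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)


# Toy usage on two short tokenised sentences.
corpus = [
    ["new", "york", "times"],
    ["new", "york", "city"],
]
for pair, count in top_bigrams(corpus, n=3):
    print(pair, count)
```

Raw frequency is only one possible selection criterion; a real list would probably also need filtering, for example to exclude pairs made up entirely of function words.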

It should be noted that ICSI has vastly greater SPERT-based computational capability than the other sites. This will be made available to the collaboration for all of the MLP-based training, particularly as time gets short before the evaluation.

By the end of the SPRACH project, we will have an improved Broadcast News system incorporating the most effective ideas from each of the sites. Those that are most promising will be considered for inclusion at a more tightly integrated level (i.e., in the decoder) for the THISL project.


Christophe Ris
1999-07-06