Next: Work Package WP4: Application Up: Structure, Workpackages and Tasks Previous: Work Package WP2: Lexica

Work Package WP3: Language Models (LM) and LM Adaptation

We have studied adaptive language models. Whilst they give reasonable reductions in perplexity, the reductions in word error rate that were expected to follow were not significant for the Broadcast News task. This has opened a new line of investigation aimed at explaining these findings [4], and work has been carried out to find measures of the power of a language model that are better than perplexity. Work on language modelling and language model adaptation using Latent Semantic Analysis has continued (described in a submission to the Journal of Natural Language Engineering [12]), but this was not the main focus of the year 3 work.
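
As a reminder of the two measures contrasted above, the following is a minimal illustrative sketch in Python; it is not the project's evaluation code. It assumes a toy add-one-smoothed unigram model (with the simplification that the smoothing vocabulary includes the test words) and computes word error rate as word-level edit distance divided by reference length. The function names are hypothetical.

```python
import math
from collections import Counter


def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of an add-one-smoothed unigram model on test_tokens.

    Simplifying assumption: the vocabulary used for smoothing is the
    union of the training and test words.
    """
    counts = Counter(train_tokens)
    vocab = set(train_tokens) | set(test_tokens)
    total = len(train_tokens) + len(vocab)
    log_prob = sum(math.log((counts[w] + 1) / total) for w in test_tokens)
    return math.exp(-log_prob / len(test_tokens))


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

Perplexity depends only on the probabilities the model assigns to the reference text, whereas word error rate depends on the recogniser's final output, so a gain in the first need not produce a gain in the second.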

We have continued to investigate the use of named entities, both as a means of producing semantically tagged language models (discussed in a submission to ICASSP-99 [13]) and for the task of automatically extracting named entities. The work on named entity extraction, using both statistical and rule-based approaches, is currently being evaluated as part of the Hub-4E "IE-NE" spoke (due: 14 Dec 1998), and is discussed in the Broadcast News report. We will present the final system at the review meeting in Lisbon, but it is unlikely that NIST will have processed the results by then. The results will be presented in full in our site report at the DARPA Broadcast News Transcription meeting (February 1999).


Christophe Ris
1999-07-06