The documents were then processed by NED; each word was either marked up with one of the tags described earlier or not marked at all. In cases of unresolvable type ambiguity, the version of NED used here adds a plain ``NAME'' tag. Tags on in-vocabulary words were then removed before the statistics were counted. For simplicity, only the NE tags (``ORGANISATION'', ``PERSON'', ``LOCATION'') and the ambiguity tag (``NAME'') were used (i.e., four tags in total). Temporal and number expression tags were not used, although they might also provide some useful information about the context. An identifier e_i was set for each word w_i according to Definition (3.1), and three sets of LMs were generated.
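The identifier mapping can be sketched as follows. This is a minimal illustration, assuming (as the text suggests) that an in-vocabulary word keeps its own identity, an OOV named entity collapses onto its tag, and remaining OOV words fall back to UNK under the UNK extension; the function and variable names are illustrative, not the original implementation.

```python
# Illustrative sketch of the identifier assignment (Definition 3.1 as
# described in the text): e_i is the word itself when in-vocabulary,
# otherwise the NED tag; "UNK" is the fallback used by the UNK extension.
NE_TAGS = {"ORGANISATION", "PERSON", "LOCATION", "NAME"}

def identifier(word, tag, vocabulary):
    """Map (word, NED tag) to the identifier e_i used for LM training."""
    if word in vocabulary:
        return word   # in-vocabulary: any tag is discarded
    if tag in NE_TAGS:
        return tag    # OOV named entity: collapse onto its tag
    return "UNK"      # OOV and untagged (UNK-extension variant)
```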
Equations (3.2) and (3.3) may be used to estimate the language model probabilities when decoding. Alternatively, (3.2) may be approximated by a maximization.
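Since Equation (3.2) is not reproduced here, the following sketch assumes it has the usual class-LM mixture form, P(w|h) = Σ_e P(w|e) P(e|h); the maximization approximation then keeps only the largest term of the sum. The dictionary layout and names are assumptions for illustration.

```python
# Assumed class-LM form of Eq. (3.2): P(w|h) = sum_e P(w|e) P(e|h).
def p_word_sum(w, h, p_w_given_e, p_e_given_h):
    """Exact mixture over identifiers e (assumed form of Eq. 3.2)."""
    return sum(p_w_given_e[e].get(w, 0.0) * p_e
               for e, p_e in p_e_given_h[h].items())

def p_word_max(w, h, p_w_given_e, p_e_given_h):
    """Maximization approximation: keep only the dominant term."""
    return max(p_w_given_e[e].get(w, 0.0) * p_e
               for e, p_e in p_e_given_h[h].items())

# Toy numbers (invented) showing the max is a lower bound on the sum.
p_w_given_e = {"PERSON": {"smith": 0.01}, "LOCATION": {"smith": 0.001}}
p_e_given_h = {"said": {"PERSON": 0.3, "LOCATION": 0.1}}
exact = p_word_sum("smith", "said", p_w_given_e, p_e_given_h)
approx = p_word_max("smith", "said", p_w_given_e, p_e_given_h)
```

The approximation never exceeds the exact mixture, since a maximum over non-negative terms is bounded by their sum.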
Speech recognition experiments were carried out on the DARPA North American Business News task, using the context-independent ABBOT system. The OOV rate of the test set was evaluated by comparing the transcription with the vocabularies of the three LM sets described earlier. There were a total of 6059 words in the transcription; 5809 (95.9%) were included in the trigram vocabulary. About 70% of the 250 OOV words were included in the tagged unigram word set, reducing the effective OOV rate from 4.1% to 1.3%. The distribution across tags is shown in Table 3.1.
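The coverage figures quoted above can be verified directly from the counts in the text; note that the quoted 70% coverage is approximate, so the recomputed effective rate lands near, rather than exactly at, the quoted 1.3%.

```python
# Recomputing the OOV figures from the counts given in the text.
total_words = 6059
in_trigram_vocab = 5809
oov = total_words - in_trigram_vocab             # 250 OOV words
oov_rate = 100.0 * oov / total_words             # about 4.1%

tagged_coverage = 0.70                           # approximate, from the text
effective_oov = oov * (1.0 - tagged_coverage)    # OOVs still uncovered
effective_rate = 100.0 * effective_oov / total_words  # roughly 1.2-1.3%
```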
Decoding was performed by the single-pass NOWAY decoder. Note that, when using the tagged LMs, decoding was performed with the full tagged vocabulary and its associated pronunciations, represented as a pronunciation tree of phone-model nodes. In addition to the three LM sets, a variation of the tagged LM with the UNK extension was also tested. This variation used a flat estimate of P(w|e) (set to 10^-5, since there were about 10^5 words in the tagged word set).
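The two choices of P(w|e) compared in the experiments can be sketched as follows: a flat estimate that assigns every word the same uniform probability, versus a relative-frequency unigram estimate from tag-specific counts. The function name and data layout are assumptions for illustration.

```python
from collections import Counter

# Flat estimate: uniform 1e-5 per word, since the tagged word set
# holds on the order of 10^5 words (as stated in the text).
FLAT_P = 1e-5

def unigram_p_w_given_e(tagged_words):
    """Relative-frequency unigram estimate of P(w|e), per tag e."""
    counts = {}
    for word, tag in tagged_words:
        counts.setdefault(tag, Counter())[word] += 1
    return {tag: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for tag, ctr in counts.items()}

# Toy corpus (invented) of (word, tag) pairs:
model = unigram_p_w_given_e(
    [("smith", "PERSON"), ("smith", "PERSON"), ("jones", "PERSON")])
```

The experimental result quoted below favours the unigram estimate over the flat one, which is intuitive: frequent names within a tag class receive correspondingly higher probability mass.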
Table 3.2 shows the word error rate (WER) for each LM set on this task. The tagged LMs, with an effective vocabulary of 126 thousand words, reduced the WER by about 14% compared with the conventional LM. Although there was no significant difference in WER between tagging OOV words with their NE classes and tagging them all with UNK, there was an improvement in performance from using an estimated unigram model for P(w|e) over a flat estimate. This table also gives an indication of how many words that were OOV relative to the trigram vocabulary (but present in the tagged unigram word set) were detected. 2.8% of the words in the test set fall into this category, and these results indicate that the tagged LM approach detected 70% of them.