
Database & System Description

We use the Oregon Graduate Institute NUMBERS95 database, which comprises continuous digits and numbers recorded over the telephone as part of a census data collection. Our baseline full-band system is a hybrid HMM/MLP system [5] with a word error rate (WERR) of 7.9% on the test set.

For our multi-band system, we divide the frequency range into four bands and, for each band, derive RASTA-PLP features together with energy and the corresponding deltas. We train one MLP on each sub-band. The total number of parameters in the four sub-band MLPs is roughly equal to that of the full-band system. The frame-by-frame information from the four sub-band streams is combined by a merger MLP, which takes the outputs of the sub-band MLPs as input and estimates the probabilities of 56 phones as output. The WERR on the test set for this merged multi-band system is 8.2%. The difference in performance between the baseline and multi-band systems is not statistically significant.
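
The merging step described above can be illustrated with a minimal sketch of a single-frame forward pass. It is not the system's actual implementation: the text gives only the number of bands (four) and the number of output phones (56). The hidden layer size, the tanh hidden activation, the assumption that each sub-band MLP itself outputs 56 phone probabilities, and all names (merge_frame, dense, random_layer) are hypothetical, and the weights are untrained random placeholders.

```python
import math
import random

NUM_BANDS = 4       # from the text: four frequency sub-bands
NUM_PHONES = 56     # from the text: the merger estimates 56 phone probabilities
HIDDEN_UNITS = 32   # assumption: the hidden layer size is not given in the text


def softmax(scores):
    """Turn a list of real scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(n_out, n_in, rng):
    """Untrained placeholder weights; a real system would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def merge_frame(band_outputs, hidden_layer, output_layer):
    """Combine the sub-band MLP outputs for one frame into phone probabilities."""
    # Concatenate the four sub-band output vectors into one input vector.
    merged_input = [p for band in band_outputs for p in band]
    # Assumption: one tanh hidden layer followed by a softmax output layer.
    hidden = [math.tanh(h) for h in dense(merged_input, *hidden_layer)]
    return softmax(dense(hidden, *output_layer))


rng = random.Random(0)
hidden_layer = random_layer(HIDDEN_UNITS, NUM_BANDS * NUM_PHONES, rng)
output_layer = random_layer(NUM_PHONES, HIDDEN_UNITS, rng)

# Stand-in sub-band outputs for a single frame (random values, not real data).
band_outputs = [softmax([rng.gauss(0, 1) for _ in range(NUM_PHONES)])
                for _ in range(NUM_BANDS)]
phone_probs = merge_frame(band_outputs, hidden_layer, output_layer)
print(len(phone_probs), round(sum(phone_probs), 6))
```

In a trained system, the resulting per-frame phone probabilities would be passed on to the HMM decoder in the same way as the outputs of the full-band MLP.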



Christophe Ris
1998-11-10