
Subband-based speech recognition

As a particular case of the multi-stream approach, a new speech recognition system based on independent processing and recombination of partial frequency bands was recently developed and tested on several clean and noisy databases. Experiments were performed on the NUMBERS'93 database, a continuous-speech telephone database collected by the CSLU at the Oregon Graduate Institute [18]. It consists of numbers spoken naturally over telephone lines on the public switched network. The subband-based system had four bands and used subband log-RASTA-PLP features. Recombination was done at the state level by a multilayer perceptron with one hidden layer. The results, reported in Figure 5.5, show that the multiband approach degrades much less than the classical full-band approach when the noise is band-limited.
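The state-level recombination described above can be illustrated with a minimal sketch. It assumes that each of the four bands produces a vector of state posteriors for the current frame and that the merging perceptron takes the concatenation of these vectors as input; the text does not specify the exact input representation, so this is an assumption. The number of states, the hidden-layer size, the tanh activation and the random (untrained) weights are all hypothetical values chosen for illustration.

```python
import math
import random

N_BANDS = 4    # from the text: the system used four subbands
N_STATES = 3   # hypothetical number of HMM states, kept small for illustration
N_HIDDEN = 5   # hypothetical hidden-layer size


def softmax(scores):
    """Turn a list of scores into probabilities that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, weights, bias):
    """Compute weights @ x + bias, with one row of weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def recombine(band_posteriors, w1, b1, w2, b2):
    """Merge per-band state posteriors for one frame with a one-hidden-layer MLP.

    band_posteriors: list of N_BANDS lists, each holding N_STATES probabilities.
    Returns a single list of N_STATES merged state probabilities.
    """
    x = [p for band in band_posteriors for p in band]   # concatenate the bands
    hidden = [math.tanh(a) for a in affine(x, w1, b1)]  # single hidden layer
    return softmax(affine(hidden, w2, b2))


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Untrained stand-in weights; a real system would train them on labelled frames.
w1 = random_matrix(rng, N_HIDDEN, N_BANDS * N_STATES)
b1 = [0.0] * N_HIDDEN
w2 = random_matrix(rng, N_STATES, N_HIDDEN)
b2 = [0.0] * N_STATES

# Stand-in outputs of the four independent subband classifiers for one frame.
band_posteriors = [softmax([rng.gauss(0.0, 1.0) for _ in range(N_STATES)])
                   for _ in range(N_BANDS)]
merged = recombine(band_posteriors, w1, b1, w2, b2)
```

In such a scheme the merged probabilities would replace the full-band state probabilities during decoding, so a band corrupted by noise can in principle be down-weighted by the trained merger.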

Figure 5.5: Error rate for speech with band-limited noise added in the first frequency band (first formant) at various SNR levels. The solid line is the multiband system; the dotted line is the full-band system.
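For readers who want to reproduce a test condition of this kind, the sketch below shows one common way to add noise to speech at a chosen SNR, using the usual definition SNR = 10 log10(P_speech / P_noise). The text does not say how the noise was generated or scaled in these experiments, so this is only an assumed procedure. The noise is taken to be already restricted to the first frequency band, and the signals used here are synthetic stand-ins.

```python
import math
import random


def mean_power(signal):
    """Average squared amplitude of a sampled signal."""
    return sum(s * s for s in signal) / len(signal)


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain to apply to the noise so that speech/noise power equals snr_db."""
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    return math.sqrt(target_noise_power / mean_power(noise))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


rng = random.Random(0)
# Stand-in signals: a sine wave for speech, Gaussian samples for the noise.
# The noise is assumed to have been band-limited beforehand (not shown here).
speech = [math.sin(2.0 * math.pi * 0.05 * t) for t in range(400)]
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

snr_db = 10.0
noisy = mix_at_snr(speech, noise, snr_db)

# Check the SNR actually obtained after scaling the noise.
gain = noise_gain_for_snr(speech, noise, snr_db)
achieved_snr_db = 10.0 * math.log10(
    mean_power(speech) / mean_power([gain * n for n in noise]))
```

Sweeping `snr_db` over a range of values would give one noisy test set per point on the horizontal axis of a plot like Figure 5.5.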

Christophe Ris