
   
Task 6.2: Technical Description

During this year, ICSI has developed a number of new tools as part of our speech recognition system:

FEATOOLS is an environment for experimenting with different forms of speech features. The 'plumbing' of managing lists of sound files and collating outputs into feature files is handled automatically, so the experimenter has only to provide a very simple implementation of the feature calculation routine that uses 'bare bones' input and output formats. FEATOOLS then quickly runs this routine over an entire corpus, distributing the work across multiple workstations on a network where available.
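The division of labour described above can be illustrated with a minimal Python sketch. It is not the FEATOOLS interface: the names `log_energy`, `frames` and `run_over_corpus` are hypothetical, the corpus is held in memory as lists of samples rather than as sound files, and a local thread pool merely stands in for the multiple networked workstations.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def log_energy(frame):
    """Hypothetical 'bare bones' feature routine: one frame in, one vector out."""
    energy = sum(s * s for s in frame)
    return [math.log(energy + 1e-10)]  # small floor avoids log(0) on silence


def frames(samples, frame_len, hop):
    """Yield successive fixed-length frames from a list of samples."""
    for start in range(0, len(samples) - frame_len + 1, hop):
        yield samples[start:start + frame_len]


def run_over_corpus(corpus, feature_fn, frame_len=4, hop=2, workers=2):
    """Apply feature_fn to every frame of every utterance and collate the results.

    corpus maps an utterance name to its list of samples (an assumption made
    for this sketch; a real system would read sound files from a list).
    """
    def process(item):
        name, samples = item
        return name, [feature_fn(f) for f in frames(samples, frame_len, hop)]

    # The thread pool stands in for distributing work across workstations.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process, corpus.items()))


corpus = {"utt1": [1.0] * 8, "utt2": [0.0] * 6}
features = run_over_corpus(corpus, log_energy)
```

Only `log_energy` would change from one experiment to the next; the framing, scheduling and collation code stays the same, which is the point of separating the 'plumbing' from the feature routine.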

FEACALC is a new encapsulation of the RASTA-PLP routines to cover the alternative situation where the desired feature form falls within the conventional parameters. By exploiting the new "dpwelib" soundfile interface library as well as the "quicknet" feature-file interfaces, FEACALC provides a single program to convert a set of soundfiles - in any of the formats usually encountered - directly into a feature archive of the desired form, replacing as many as six other programs in our previous feature-calculation procedure.

DPWELIB/SNDUTILS is the universal soundfile interface library used in FEACALC, along with a set of command-line utilities for playing, recording, comparing and modifying soundfiles. The library presents a uniform interface for reading and writing AU, AIFF, WAV, NIST and ESPS soundfiles, as well as raw binary and the "Abbot online" format defined by Cambridge.
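One ingredient of such a uniform interface is recognising the file format from its header, so that callers need not specify it. The following Python sketch shows the idea for four of the formats listed, using their well-known header signatures. It is an illustration only, not the dpwelib API: the function name `detect_format` is hypothetical, and ESPS and the "Abbot online" format are left out because their signatures are not given here.

```python
def detect_format(header: bytes) -> str:
    """Guess a soundfile format from the first bytes of the file.

    Falls back to "RAW" when no known signature matches, mirroring the
    idea of treating unrecognised data as headerless binary.
    """
    if header[:4] == b".snd":
        return "AU"  # Sun/NeXT audio magic number
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"  # RIFF container holding WAVE data
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"  # IFF container, plain or compressed AIFF
    if header[:7] == b"NIST_1A":
        return "NIST"  # NIST SPHERE text header
    return "RAW"


# Example: classify a synthetic WAV header.
kind = detect_format(b"RIFF" + bytes(4) + b"WAVE" + b"fmt ")
```

A reader built on this would call `detect_format` on the first twelve or so bytes of the file and then dispatch to a format-specific header parser, leaving the calling program unaware of which format it was given.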

APRL/RECOGVIZ (Audio PRocessing Language / RECOGnizer VIsualiZation) is a collection of sound-processing, user-interface and visualization components implemented as loadable extensions to the industry-standard TCL scripting language. This combination allows for very rapid development of professional-looking interactive systems. We have re-implemented our standard speech recognition demonstrator ("BeRP") within this system, and have also produced various other visualization tools. RECOGVIZ hooks into each stage of our conventional recognition chain to display the spectrogram, feature vectors, posterior probabilities, and phoneme and word label assignments; it can also display different recognizer configurations side-by-side for diagnosing the relative merits of different approaches.


Christophe Ris
1998-11-10