Although we have kept last year's stated goal of producing a package comprising all the partners' speech tools on a single CD-ROM, the time scale for achieving it has lengthened considerably, because the task has proved unexpectedly intricate.
One major problem was that tools from different sites followed different design philosophies - e.g. some were low-level tools emphasizing flexibility, while others provided more limited functionality in a more user-friendly form. Many of the tools from different sites work with and complement each other well, but a unified package of all tools was unwieldy and not necessarily useful. Another issue was that, in combining many tools in one bundle, a more unified approach to various stages of the training/recognition process became apparent - resulting in the development of the new tools already described in section 6.4.2.
The current vision is for all partners to work to a consistent basic directory structure (described in the previous report), but to have several independent pieces, rather than attempting the very considerable effort of making all units work interchangeably.
In the coming year, this integrated package will be extended to incorporate the remaining programs and scripts needed to train recognition systems on new speech corpora, including our BABYLEX language-model tools. We have started work on the HTML-based tutorial for this package, which will guide new users through the entire construction of a speech recognition system with the tools (http://www.icsi.berkeley.edu/dpwe/isrintro). This, in conjunction with the man pages supplied for each component, will give interested researchers everything they need to get started using our tools in their own projects.