Work Package Manager: FPMs
Executing Partners: CUED, FPMs, INESC, SU, ICSI
We have incorporated new hardware and software from the same group at ICSI that provided the successful system infrastructure for WERNICKE with the RAP computers. The new system, which is vector-based, is required in SPRACH to (1) support our research (e.g., fast assessment of new research results on moderate-sized tasks), and (2) provide fast training on large databases. Towards these ends the ICSI team has developed a vector supercomputer on a chip called the Torrent. Software has been developed for this chip that is analogous to the routines currently used on the RAP. Also, in the framework of WERNICKE, ICSI has finished building a single-processor S-bus board called SPERT that has been made available to the SPRACH partners for their research and development work.
For the first year we had the following milestones:
For the first year we had the following deliverables:
The text of the report below, together with Annexes 6.1 and 6.2, constitutes this deliverable. Software corresponding to deliverable 6.2 is available on request.
Task Coordinator: FPMs
Executing Partners: ICSI
As noted in our proposal and technical annex, this task is essentially a placeholder to correspond to the acquisition of SPERT hardware by the research partners.
All partners have received working SPERT hardware from ICSI.
As noted previously, the Torrent vector microprocessor and the SPERT board that is built around it were designed and constructed under separate funding. The boards take up two S-bus slots and are typically used with Sparc 5 systems. They are built using Silicon Valley subcontractors, but all final testing is done at ICSI.
Task Coordinator: FPMs
Executing Partners: ICSI
Tools and libraries are required to support neural network training and use on the Torrent processor. The software being developed falls into two classes: the SPERT Development Environment, providing the tools and libraries needed to build and run SPERT programs, and the QuickNet neural network training package with its associated matrix/vector libraries.
An initial release of the SPERT Development Environment was made in January, achieving milestone M6.2. Two subsequent releases have resulted in a stable environment for SPERT program development. The first public release of the QuickNet package was made in February, achieving milestone M6.3 and providing a complete training system (deliverable D6.2).
During February, researchers from FPMs, INESC, CUED and SU visited ICSI for a one-week training course. Topics covered included the Torrent vector architecture, how to write, build and execute SPERT programs, and how to use and modify the QuickNet programs. The associated documentation package included the Torrent Architecture Manual (deliverable D6.1).
The SPERT Development Environment was largely funded by other sources and was mostly complete at the start of the project. The majority of the effort on this package has been ongoing support rather than major new developments: commands have been enhanced, bugs fixed and libraries incrementally extended. One major addition is a new device driver for the SPERT board. This provides added performance and greater security when using SPERT boards on multi-user workstations.
The main target of development effort has been the QuickNet neural network training package and associated Matrix/vector libraries. The work can be divided into several projects.
In the traditional hybrid connectionist speech recognition system, the MLP or RNN is trained using a file containing floating point feature vectors and integer phonemic labels; this is the approach used in the original QuickNet training program. In contrast, the recently developed streams-based training is more flexible, combining multiple feature files, each with independent preprocessing and windowing whilst also allowing soft target vector files. This comprehensive set of basic facilities allows simple and efficient training of nets for many of the research areas being investigated in the SPRACH project, including full discriminant training (REMAP-task T5.2) and multiband recognition (T5.4).
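The stream combination described above can be sketched as follows (a hypothetical Python illustration, not the actual QuickNet API; the function names, feature types and context sizes are invented for the example): each stream pairs a feature file with its own context window, and the independently windowed streams are concatenated frame by frame before presentation to the net.

```python
import numpy as np

def window(features, context):
    """Stack each frame with its +/-context neighbouring frames
    (edges padded by repetition), as in per-stream windowing."""
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    n = len(features)
    return np.hstack([padded[i:i + n] for i in range(2 * context + 1)])

def combine_streams(streams):
    """Concatenate independently windowed feature streams frame by frame."""
    return np.hstack([window(f, c) for f, c in streams])

# Two hypothetical feature streams with different context windows.
plp = np.random.randn(100, 12)     # e.g. PLP features, +/-4 frames of context
energy = np.random.randn(100, 1)   # e.g. an energy stream, no context
inputs = combine_streams([(plp, 4), (energy, 0)])
print(inputs.shape)                # (100, 12*9 + 1) = (100, 109)
```

Soft targets then simply replace the integer label file with a file of per-frame probability vectors of the same frame count.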
Considerable effort has been expended to verify the accuracy of the fixed-point MLPs produced by the SPERT system. An initial version of the training program used 16 bit weights; this was efficient and proved adequate for classification tasks on a range of net sizes. However, the implementation produced sub-standard nets for some tasks, particularly when estimating probabilities. Fortunately, the flexibility of the fully-programmable Torrent vector unit allowed the MLP to be recoded to use 32 bit weights where necessary. The new 32 bit training class is typically 30% slower than the 16 bit version, but the recognition performance of the resulting nets is comparable to that of equivalent floating point nets; on larger networks, or on networks trained in ``bunch'' mode (updating weights after a block of patterns, e.g., 32), this performance cost essentially disappears.
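As a rough illustration of why ``bunch'' mode recovers the lost speed (a hypothetical Python/NumPy sketch, not the Torrent fixed-point code; the layer, learning rate and bunch size are invented for the example): accumulating the error over a bunch of patterns turns many per-pattern vector operations into a single matrix-matrix product, and the weights are updated only once per bunch.

```python
import numpy as np

def bunch_mode_epoch(W, X, T, lr=0.1, bunch=32):
    """One pass of gradient descent on a linear layer with squared error,
    updating the weights only after each bunch of patterns. The gradient
    for a whole bunch is one matrix-matrix product instead of `bunch`
    separate outer-product updates."""
    for start in range(0, len(X), bunch):
        xb, tb = X[start:start + bunch], T[start:start + bunch]
        err = xb @ W - tb                # forward pass for the whole bunch
        grad = xb.T @ err / len(xb)      # one matrix-matrix product
        W = W - lr * grad                # single weight update per bunch
    return W

rng = np.random.default_rng(0)
W_true = rng.normal(size=(5, 3))
X = rng.normal(size=(256, 5))
T = X @ W_true                           # synthetic targets
W = bunch_mode_epoch(np.zeros((5, 3)), X, T)
```

The trade-off is that the weights seen within a bunch are slightly stale, which is why the per-pattern and bunch-mode nets are compared for recognition performance rather than assumed identical.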
Although it is intended that the SPERT board be used for the majority of neural network training tasks, there are situations when floating-point training on a workstation is more appropriate. Specifically, this may be sensible when no SPERT board is available, when a small net is being trained, or during fixed point debugging or validation. To ensure maximum performance in these situations, related work at ICSI and the University of California at Berkeley has been leveraged to produce efficient floating point matrix/vector routines.
The development plans for task 6.2 in the second year of the project are:
Task Coordinator: SU
Executing Partners: CUED, FPMs, INESC, SU
The baseline (WERNICKE) recognizer runs on standard Unix workstations (including SUN, HP, SGI and PC/Linux). However, the training systems require specialized high-performance hardware. In WERNICKE, training was performed using the RAP hardware and software platform. The use of Torrent-based systems in SPRACH will require applications code (particularly neural network training software) to be ported from the RAP to the Torrent. Although this porting effort will be reduced by the similarity of the new matrix-vector calls to the older RAP libraries, additional effort will be required by the partners to fully exploit the architecture of SPERT and the SPRACHSTATION. This task will also cover any coding effort necessary to implement new algorithms on this hardware.
A recurrent network training system has been written for the SPERT board, but it does not yet match the recognition accuracy of the original RAP system. Once fully operational, however, this system has the potential to be far more flexible and useful than the old RAP system (e.g., allowing training of much larger networks).
FPMs has ported the ICSI MLP algorithm to its STRUT (see section 7.3.3) software.
Concerning WSJ0 MLP training on SPERT, Sheffield has not yet been successful, but appears to be getting close.
Difficulties with training have arisen from the limited precision of the fixed point libraries. Although these have been addressed for the MLP, many of the functions that require 32 bit processing in the optimisation technique used for the recurrent network do not yet exist.
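The precision problem can be illustrated with a toy fixed-point model (hypothetical Python; the word lengths and fractional-bit allocations below are invented for illustration and do not correspond to SPERT's actual formats): a gradient step smaller than half the 16 bit quantization step simply vanishes, while a 32 bit representation retains it.

```python
import numpy as np

def quantize(x, frac_bits, word_bits):
    """Round to a signed fixed-point grid with `frac_bits` fractional bits
    and saturate to the `word_bits`-wide integer range."""
    scale = 2.0 ** frac_bits
    lo, hi = -(2 ** (word_bits - 1)), 2 ** (word_bits - 1) - 1
    return np.clip(np.round(x * scale), lo, hi) / scale

w = 0.5                 # a weight exactly representable in both formats
update = 1e-4           # a typical small gradient step
w16 = quantize(w + update, frac_bits=12, word_bits=16)
w32 = quantize(w + update, frac_bits=28, word_bits=32)
print(w16 - w)          # 0.0: the update is below the 16 bit resolution
print(w32 - w)          # ~1e-4: the update survives in 32 bit
```

Repeated over many patterns, such vanishing updates stall training, which is consistent with the need for 32 bit versions of the missing functions.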
Once the recurrent network code is working properly, future work can include a full assessment of different optimisation techniques, as well as measurement of the performance gain from increasing the number of state units.
Sheffield will clean up the WSJ0 training path, then experiment with larger training sets (e.g., WSJ0 + WSJ1) using MLPs.
Partners have received their SPERT boards, and neural networks have been ported to fixed point. The recurrent network version still needs improvement, while the MLP implementation is now satisfactory. FPMs has developed full speech recognition and training software.