
Task 6.1: Hardware


Task Coordinator: FPMs
Executing Partners: ICSI

Because the SPERT vector-processor board plugs into the standard SBUS expansion bus, multiple boards can be attached to a single host computer. This year, we addressed the technical issues involved in using such a "MultiSPERT" configuration to increase processing throughput. Our best result was a peak training rate of over 500 MCUPS (millions of connection updates per second) on a 5-board system; thanks to the parallelism and other speed-ups, this is more than ten times the previous SPERT performance.
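As a rough illustration of the MCUPS figure quoted above, the following sketch computes a training rate from a network size and a timing. It assumes the usual convention that every pattern presentation counts as one update per connection; the function name and the example numbers (a net with one million connections, 500 patterns per second) are hypothetical and are not measurements from the MultiSPERT system.

```python
def mcups(n_connections, n_patterns, seconds):
    """Training rate in millions of connection updates per second.

    Assumes each pattern presentation counts as one update per connection
    (a common convention; the report does not spell out its own).
    """
    return n_connections * n_patterns / seconds / 1e6


# Hypothetical example: 1,000,000 connections, 500 patterns in one second.
total_rate = mcups(1_000_000, 500, 1.0)

# Averaged over the boards of a 5-board system.
n_boards = 5
per_board_rate = total_rate / n_boards
```

By the same arithmetic, a peak of over 500 MCUPS on five boards corresponds to an average of over 100 MCUPS per board.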

We have worked on two alternative approaches to exploiting multiple processors for training neural nets. "Network Parallel" divides the net into smaller pieces, one per processor, permitting very large networks with thousands of hidden units, suitable for the recognition of informal speech. "Pattern Parallel" presents different training examples to each processor to accelerate the training of smaller nets; however, the additional communication required in this mode makes it suitable only with a large "bunch size" (i.e., the number of patterns presented between network weight updates). Both approaches work well for our phoneme probability estimator tasks, although the network parallel approach is the more interesting, since it allows rapid training of nets larger than was previously possible.

One major issue in harnessing the power of multiple processors is avoiding other bottlenecks. Much of the challenge in constructing the software drivers for the MultiSPERT system lay in using pipelining strategies to minimize bus conflicts (times when more than one board needs to communicate with the host processor). To this end, the SPERT-host data transfer was reimplemented in hardware, speeding up raw transfer rates by a factor of 4. All boards built since May 1997 include this enhancement, which also gives a minor improvement when training smaller nets on a single-board system.
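The two partitioning schemes described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the MultiSPERT driver code: the "net" is a single linear layer trained with squared error, the boards are simulated by ordinary loops, and all function names are hypothetical. The last function is a toy cost model (one fixed communication cost per bunch, an assumption rather than a measured figure) that shows why pattern parallelism only pays off with a large bunch size.

```python
def matvec(W, x):
    """Forward pass of one linear layer: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def grad_one(W, x, t):
    """Gradient of 0.5 * ||W x - t||^2 with respect to W for one pattern."""
    y = matvec(W, x)
    return [[(yi - ti) * xj for xj in x] for yi, ti in zip(y, t)]


def add(A, B):
    """Element-wise sum of two matrices stored as lists of rows."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def zeros_like(W):
    return [[0.0] * len(row) for row in W]


def bunch_grad(W, patterns):
    """Single-processor reference: gradient summed over one bunch."""
    G = zeros_like(W)
    for x, t in patterns:
        G = add(G, grad_one(W, x, t))
    return G


def pattern_parallel_grad(W, bunch, n_boards):
    """Pattern parallel: every board holds all of W and sees part of the bunch."""
    shards = [bunch[b::n_boards] for b in range(n_boards)]
    partial = [bunch_grad(W, shard) for shard in shards]  # concurrent on real boards
    G = zeros_like(W)
    for P in partial:  # one communication step per bunch
        G = add(G, P)
    return G


def network_parallel_forward(W, x, n_boards):
    """Network parallel: every board owns a block of units (rows of W)."""
    n = len(W)
    y = []
    for b in range(n_boards):
        lo, hi = b * n // n_boards, (b + 1) * n // n_boards
        y.extend(matvec(W[lo:hi], x))  # concurrent on real boards
    return y


def pattern_parallel_speedup(bunch_size, n_boards, t_pattern, t_comm):
    """Toy cost model (assumed): per-board compute plus a fixed cost per bunch."""
    serial = bunch_size * t_pattern
    per_board = -(-bunch_size // n_boards)  # ceiling division
    return serial / (per_board * t_pattern + t_comm)
```

In this sketch both schemes reproduce the single-processor result exactly; they differ in what must be exchanged. Pattern parallel exchanges a full set of weight gradients once per bunch, so its cost is amortized only when the bunch is large, while network parallel keeps each board's weights local and exchanges unit activations instead.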

As is inevitable with work that pushes the envelope in this way, we have uncovered many bugs in different parts of the system. Having isolated and worked around several problems with the commercial SBUS expansion boxes used to attach the boards to the host, we are now tracking down the cause of intermittent problems that appeared in our two MultiSPERT systems once we began using them for routine training. We hope to overcome these problems shortly.


Christophe Ris
1998-11-10