Current System

As mentioned earlier, our final goal is to build a speaker-independent, large-vocabulary continuous speech recognition system for European Portuguese. There is still a long way to go from the baseline system to that goal. In the work reported in this section, we use a small initial part of the PÚBLICO database, collected to validate and test the overall process of sentence selection and recording. This setup set comprises 10 speakers (5 male and 5 female) and 812 utterances, was selected from the training part, and represents approximately 10% of the total training set. It is still a very small database, but it is larger than SAM and is useful for developing the alignment and training process of the current system.

