For Portuguese, the main goal of this project is to develop a speaker-independent, large-vocabulary continuous speech recognition system. To accomplish this, we need an adequate database containing both speech and text. We also need an appropriate phoneme segmentation of that database, or tools that will enable us to produce one.
This database is being collected under Task 1.1. Since it is not yet complete, we have begun developing a baseline system for Portuguese based on the much smaller SAM database, which is currently available. The baseline system will allow us to develop the basic structures needed to build a large system.