Automatic segmentation and labeling

To label the database we started from the baseline system and again used the iterative alignment/training process described earlier. A few iterations of this process have been completed so far; Table 1.11 presents the resulting percentage of correct frames on the training and validation sets. In total we had 10 speakers (5 male and 5 female) with 812 utterances. We chose 652 utterances for training (corresponding to 8 speakers) and 160 for validation (corresponding to 2 speakers).
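The split described above is made by speaker, so no validation speaker contributes utterances to training. A minimal sketch of such a speaker-disjoint split is given here in Python. The function name, the (speaker, utterance) pair representation and the speaker identifiers are illustrative assumptions, not part of the system described in this report.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: each utterance is a (speaker_id, utterance_id) pair.
Utterance = Tuple[str, str]


def split_by_speaker(
    utterances: Iterable[Utterance],
    validation_speakers: Set[str],
) -> Tuple[List[Utterance], List[Utterance]]:
    """Send all utterances of the held-out speakers to the validation set
    and all remaining utterances to the training set."""
    train: List[Utterance] = []
    valid: List[Utterance] = []
    for speaker_id, utt_id in utterances:
        if speaker_id in validation_speakers:
            valid.append((speaker_id, utt_id))
        else:
            train.append((speaker_id, utt_id))
    return train, valid


# Toy data with made-up identifiers (not taken from the database).
utterances = [
    ("spk01", "utt001"),
    ("spk01", "utt002"),
    ("spk02", "utt003"),
    ("spk03", "utt004"),
]
train, valid = split_by_speaker(utterances, validation_speakers={"spk03"})
```

Splitting at the speaker level rather than the utterance level means the validation figures reflect performance on speakers that were not seen during training.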

Table: Evolution of the alignment/training process with the first 10 speakers of the PÚBLICO database.

Iteration   % correct frames   % correct frames
            on training set    on validation set
1           61.61              63.60
2           66.30              66.07
3           67.58              67.93
4           68.00              68.14
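For illustration, a frame-level score of this kind can be computed as sketched below. We assume here that a frame counts as correct when the label predicted for it equals the label assigned to it by the reference alignment; this definition, the function name and the toy labels are assumptions made for the example.

```python
from typing import Sequence


def percent_correct_frames(
    predicted: Sequence[str],
    reference: Sequence[str],
) -> float:
    """Percentage of frames whose predicted label matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same number of frames")
    if len(reference) == 0:
        raise ValueError("at least one frame is required")
    # Count frame positions where both label sequences agree.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Toy example with made-up phone labels: 3 of the 4 frames agree.
score = percent_correct_frames(
    ["a", "a", "b", "sil"],
    ["a", "b", "b", "sil"],
)
```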

This alignment/training process is still in progress.


Christophe Ris