A Few Observations

Examining the generated statistics, we observe that sub-band transitions do indeed occur asynchronously. More precisely:

Transition lags (measured with respect to the full-band transition boundaries) follow a Gaussian distribution with a mean close to zero, indicating that sub-band transitions are about as likely to precede the full-band boundary as to follow it. The standard deviations are [2.8, 3.3, 5.0, 5.6] frames for sub-bands 1 through 4, respectively: the higher the frequency range, the more the transition boundaries are shifted relative to the full-band boundaries.

More distant sub-bands agree less on transition boundaries: the standard deviation $\sigma$ of the transition lags between sub-bands 1 and 4 is 5.9 frames, whereas between sub-bands 1 and 2 it is 3.8 frames.

30% of the sub-band transitions do not occur within 50 ms of each other.

Some broad category transitions are sharp (e.g., sil $\rightarrow$ stop), and some have a relatively flat distribution (e.g., vowel $\rightarrow$ liquid).
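
As a rough illustration of how lag statistics of this kind could be computed, the following sketch measures the signed lag of each sub-band boundary relative to the nearest full-band boundary, then reports the mean, the standard deviation, and the fraction of lags falling outside a tolerance window. It is a minimal sketch under stated assumptions, not the procedure used to produce the numbers above: the function names, the nearest-boundary matching rule, the toy boundary positions, and the 10 ms frame shift (so that 50 ms corresponds to 5 frames) are all hypothetical.

```python
from statistics import mean, pstdev


def transition_lags(sub_band, full_band):
    """Signed lag, in frames, of each sub-band boundary relative to
    the nearest full-band boundary (negative = sub-band leads)."""
    lags = []
    for boundary in sub_band:
        # Assumed matching rule: pair with the closest full-band boundary.
        nearest = min(full_band, key=lambda f: abs(boundary - f))
        lags.append(boundary - nearest)
    return lags


def fraction_outside(lags, window_frames):
    """Fraction of lags whose magnitude exceeds the window."""
    return sum(abs(lag) > window_frames for lag in lags) / len(lags)


# Toy boundary positions in frames (made up for illustration only).
full_band = [10, 50, 90]
sub_band = [8, 53, 90, 97]

# Assumption: 10 ms frame shift, so a 50 ms window is 5 frames.
WINDOW_FRAMES = 5

lags = transition_lags(sub_band, full_band)
lag_mean = mean(lags)
lag_sigma = pstdev(lags)  # population standard deviation
outside = fraction_outside(lags, WINDOW_FRAMES)
```

Applied to real segmentations, the same three quantities could be collected per sub-band, or between pairs of sub-bands by passing one sub-band's boundaries as the reference in place of the full-band boundaries.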

