ISCA Archive Interspeech 2012

Automatic speech segmentation using probabilistic latent component modeling

Sayan Ghosh, Thippur V. Sreenivas

Latent variable methods, such as PLCA (Probabilistic Latent Component Analysis), have been successfully used for analysis of non-negative signal representations. In this paper, we formulate PLCS (Probabilistic Latent Component Segmentation), which models each time frame of a spectrogram as a spectral distribution. Given the signal spectrogram, the segmentation boundaries are estimated using a maximum-likelihood (ML) approach. For an efficient solution, the algorithm imposes a hard constraint that each segment is modelled by a single latent component. This hard constraint allows the ML boundary estimation problem to be solved using dynamic programming. Unlike earlier ML segmentation techniques, the PLCS framework does not impose a parametric assumption. PLCS can be naturally extended to model coarticulation between successive phones. Experiments on the TIMIT corpus show that the proposed technique is promising compared to most state-of-the-art speech segmentation algorithms.
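
The idea of ML boundary estimation under a one-component-per-segment constraint can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a multinomial likelihood per frame, a known number of segments `K`, and the names `segment_loglik` and `ml_segment` are hypothetical. Each segment's single component is taken to be the normalized sum of its frames, and dynamic programming selects the boundaries that maximize the total log-likelihood.

```python
import numpy as np

def segment_loglik(cum, i, j, eps=1e-12):
    """Log-likelihood of frames i..j-1 under one spectral component.

    The component is the ML multinomial estimate: the normalized sum
    of the spectra in the segment (an assumption of this sketch).
    """
    counts = cum[:, j] - cum[:, i]          # summed spectrum of the segment
    p = counts / max(counts.sum(), eps)     # single latent component P(f|z)
    return float(np.sum(counts * np.log(p + eps)))

def ml_segment(V, K):
    """Return the K-1 interior boundaries (frame indices) of a
    non-negative spectrogram V (frequency x time) split into K segments."""
    F, T = V.shape
    # Cumulative sums let any segment's summed spectrum be read in O(F).
    cum = np.concatenate([np.zeros((F, 1)), np.cumsum(V, axis=1)], axis=1)
    best = np.full((K + 1, T + 1), -np.inf)  # best[k, j]: k segments cover frames 0..j-1
    back = np.zeros((K + 1, T + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, K + 1):
        for j in range(k, T + 1):
            for i in range(k - 1, j):
                if not np.isfinite(best[k - 1, i]):
                    continue
                score = best[k - 1, i] + segment_loglik(cum, i, j)
                if score > best[k, j]:
                    best[k, j] = score
                    back[k, j] = i
    # Backtrack from the last frame to recover the segment start indices.
    starts = []
    j = T
    for k in range(K, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return starts[::-1][1:]                  # drop the leading 0
```

Calling `ml_segment(V, K)` on a magnitude spectrogram `V` returns the frame indices at which new segments begin. The triple loop costs on the order of K·T² segment evaluations, so a practical version would likely cap the maximum segment length.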

Index Terms: Speech segmentation, PLCA, Spectrograms, Coarticulation, Dynamic Programming


doi: 10.21437/Interspeech.2012-594

Cite as: Ghosh, S., Sreenivas, T.V. (2012) Automatic speech segmentation using probabilistic latent component modeling. Proc. Interspeech 2012, 2262-2265, doi: 10.21437/Interspeech.2012-594

@inproceedings{ghosh12_interspeech,
  author={Sayan Ghosh and Thippur V. Sreenivas},
  title={{Automatic speech segmentation using probabilistic latent component modeling}},
  year=2012,
  booktitle={Proc. Interspeech 2012},
  pages={2262--2265},
  doi={10.21437/Interspeech.2012-594}
}