A Text-Independent Forced Alignment Method for Automatic Phoneme Segmentation

Abstract

Phoneme segmentation is important for many healthcare applications, such as the diagnosis and monitoring of children with speech sound disorders (SSDs). It is usually addressed by forced alignment (FA), which annotates an audio file with what has been uttered and where in the recording each sound occurs. While many FA tools exist, very few can work automatically without a transcription. This work provides a novel text-independent FA tool built on two models: wav2vec 2.0 and an unsupervised segmentor known as UnsupSeg. To label the segments, class regions are obtained by nearest-neighbour classification, with the wav2vec 2.0 labels taken before CTC collapse serving as the reference points. Each segment is then assigned the class label of the class region with which it has maximal overlap. Additional post-processing steps, such as over-fitting cleaning and voice activity detection, are applied to further improve segmentation performance. All the models used to create the tool are self-supervised, and can therefore leverage large amounts of unlabelled data to reduce the need for labelled data. When evaluated on the TIMIT dataset, our implementation achieved a harmonic mean score of 76.88%, which is competitive with existing alternatives.
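The maximal-overlap labelling step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `label_segments`, the representation of segment boundaries as a sorted list of times in seconds, and the representation of class regions as `(start, end, label)` tuples are all assumptions made for illustration. The sketch assumes the boundaries have already been produced by a segmentor and the class regions by a frame-level classifier.

```python
from collections import defaultdict


def label_segments(boundaries, class_regions):
    """Assign each segment the label of the class it overlaps most.

    boundaries: sorted list of boundary times (hypothetical format).
    class_regions: list of (start, end, label) tuples (hypothetical format).
    Returns a list of (segment_start, segment_end, label) tuples.
    """
    labelled = []
    # Consecutive boundary pairs define the segments.
    for seg_start, seg_end in zip(boundaries[:-1], boundaries[1:]):
        overlap = defaultdict(float)
        for reg_start, reg_end, label in class_regions:
            # Length of the intersection of the segment and the region.
            shared = min(seg_end, reg_end) - max(seg_start, reg_start)
            if shared > 0:
                overlap[label] += shared
        # Pick the label with the largest total overlap, or None if the
        # segment touches no class region at all.
        best = max(overlap, key=overlap.get) if overlap else None
        labelled.append((seg_start, seg_end, best))
    return labelled


# Illustrative values only; not taken from the paper.
boundaries = [0.0, 0.10, 0.25, 0.40]
class_regions = [(0.0, 0.08, "sil"), (0.08, 0.22, "ae"), (0.22, 0.40, "t")]
result = label_segments(boundaries, class_regions)
```

Summing overlap per label, rather than per region, lets several short regions of the same class jointly outweigh a single longer region of another class. Whether the original tool aggregates in this way is not stated in the abstract.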

Cite

Wohlan, B., Pham, D. S., Chan, K. Y., & Ward, R. (2022). A Text-Independent Forced Alignment Method for Automatic Phoneme Segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13728 LNAI, pp. 585–598). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-22695-3_41