The emergence of platforms such as YouTube, Dailymotion and Google Video has played a major role in increasing the number of videos available on the Internet. For example, more than 15,000 video sequences are viewed every day on Dailymotion. Managing the knowledge contained in this large volume of data is therefore a significant scientific challenge. Data summarization, in particular, aims to extract concise abstracts from different types of documents. In this paper, we are interested in summarizing meeting data. Because the quality of video analysis output depends heavily on the type of data, we propose our own framework for this purpose. The main goal of our study is to use textual data extracted from Automatic Speech Recognition (ASR) transcriptions of the AMI corpus to produce a fully unsupervised summary of meeting sequences. Our contribution, called Weighted Histogram for ASR Transcriptions (WHASRT), adopts an extractive, annotation-free and dictionary-based approach. An exhaustive comparative study demonstrates that our method achieves results competitive with ranking-based methods. The experimental results also show improved performance over existing clustering-based methods.
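The abstract does not specify how WHASRT builds or weights its histogram, so the following is only a minimal sketch of the general idea of histogram-based extractive summarization, not the authors' method. It assumes that a word-frequency histogram is computed over the whole transcript and that each utterance is scored by the average frequency of its content words. The names `STOP_WORDS`, `tokenize` and `summarize`, the tiny stop-word list, and the scoring rule are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "we", "it",
              "in", "that", "so", "um", "uh"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOP_WORDS]


def summarize(utterances, k=2):
    """Return the k highest-scoring utterances, in their original order.

    Assumed scoring rule: mean transcript-wide frequency of the
    utterance's content words.
    """
    # Word histogram over the whole transcript.
    histogram = Counter(w for u in utterances for w in tokenize(u))

    def score(utterance):
        words = tokenize(utterance)
        if not words:
            return 0.0
        return sum(histogram[w] for w in words) / len(words)

    # Rank utterance indices by score, keep the top k, restore original order.
    ranked = sorted(range(len(utterances)),
                    key=lambda i: score(utterances[i]),
                    reverse=True)[:k]
    return [utterances[i] for i in sorted(ranked)]


transcript = [
    "um okay",
    "the remote control needs a new battery design",
    "remote control battery cost is the main issue",
    "uh yeah",
]
summary = summarize(transcript, k=2)
```

In this sketch, utterances that reuse frequent content words (here "remote", "control", "battery") are selected, while short filler turns are not. No annotations or training data are required, which is consistent with the unsupervised, extractive setting described above.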
CITATION STYLE
Dammak, N., & BenAyed, Y. (2021). Histogram based method for unsupervised meeting speech summarization. In Advances in Intelligent Systems and Computing (Vol. 1181 AISC, pp. 396–405). Springer. https://doi.org/10.1007/978-3-030-49342-4_38