The massive volume of content generated by social media greatly exceeds human capacity to manually process this data in order to identify topics of interest. As a solution, various automated topic detection approaches have been proposed, most of which are based on document clustering and burst detection. These approaches normally represent textual features in standard n-dimensional Euclidean metric spaces. However, in these cases, directly filtering noisy documents is challenging for topic detection. Instead we propose Topol, a topic detection method based on Topology Data Analysis (TDA) that transforms the Euclidean feature space into a topological space where the shapes of noisy irrelevant documents are much easier to distinguish from topically-relevant documents. This topological space is organised in a network according to the connectivity of the points, i.e. the documents, and by only filtering based on the size of the connected components we obtain competitive results compared to other state of the art topic detection methods.
CITATION STYLE
Torres-Tramón, P., Hromic, H., & Heravi, B. R. (2015). Topic detection in twitter using topology data analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9396, pp. 186–197). Springer Verlag. https://doi.org/10.1007/978-3-319-24800-4_16
Mendeley helps you to discover research relevant for your work.