With the explosive growth of mobile traffic and the demand for privacy protection and network security, mainstream mobile applications use encryption protocols (mostly TLS), so identifying mobile encrypted traffic has become critical. Conventional machine learning methods rely on hand-designed features, which are unreliable in the face of complex traffic data. Deep learning methods currently perform well on this task, but most of them describe traffic data from only one view, ignoring the heterogeneous nature of traffic. In this paper, we apply multimodal Transformers to mobile encrypted traffic classification and propose a novel model (DF-Net) with a deep fusion mechanism. The key point of deep fusion is that a learnable modal-type embedding enables the model to perform early and unconstrained fusion and interaction of cross-modal information, which improves performance. While preserving performance, DF-Net adopts a lightweight design and a parallel mechanism to improve the overall efficiency of the model. To verify the performance and efficiency of DF-Net, we implement an automated traffic collection framework and use it to collect a real-world traffic dataset that covers 48 popular apps. Experiments show that DF-Net not only achieves excellent performance but is also more efficient than state-of-the-art methods.
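The idea of early fusion through a modal-type embedding can be illustrated with a minimal sketch. This is not the DF-Net implementation: the two views (`view_a`, `view_b`), the embedding size, the random values standing in for learned parameters, and the single attention head with identity projections are all assumptions made for illustration. Each view's tokens receive that view's type embedding, the two sequences are concatenated, and one self-attention pass lets every token attend to tokens of both views.

```python
import math
import random


def add_modal_type(tokens, type_vec):
    """Add one modality's type embedding to each of its token embeddings."""
    return [[t + m for t, m in zip(tok, type_vec)] for tok in tokens]


def self_attention(seq):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections), to keep the sketch short.
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        outputs.append([sum(wi * v[j] for wi, v in zip(w, seq)) for j in range(d)])
    return outputs, weights


def fuse(view_a, view_b, type_a, type_b):
    """Early fusion: tag each view, concatenate, attend over the joint sequence."""
    joint = add_modal_type(view_a, type_a) + add_modal_type(view_b, type_b)
    return self_attention(joint)


# Hypothetical inputs: random vectors stand in for learned embeddings.
rng = random.Random(0)
DIM = 4


def rand_vec():
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


view_a = [rand_vec() for _ in range(3)]  # hypothetical first view, 3 tokens
view_b = [rand_vec() for _ in range(5)]  # hypothetical second view, 5 tokens
type_a, type_b = rand_vec(), rand_vec()  # learnable in a real model

fused, attn = fuse(view_a, view_b, type_a, type_b)
```

In this sketch every row of `attn` spans all eight positions, so a token from one view can weight tokens from the other view directly in the first layer, which is what "early and unconstrained" fusion refers to here.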
CITATION STYLE
Ding, S., Xu, Y., Xu, H., Deng, H., & Ge, J. (2022). A Multimodal Deep Fusion Network for Mobile Traffic Classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13472 LNCS, pp. 384–392). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19214-2_32