In the medical field, the rapid growth of medical equipment has produced a large amount of medical data with a wide range of sources and complex structures. This data also contains essential information that is valuable for data exploration. However, existing platforms based on a Data Warehouse or Data Lake can neither effectively integrate comprehensive multi-source heterogeneous medical data nor efficiently manage large-scale multi-modal medical data. This paper presents the Multi-source Heterogeneous Data of Medical Lakehouse (MHDML), a platform that combines multiple pieces of open-source software to integrate more comprehensive multi-source heterogeneous medical data. The platform relies on multi-modal data fusion to improve multi-modal data management in the medical field. Finally, we customize RESTful APIs for medical data exploration tasks. Based on real data for sepsis and knee osteoarthritis, the platform achieves more comprehensive multi-source heterogeneous medical data acquisition and effective multi-modal medical data management, and provides simple operations and visual data exploration functions for medical staff.
CITATION STYLE
Xiao, Q., Zheng, W., Mao, C., Hou, W., Lan, H., Han, D., … Sheng, M. (2022). MHDML: Construction of a Medical Lakehouse for Multi-source Heterogeneous Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13705 LNCS, pp. 127–135). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20627-6_12