This paper proposes storage best practices for Spatial Big Data on the Data LakeHouse. Handling Spatial Big Data has exposed the limits of current approaches to storing massive spatial datasets, whether traditional ones such as geographic information systems or newer ones such as extended Big Data approaches. The paper is divided into four parts. In the first part, we give a brief background on the data management system landscape. In the second part, we present the Data LakeHouse and how it addresses the storage, processing and exploitation of big data while ensuring the consistency and efficiency found in data warehouses. In the third part, we recall the constraints posed by the management of Spatial Big Data. We end the paper with an experimental study comparing storage practices for Spatial Big Data on the Data LakeHouse. Our experiment shows that partitioning Spatial Big Data by Geohash index is an optimal storage solution.
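To make the idea of Geohash-based partitioning concrete, the following is a minimal sketch of how points could be assigned to storage partitions by a Geohash prefix. It is not the authors' implementation: the function names `geohash_encode` and `partition_path`, the 4-character precision, and the Hive-style `geohash=<prefix>` directory layout are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Standard Geohash base-32 alphabet (no "a", "i", "l", "o").
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a Geohash string.

    Bits are interleaved starting with longitude; every 5 bits
    yield one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # even-numbered bits refine longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def partition_path(lat: float, lon: float, precision: int = 4) -> str:
    """Hypothetical Hive-style partition directory for a point."""
    return f"geohash={geohash_encode(lat, lon, precision)}"


# Group a few sample points by partition (coordinates are illustrative only).
points = [
    (57.64911, 10.40744),
    (57.64900, 10.40700),
    (0.0, 0.0),
]
partitions = defaultdict(list)
for lat, lon in points:
    partitions[partition_path(lat, lon)].append((lat, lon))
```

Points that are close together usually share a Geohash prefix and therefore land in the same partition, so a query restricted to a region only needs to read the partitions whose prefixes cover that region. The prefix length trades partition count against partition size and would need to be tuned to the data.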
Citation
Errami, S. A., Hajji, H., Kadi, K. A. E., & Badir, H. (2023). Managing Spatial Big Data on the Data LakeHouse. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 147, pp. 323–331). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-15191-0_31