Correcting finite sampling issues in entropy l-diversity

Abstract

In statistical disclosure control (SDC), anonymized versions of a database table are obtained via generalization and suppression to reduce the risk of de-anonymization attacks, ideally with minimal utility loss. This amounts to an optimization problem in which a measure of the remaining diversity needs to be improved. The feasible solutions are those that fulfill some privacy criterion, e.g., entropy l-diversity. In statistics it is known that the naive computation of an entropy via the Shannon formula systematically underestimates the (real) entropy and thus influences the resulting equivalence classes. In this contribution we implement an asymptotically unbiased estimator for the Shannon entropy and apply it to three test databases. Our results reveal systematic miscalculations in earlier computations; we show that an unbiased estimator can increase the utility of the data without compromising privacy.
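
To illustrate the underestimation the abstract mentions, here is a minimal sketch comparing the naive (plug-in) Shannon estimate with a bias-corrected one on a single equivalence class. The abstract does not name the estimator the authors implemented, so this sketch swaps in the well-known Miller-Madow correction purely as a stand-in. The function names, the sample data, and the natural-log form of the entropy l-diversity check (entropy of the sensitive attribute at least log l) are assumptions made for illustration.

```python
import math
from collections import Counter


def plugin_entropy(values):
    """Naive Shannon entropy (in nats) from observed relative frequencies."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(values):
    """Plug-in entropy plus the Miller-Madow first-order bias correction.

    Stand-in estimator for illustration only; not necessarily the one
    used in the paper.
    """
    n = len(values)
    k = len(set(values))  # number of distinct observed sensitive values
    return plugin_entropy(values) + (k - 1) / (2 * n)


def is_entropy_l_diverse(values, l, estimator=plugin_entropy):
    """Check one equivalence class against entropy l-diversity."""
    return estimator(values) >= math.log(l)


# Hypothetical sensitive-attribute values of one small equivalence class.
sample = ["flu"] * 8 + ["cold", "asthma"]

print(plugin_entropy(sample))        # about 0.639, below log(2) ~ 0.693
print(miller_madow_entropy(sample))  # about 0.739, above log(2)
print(is_entropy_l_diverse(sample, 2))                        # False
print(is_entropy_l_diverse(sample, 2, miller_madow_entropy))  # True
```

In this toy case the naive estimate rejects a class that the corrected estimate accepts for l = 2, so the class would not need further generalization. That is the kind of utility gain the abstract describes, although the paper's own estimator and thresholds may differ.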

Cite

Stammler, S., Katzenbeisser, S., & Hamacher, K. (2016). Correcting finite sampling issues in entropy l-diversity. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9867 LNCS, 135–146. https://doi.org/10.1007/978-3-319-45381-1_11
