A Bottom-up k-anonymization approach for big data publishing
Abstract
As governments and other organizations share larger datasets, keeping individual information private has become increasingly difficult. When data are published, anonymization models such as k-anonymity and l-diversity are employed to balance privacy against data utility. This paper presents a method called Bottom-Up k-anonymization (BU-K), implemented on Apache Spark. It improves efficiency by applying the Bottom-Up Generalization (BUG) approach. BU-K performs better than Top-Down Specialization (TDS) in terms of scalability and data privacy, while still keeping the data useful. Moreover, using Apache Spark’s distributed computing architecture significantly improves processing time compared to traditional MapReduce approaches. This work fills a gap in distributed anonymization on Spark by offering a new, efficient, and scalable solution.
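To make the k-anonymity criterion and the bottom-up direction of generalization concrete, the following is a minimal single-machine sketch in plain Python. It is not the BU-K algorithm from the paper and does not use Spark; the record fields, the `generalize_age` helper, and the list of range widths are all hypothetical, chosen only for illustration. It starts from the most specific data and coarsens one quasi-identifier step by step until every quasi-identifier combination appears at least k times.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width):
    """Hypothetical generalization step: replace an exact age with a range label."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def bottom_up_anonymize(records, quasi_identifiers, k, widths=(5, 10, 20, 40)):
    """Try progressively coarser age ranges (assumed hierarchy) until k-anonymity holds."""
    if is_k_anonymous(records, quasi_identifiers, k):
        return records
    for width in widths:
        candidate = [generalize_age(r, width) for r in records]
        if is_k_anonymous(candidate, quasi_identifiers, k):
            return candidate
    return None  # no level in the assumed hierarchy satisfies k


# Illustrative toy data, not taken from the paper.
records = [
    {"age": 23, "zip": "69000", "diagnosis": "flu"},
    {"age": 24, "zip": "69000", "diagnosis": "asthma"},
    {"age": 37, "zip": "69000", "diagnosis": "flu"},
    {"age": 38, "zip": "69000", "diagnosis": "diabetes"},
]
result = bottom_up_anonymize(records, ["age", "zip"], k=2)
```

In this toy case the exact ages are all distinct, so the raw table is not 2-anonymous; the first generalization level (ranges of width 5) already groups the records into pairs. A distributed implementation would instead compute the group counts in parallel across partitions, which is where a framework such as Spark comes in.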
Copyright (c) 2026 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.








