Column generation bounds for numerical microaggregation
The biggest challenge when disclosing private data is to share information contained in databases while protecting people from being individually identified. Microaggregation is a family of methods for statistical disclosure control. The principle of microaggregation is that confidentiality rules pe...
Gespeichert in:
Veröffentlicht in: | Journal of global optimization 2014-10, Vol.60 (2), p.165-182 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The biggest challenge when disclosing private data is to share information contained in databases while protecting people from being individually identified. Microaggregation is a family of methods for statistical disclosure control. The principle of microaggregation is that confidentiality rules permit the publication of individual records if they are partitioned into groups of size larger or equal to a fixed threshold value, where none is more representative than the others in the same group. The application of such rules leads to replacing individual values by those computed from small groups (microaggregates), before data publication. This work proposes a column generation algorithm for numerical microaggregation in which its pricing problem is solved by a specialized branch-and-bound. The algorithm is able to find, for the first time, lower bounds for instances of three real-world datasets commonly used in the literature. Furthermore, new best known solutions are obtained for these instances by means of a simple heuristic method with the columns generated. |
---|---|
ISSN: | 0925-5001 1573-2916 |
DOI: | 10.1007/s10898-014-0149-3 |