Tuesday, October 29, 2013

Privacy-preserving heterogeneous healt... [J Am Med Inform Assoc. 2013] - PubMed - NCBI

J Am Med Inform Assoc. 2013 May 1;20(3):462-9. doi: 10.1136/amiajnl-2012-001027. Epub 2012 Dec 13.

Privacy-preserving heterogeneous health data sharing.

Source

Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada. no_moham@cse.concordia.ca

Abstract

OBJECTIVE:

Privacy-preserving data publishing addresses the problem of disclosing sensitive data when mining for useful information. Among existing privacy models, ε-differential privacy provides one of the strongest privacy guarantees and makes no assumptions about an adversary's background knowledge. All existing solutions that ensure ε-differential privacy handle relational data and set-valued data separately when disclosing them in a privacy-preserving manner. In this paper, we propose an algorithm that considers both relational and set-valued data in differentially private disclosure of healthcare data.
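
For reference, the standard definition of ε-differential privacy can be written as follows. The abstract does not state it, so the symbols here are assumptions made for illustration: a randomized algorithm $\mathcal{A}$, two data sets $D$ and $D'$ that differ in at most one record, and an arbitrary set $S$ of possible outputs.

```latex
\[
  \Pr[\mathcal{A}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{A}(D') \in S]
  \qquad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(\mathcal{A}).
\]
```

A smaller ε forces the two output distributions closer together, so the presence or absence of any single record has less influence on what is disclosed.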

METHODS:

The proposed approach makes a simple yet fundamental switch in differentially private algorithm design: instead of listing all possible records (ie, a contingency table) for noise addition, records are generalized before noise addition. The algorithm first generalizes the raw data in a probabilistic way, and then adds noise to guarantee ε-differential privacy.
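
The generalize-then-perturb idea described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: it handles one numeric attribute (a hypothetical `age`) and one class label, picks a single cut point with the exponential mechanism, and then adds Laplace noise to the counts of the generalized groups. The function names, the even budget split, and the majority-count score are all choices made here for illustration.

```python
import math
import random
from collections import Counter


def exponential_mechanism(candidates, score, epsilon, sensitivity, rng):
    """Pick a candidate with probability proportional to exp(eps * score / (2 * sensitivity))."""
    scores = [score(c) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def generalize_then_perturb(records, cut_points, labels, epsilon, rng):
    """records: list of (age, label); cut_points and labels are assumed public."""
    # Assumption: split the privacy budget evenly between the two steps.
    eps_split = eps_count = epsilon / 2

    def group_of(age, cut):
        return f"age<{cut}" if age < cut else f"age>={cut}"

    def score(cut):
        # Sum of majority-class counts per group; one record changes it by at most 1.
        per_group = {}
        for age, label in records:
            per_group.setdefault(group_of(age, cut), Counter())[label] += 1
        return sum(max(c.values()) for c in per_group.values())

    # Step 1: generalize probabilistically (choose the cut point privately).
    cut = exponential_mechanism(cut_points, score, eps_split, 1.0, rng)

    # Step 2: add noise to the count of every cell in the generalized domain,
    # including cells whose true count is zero.
    true_counts = Counter((group_of(age, cut), label) for age, label in records)
    noisy = {}
    for group in (f"age<{cut}", f"age>={cut}"):
        for label in labels:
            noisy[(group, label)] = true_counts[(group, label)] + laplace_noise(1 / eps_count, rng)
    return cut, noisy


if __name__ == "__main__":
    data = [(25, "neg"), (31, "neg"), (47, "pos"), (52, "pos"), (68, "pos")]
    print(generalize_then_perturb(data, [30, 40, 50, 60], ["neg", "pos"], 1.0, random.Random(0)))
```

Because noise is added to a handful of generalized groups rather than to every cell of a full contingency table, the number of noisy counts stays small, which is the switch in design the abstract describes.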

RESULTS:

We showed that the disclosed data could be used effectively to build a decision tree induction classifier. Experimental results demonstrated that the proposed algorithm is scalable and performs better than existing solutions for classification analysis.

LIMITATION:

The resulting utility may degrade when the output domain size is very large, making it potentially inappropriate to generate synthetic data for large health databases.

CONCLUSIONS:

Unlike existing techniques, the proposed algorithm allows the disclosure of health data containing both relational and set-valued data in a differentially private manner, and can retain essential information for discriminative analysis.

PMID:
23242630
[PubMed - indexed for MEDLINE]
PMCID:
PMC3628047
[Available on 2014/5/8]
