Friday, January 18, 2013

Identifying Personal Genomes by Surname Inference

Science
Vol. 339 no. 6117 pp. 321-324
DOI: 10.1126/science.1229566
  • Report

  1. Yaniv Erlich (1,*)
Author Affiliations
  1. Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, MA 02142, USA.
  2. Harvard–Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology, MIT, Cambridge, MA 02139, USA.
  3. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  4. Department of Molecular Biology and Diabetes Unit, Massachusetts General Hospital, Boston, MA 02114, USA.
  5. Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA.
  6. Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv 69978, Israel.
  7. School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel.
  8. Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv 69978, Israel.
  9. The International Computer Science Institute, Berkeley, CA 94704, USA.
  * To whom correspondence should be addressed. E-mail: yaniv@wi.mit.edu

Abstract

Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it relies entirely on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.
  • Received for publication 31 August 2012.
  • Accepted for publication 3 December 2012.
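
The two steps named in the abstract (matching a Y-STR profile against genealogy records to get a surname, then narrowing by metadata such as age and state) can be illustrated with a toy sketch. This is not the authors' method: the abstract does not describe their matching or confidence scoring, so a plain mismatch count is swapped in here. The marker names are real Y-STR loci, but every repeat count, surname, record, and function name below is invented for illustration, and the "database" is just an in-memory list.

```python
from collections import Counter

# Hypothetical genealogy records: (surname, {marker: repeat count}).
# All surnames and repeat counts are made up for this example.
RECORDS = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS392": 13, "DYS393": 13}),
    ("Jones", {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS392": 11, "DYS393": 12}),
    ("Brown", {"DYS19": 14, "DYS390": 22, "DYS391": 10, "DYS392": 11, "DYS393": 13}),
]


def mismatches(query, record):
    """Count query markers whose repeat count differs from (or is absent in) the record."""
    return sum(record.get(marker) != count for marker, count in query.items())


def infer_surname(query, records, max_mismatches=1):
    """Return the most frequent surname among close matches, or None if there are none."""
    hits = Counter(
        surname
        for surname, profile in records
        if mismatches(query, profile) <= max_mismatches
    )
    if not hits:
        return None
    return hits.most_common(1)[0][0]


def triangulate(people, surname, state, birth_year):
    """Filter a hypothetical directory down to entries matching all three attributes."""
    return [
        p for p in people
        if p["surname"] == surname
        and p["state"] == state
        and p["birth_year"] == birth_year
    ]
```

In this sketch, a query profile identical to the first record would match both "Smith" entries (zero and one mismatch) and neither of the others, so the inferred surname would be "Smith". The surname alone rarely singles out a person; it is the combination with the other attributes in `triangulate` that shrinks the candidate list.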
