Sunday, July 6, 2014

Privacy preserving protocol for detecting genetic relatives using rare variants

  1. Eleazar Eskin1,4,*
Author Affiliations
  1. 1Department of Computer Science, 2Bioinformatics IDP, 3Department of Mathematics and 4Department of Human Genetics, University of California, LA 90095, USA
  1. *To whom correspondence should be addressed.


Motivation: High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach for the identification of genetic relatives collects the genomic data of all individuals and stores it in a database. Then, each pair of individuals is compared to detect the set of genetic relatives, and the matched individuals are informed. The main drawback of this approach is that individuals must share their genetic data with a trusted third party, which performs the relatedness test.
Results: In this work, we propose a secure protocol to detect genetic relatives from sequencing data without exposing any information about their genomes. We assume that individuals have access to their genome sequences but do not want to share their genomes with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which provides the ability to detect much more distant relationships securely. We use simulated data generated from the 1000 Genomes data and illustrate that we can easily detect up to fifth-degree cousins, which was not possible using the existing methods. We also show, in the 1000 Genomes data with cryptic relationships, that our method can detect these individuals.
Availability: The software is freely available for download at
Supplementary information: Supplementary data are available at Bioinformatics online


Detecting relatives from genetic data is one of the fundamental problems in genetics. As genotype-chip technologies reduce the cost of collecting genetic data for each individual, many personal genomics companies provide various services. One such service is the identification of relatives using genetic data. The underlying idea of this service is to collect the genotypes of different individuals and to store their data in a database. Then, the genotypes of each pair of individuals are compared, and any pair of individuals who appear to be genetically related is notified of a match.
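
The centralized workflow described above can be illustrated with a minimal Python sketch. It is not the method proposed in the paper and not the statistic used by any real service: the Jaccard score, the 0.5 threshold, the function names and the variant identifiers are all assumptions chosen only to show the "store everything, compare every pair" structure.

```python
from itertools import combinations


def shared_fraction(a, b):
    """Jaccard similarity of two sets of variant identifiers (placeholder score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_matches(database, threshold=0.5):
    """Compare every pair of individuals and return pairs scoring at or above threshold."""
    matches = []
    # Sorting by individual id makes the order of reported pairs deterministic.
    for (id_a, var_a), (id_b, var_b) in combinations(sorted(database.items()), 2):
        score = shared_fraction(var_a, var_b)
        if score >= threshold:
            matches.append((id_a, id_b, score))
    return matches


# Hypothetical central database: individual id -> set of variants carried.
database = {
    "alice": {"rs1", "rs2", "rs3", "rs4"},
    "bob": {"rs1", "rs2", "rs3", "rs9"},
    "carol": {"rs7", "rs8"},
}

matches = find_matches(database)
for id_a, id_b, score in matches:
    print(f"{id_a} and {id_b} would be notified (score {score:.2f})")
```

Note that `find_matches` needs every individual's full variant set in one place, which is exactly the trusted-third-party requirement the abstract identifies as the drawback of this approach.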
