Privacy preserving protocol for detecting genetic relatives using rare variants
- Farhad Hormozdiari1,*,†,
- Jong Wha J Joo2,†,
- Akshay Wadia1,
- Feng Guan3,
- Rafail Ostrovsky1,
- Amit Sahai1 and
- Eleazar Eskin1,4,*
- *To whom correspondence should be addressed.
Abstract
Motivation: High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach for the identification of genetic relatives collects the genomic data of all individuals and stores it in a database. Then, each pair of individuals is compared to detect the set of genetic relatives, and the matched individuals are informed. The main drawback of this approach is that individuals must share their genetic data with a trusted third party that performs the relatedness test.
Results: In this work, we propose a secure protocol to detect genetic relatives from sequencing data without exposing any information about their genomes. We assume that individuals have access to their genome sequences but do not want to share their genomes with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which makes it possible to detect much more distant relationships securely. Using simulated data generated from the 1000 Genomes data, we illustrate that we can easily detect relatives as distant as fifth-degree cousins, which was not possible using existing methods. We also show that our method can detect individuals with cryptic relationships in the 1000 Genomes data.
Availability: The software is freely available for download at http://genetics.cs.ucla.edu/crypto/.
Supplementary information: Supplementary data are available at Bioinformatics online.
1 INTRODUCTION
Detecting relatives from genetic data is one of the fundamental problems in genetics. As genotype-chip technologies reduce the cost of collecting genetic data for each individual, many personal genomics companies provide various services. One such service is the identification of relatives using genetic data. The underlying idea of this service is to collect the genotypes of different individuals and to store their data in a database. Then, the genotypes of each pair of individuals are compared, and any pair of individuals who appear to be genetically related is notified of a match.
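The centralized service described above can be sketched in a few lines. This is only an illustration of the database-and-compare workflow, not the method of any particular company or of this paper: the `similarity` measure (fraction of sites with identical genotype), the `find_matches` helper, the toy genotype vectors and the threshold are all assumptions made for the example. Note that every genotype sits in plain view of whoever runs the database, which is the privacy concern at issue.

```python
from itertools import combinations


def similarity(g1, g2):
    """Fraction of sites where two genotype vectors are identical.

    Genotypes are coded as minor-allele counts (0, 1 or 2) per site.
    This simple concordance score is an assumed stand-in for a real
    relatedness statistic.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must cover the same sites")
    return sum(a == b for a, b in zip(g1, g2)) / len(g1)


def find_matches(database, threshold):
    """Compare every pair stored in the central database.

    Returns the pairs of individual IDs whose similarity is at or
    above the (hypothetical) threshold.
    """
    matches = []
    for (id_a, g_a), (id_b, g_b) in combinations(sorted(database.items()), 2):
        if similarity(g_a, g_b) >= threshold:
            matches.append((id_a, id_b))
    return matches


# Toy database held by the third party: all genotypes are visible to it.
database = {
    "alice": [0, 1, 2, 0, 1, 0],
    "bob":   [0, 1, 2, 0, 0, 0],
    "carol": [2, 0, 0, 1, 2, 1],
}

matches = find_matches(database, threshold=0.8)
```

The number of comparisons grows quadratically with the number of stored individuals, since every pair is examined once.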