Assessing performance of pathogenicity predictors using clinically relevant variant datasets
- PMID: 32843488
- DOI: 10.1136/jmedgenet-2020-107003
Abstract
Background: Pathogenicity predictors are integral to genomic variant interpretation but, despite their widespread usage, an independent validation of performance using a clinically relevant dataset has not been undertaken.
Methods: We derive two validation datasets: an 'open' dataset containing variants extracted from publicly available databases, similar to those commonly applied in previous benchmarking exercises, and a 'clinically representative' dataset containing variants identified through research/diagnostic exome and panel sequencing. Using these datasets, we evaluate the performance of three recent meta-predictors, REVEL, GAVIN and ClinPred, and compare their performance against two commonly used in silico tools, SIFT and PolyPhen-2.
Results: Although the newer meta-predictors outperform the older tools, the performance of all pathogenicity predictors is substantially lower in the clinically representative dataset. Using our clinically relevant dataset, REVEL performed best, with an area under the receiver operating characteristic curve of 0.82. A concordance-based approach, which relies on a consensus of multiple tools, reduces performance because of both discordance between tools and false concordance, where tools make the same misclassification. Analysis of tool feature usage may give insight into tool performance and misclassification.
Conclusion: Our results support the adoption of meta-predictors over traditional in silico tools, but do not support a consensus-based approach as in current practice.
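The two quantities discussed in the Results, the area under the receiver operating characteristic curve and the outcome of a consensus call across tools, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `auroc` and `consensus_calls`, the toy labels, scores and calls are hypothetical and are not taken from the study or from any of the tools it evaluates.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    pathogenic variant (label 1) scores higher than a benign one (label 0).
    Ties count as half a win (Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def consensus_calls(calls_per_tool):
    """Return the shared call where every tool agrees, or None where the
    tools are discordant and no consensus call can be made."""
    return [
        calls[0] if len(set(calls)) == 1 else None
        for calls in zip(*calls_per_tool)
    ]


# Hypothetical toy data: 1 = pathogenic, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]  # made-up scores for "tool A"
calls_a = [1, 1, 0, 0, 0, 0]               # tool A calls at an assumed 0.5 cut-off
calls_b = [1, 0, 0, 1, 0, 0]               # made-up binary calls for "tool B"

print(auroc(labels, scores_a))
print(consensus_calls([calls_a, calls_b]))
```

In this toy example the second and fourth variants receive no consensus call because the tools disagree (discordance), while the third variant is pathogenic but both tools call it benign (false concordance). These are the two failure modes the Results attribute to consensus-based approaches.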
Keywords: genetic testing; genetic variation; genetics; genomics; human genetics.
© Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY. Published by BMJ.
Conflict of interest statement
Competing interests: None declared.