Title | RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci. |
Publication Type | Journal Article |
Year of Publication | 2024 |
Authors | Fazal S, Danzi MC, Xu I, Kobren SN, Sunyaev S, Reuter C, Marwaha S, Wheeler M, Dolzhenko E, Lucas F, Wuchty S, Tekin M, Züchner S, Aguiar-Pulido V |
Journal | Genome Biol |
Volume | 25 |
Issue | 1 |
Pagination | 39 |
Date Published | 2024 Jan 31 |
ISSN | 1474-760X |
Keywords | Machine Learning, Tandem Repeat Sequences, Virulence |
Abstract | Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies. |
DOI | 10.1186/s13059-024-03171-4 |
Alternate Journal | Genome Biol |
PubMed ID | 38297326 |
PubMed Central ID | PMC10832122 |
Grant List | R01 NS072248 / NS / NINDS NIH HHS / United States |