Systematic assessment of long-read RNA-seq methods for transcript identification and quantification.

Title: Systematic assessment of long-read RNA-seq methods for transcript identification and quantification.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Pardo-Palacios FJ, Wang D, Reese F, Diekhans M, Carbonell-Sala S, Williams B, Loveland JE, De María M, Adams MS, Balderrama-Gutierrez G, Behera AK, Gonzalez JM, Hunt T, Lagarde J, Liang CE, Li H, Meade MJerryd, Amador DAMoraga, Prjibelski AD, Birol I, Bostan H, Brooks AM, Çelik MHasan, Chen Y, Du MRM, Felton C, Göke J, Hafezqorani S, Herwig R, Kawaji H, Lee J, Li JLiang, Lienhard M, Mikheenko A, Mulligan D, Nip KMing, Pertea M, Ritchie ME, Sim AD, Tang AD, Wan YKei, Wang C, Wong BY, Yang C, Barnes I, Berry A, Capella S, Dhillon N, Fernandez-Gonzalez JM, Ferrández-Peral L, Garcia-Reyero N, Goetz S, Hernández-Ferrer C, Kondratova L, Liu T, Martinez-Martin A, Menor C, Mestre-Tomás J, Mudge JM, Panayotova NG, Paniagua A, Repchevsky D, Rouchka E, Saint-John B, Sapena E, Sheynkman L, Smith MLaird, Suner M-M, Takahashi H, Youngworth IAshley, Carninci P, Denslow ND, Guigó R, Hunter ME, Tilgner HU, Wold BJ, Vollmers C, Frankish A, Au KFai, Sheynkman GM, Mortazavi A, Conesa A, Brooks AN
Journal: bioRxiv
Date Published: 2023 Jul 27
Abstract

The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

DOI: 10.1101/2023.07.25.550582
Alternate Journal: bioRxiv
PubMed ID: 37546854
PubMed Central ID: PMC10402094
Grant List:
R01 HG008759 / HG / NHGRI NIH HHS / United States
R01 GM136886 / GM / NIGMS NIH HHS / United States
R35 GM138122 / GM / NIGMS NIH HHS / United States
R35 GM142647 / GM / NIGMS NIH HHS / United States
U41 HG007234 / HG / NHGRI NIH HHS / United States
/ WT_ / Wellcome Trust / United Kingdom
UM1 HG009443 / HG / NHGRI NIH HHS / United States
R01 HG011469 / HG / NHGRI NIH HHS / United States
F31 HG010999 / HG / NHGRI NIH HHS / United States