Abstract
Metazoan genomes produce thousands of long noncoding RNAs (lncRNAs), of which only a small fraction have been well characterized. Understanding their biological functions requires accurate annotations, that is, maps of the precise location and structure of genes and transcripts in the genome. Current lncRNA annotations are limited by a trade-off between quality and size, with many gene models being fragmentary or uncatalogued. To overcome this, the GENCODE consortium has developed RNA capture long-read sequencing (CLS), an approach combining targeted RNA capture with third-generation long-read sequencing. CLS provides accurate annotations at high throughput. It eliminates the need for noisy transcriptome assembly from short reads and requires minimal manual curation. The resulting full-length transcript models are comparable in quality to present-day manually curated annotations. Here we describe a detailed CLS protocol, from probe design through long-read sequencing to creation of final annotations.
Acknowledgments
Sílvia Carbonell Sala and Barbara Uszczyńska-Ratajczak contributed equally to this work.
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Carbonell Sala, S., Uszczyńska-Ratajczak, B., Lagarde, J., Johnson, R., Guigó, R. (2021). Annotation of Full-Length Long Noncoding RNAs with Capture Long-Read Sequencing (CLS). In: Cao, H. (eds) Functional Analysis of Long Non-Coding RNAs. Methods in Molecular Biology, vol 2254. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1158-6_9
DOI: https://doi.org/10.1007/978-1-0716-1158-6_9
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1157-9
Online ISBN: 978-1-0716-1158-6
eBook Packages: Springer Protocols