Volume 5, Issue 1

A Comparison of Computational Approaches for Intron Retention Detection

Jiantao Zheng†, Cuixiang Lin†, Zhenpeng Wu, and Hong-Dong Li
Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China

† Jiantao Zheng and Cuixiang Lin contributed equally to this paper.

Abstract

Intron retention (IR) is an alternative splicing mode in which introns are retained in mature RNAs rather than being spliced out, as happens in most cases. IR has gained increasing attention in recent years because of its recognized association with gene expression regulation and complex diseases. Continuous efforts have been dedicated to developing IR detection methods. These methods differ in the metrics they use to quantify retention propensity, their performance in detecting IR events, the functional enrichment of the IRs they detect, and their computational speed. A systematic experimental comparison would therefore be valuable for selecting and using existing methods. In this work, we conduct such a comparison. Because no gold-standard dataset of intron retention is available, we first compare IR detection performance on simulated datasets. We then compare the IR detection results on real RNA-Seq data. We also describe the use of differential analysis methods to identify disease-associated IRs, and we compare the differential IRs and their Gene Ontology enrichment, as illustrated on an Alzheimer’s disease RNA-Seq dataset. We discuss the key principles and features of existing approaches and outline their differences. This systematic analysis provides helpful guidance for interrogating transcriptomic data from the point of view of IR.

Keywords: gene expression, alternative splicing, intron retention, RNA-Seq


Publication history

Received: 22 April 2021
Revised: 09 August 2021
Accepted: 20 August 2021
Published: 27 December 2021
Issue date: March 2022

Copyright

© The author(s) 2022.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61772556, 61972185, U1909208, 61972423, and 61832019), the 111 Project (No. B18059), and the Hunan Provincial Science and Technology Program (No. 2018WK4001).

The results published here are based in part on data obtained from the AD Knowledge Portal (https://adknowledgeportal.org). Support for these studies was provided by NIH grant U01 AG046139. We thank Drs. Jada Lewis, Karen Duff, David Westaway, and David Borchelt for generating these lines of transgenic mice and providing us access to them.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
