Volume 3, Issue 4

CircRNA-Disease Associations Prediction Based on Metapath2vec++ and Matrix Factorization

Yuchen Zhang, Xiujuan Lei, Zengqiang Fang, and Yi Pan
School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA

Abstract

Circular RNAs (circRNAs) are a novel class of endogenous non-coding RNAs. Evidence has shown that circRNAs are involved in many biological processes and play essential roles in different biological functions. Although increasing numbers of circRNAs are being discovered with high-throughput sequencing technologies, these techniques remain time-consuming and costly. In this study, we propose a computational method for predicting circRNA-disease associations, called PCD_MVMF, which is based on metapath2vec++ and matrix factorization and integrates multiple data sources. To construct more reliable networks, several aspects are considered. First, circRNA annotation, sequence, and functional similarity networks are established, and disease-related genes and semantics are used to construct disease functional and semantic similarity networks. Second, metapath2vec++ is applied to the integrated heterogeneous network to learn embedded features and an initial prediction score. Finally, we use matrix factorization with similarity as a constraint and optimize it to obtain the final prediction results. Leave-one-out cross-validation, five-fold cross-validation, and F-measure are adopted to evaluate the performance of PCD_MVMF. These evaluations indicate that PCD_MVMF has better prediction performance than other methods. To further illustrate its performance, case studies of common diseases are conducted. PCD_MVMF can therefore be regarded as a reliable and useful tool for predicting circRNA-disease associations.

Keywords: matrix factorization, circular RNAs (circRNAs), circRNA-disease associations, metapath2vec++
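
The second step of the abstract, learning embeddings with metapath2vec++, rests on random walks whose node types follow a fixed meta-path. The sketch below illustrates only that sampling step on a toy network. The node names, the edge list, and the meta-path circRNA-disease-circRNA are hypothetical and chosen for illustration; the abstract does not state which meta-paths or network layout PCD_MVMF actually uses, and the similarity edges of the integrated network are omitted here.

```python
import random

# Toy heterogeneous network (hypothetical data, for illustration only).
# Each node maps to its neighbours; the node type is the prefix before ":".
EDGES = {
    "circ:c1": ["dis:d1", "dis:d2"],
    "circ:c2": ["dis:d2"],
    "circ:c3": ["dis:d1", "dis:d3"],
    "dis:d1": ["circ:c1", "circ:c3"],
    "dis:d2": ["circ:c1", "circ:c2"],
    "dis:d3": ["circ:c3"],
}


def node_type(node):
    """Return the type prefix of a node, e.g. 'circ' for 'circ:c1'."""
    return node.split(":", 1)[0]


def metapath_walk(start, metapath, walk_length, rng):
    """Sample one random walk whose node types follow `metapath` cyclically.

    `metapath` is a list of node types whose first and last entries match,
    e.g. ["circ", "dis", "circ"], so the pattern can be repeated.
    """
    if node_type(start) != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")
    cycle = metapath[:-1]  # drop the repeated end type so the pattern loops
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in EDGES[walk[-1]] if node_type(n) == wanted]
        if not candidates:  # dead end: no neighbour of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


# One walk of length 5 from every circRNA node, with a seeded generator.
rng = random.Random(0)
walks = [
    metapath_walk(node, ["circ", "dis", "circ"], 5, rng)
    for node in EDGES
    if node_type(node) == "circ"
]
for walk in walks:
    print(" -> ".join(walk))
```

In metapath2vec++, walks of this kind serve as the "sentences" for a skip-gram model in which the softmax is normalized separately for each node type. That training step, and the subsequent matrix factorization, are not shown here.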


Publication history

Received: 02 June 2020
Revised: 09 October 2020
Accepted: 10 October 2020
Published: 16 November 2020
Issue date: December 2020

Copyright

© The authors 2020

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61972451, 61672334, and 61902230) and the Fundamental Research Funds for the Central Universities, Shaanxi Normal University (Nos. GK201901010 and 2018TS079).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
