Volume 24, Issue 4

A Novel Method of Gene Regulatory Network Structure Inference from Gene Knock-Out Expression Data

Xiang Chen, Min Li, Ruiqing Zheng, Siyu Zhao, Fang-Xiang Wu, Yaohang Li, Jianxin Wang
School of Computer Science and Engineering, Central South University, Changsha 410083, China.
Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA.

Abstract

Inferring the structure of Gene Regulatory Networks (GRNs) from gene expression data has long been a challenging problem in systems biology. Identifying the complicated regulatory relationships among genes is critical for understanding regulatory mechanisms in cells. Various methods based on information theory have been developed to infer GRNs. However, these methods introduce many redundant regulatory relationships during network inference because of external noise in the original data, topological sparseness of the network structure, and non-linear dependency among genes. Their performance decreases dramatically as the network size increases. In this paper, a novel network structure inference method named Loc-PCA-CMI is proposed. It first identifies local overlapped gene clusters and then infers the local network structure of each cluster with a Path Consistency Algorithm based on Conditional Mutual Information (PCA-CMI). The final GRN structure, expressed as dependences among genes, is obtained as an ensemble of the local network structures. Loc-PCA-CMI was evaluated on the DREAM3 knock-out datasets, and its performance was compared with that of other information theory-based network inference methods, including ARACNE, MRNET, PCA-CMI, and PCA-PMI. Experimental results demonstrate that Loc-PCA-CMI outperforms the other four methods on the DREAM3 datasets, especially for networks of size 50 and 100.

Keywords: gene regulatory networks, network inference, path consistency algorithm
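
The two-stage idea in the abstract (prune edges by conditional mutual information inside each cluster, then combine the local networks) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made only for illustration: the Gaussian covariance-based estimator of conditional mutual information, the rule that an edge is dropped when the largest CMI over conditioning sets falls below a threshold, the threshold value 0.03, the union used as the ensemble step, and all function names (gaussian_cmi, pca_cmi_sketch, ensemble_local). The abstract does not say how the overlapped clusters are found, so they are passed in as given.

```python
from itertools import combinations

import numpy as np


def gaussian_cmi(x, y, z=None):
    """CMI(X;Y|Z) in nats under a joint-Gaussian assumption (an assumption
    of this sketch). x, y are 1-D sample vectors; z is (samples, k) or None,
    in which case plain mutual information MI(X;Y) is returned."""
    def logdet(*cols):
        cov = np.atleast_2d(np.cov(np.column_stack(cols), rowvar=False))
        return np.linalg.slogdet(cov)[1]

    if z is None:
        return 0.5 * (logdet(x) + logdet(y) - logdet(x, y))
    z = np.asarray(z).reshape(len(x), -1)
    return 0.5 * (logdet(x, z) + logdet(y, z) - logdet(z) - logdet(x, y, z))


def pca_cmi_sketch(expr, threshold=0.03, max_order=2):
    """Path-consistency style pruning on expr of shape (samples, genes).
    Starts from a complete graph and removes edge i-j when its dependence
    score at the current conditioning order is below the threshold.
    The threshold value is arbitrary and purely illustrative."""
    n_genes = expr.shape[1]
    adj = ~np.eye(n_genes, dtype=bool)
    for order in range(max_order + 1):
        for i, j in combinations(range(n_genes), 2):
            if not adj[i, j]:
                continue
            # Genes currently adjacent to both i and j.
            common = [k for k in range(n_genes) if adj[i, k] and adj[j, k]]
            if len(common) < order:
                continue
            if order == 0:
                score = gaussian_cmi(expr[:, i], expr[:, j])
            else:
                # Assumed rule: keep the edge if any conditioning set of
                # this size leaves a dependence above the threshold.
                score = max(
                    gaussian_cmi(expr[:, i], expr[:, j], expr[:, list(s)])
                    for s in combinations(common, order)
                )
            if score < threshold:
                adj[i, j] = adj[j, i] = False
    return adj


def ensemble_local(expr, clusters, **kwargs):
    """Run the pruning inside each (possibly overlapping) cluster of gene
    indices and take the union of the local edges (assumed ensemble rule)."""
    n_genes = expr.shape[1]
    adj = np.zeros((n_genes, n_genes), dtype=bool)
    for genes in clusters:
        idx = np.asarray(genes)
        adj[np.ix_(idx, idx)] |= pca_cmi_sketch(expr[:, idx], **kwargs)
    return adj
```

On a simulated chain X -> Y -> Z, the mutual information between X and Z is large while the conditional mutual information given Y is close to zero, so first-order pruning removes the indirect X-Z edge and keeps the two direct edges. Restricting the search to small clusters also limits the number of conditioning sets that have to be examined per edge.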


Publication history

Received: 05 April 2018
Accepted: 01 May 2018
Published: 07 March 2019
Issue date: August 2019

Copyright

© The author(s) 2019

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61622213 and 61732009), the 111 Project (No. B18059), and the Hunan Provincial Science and Technology Program (No. 2018WK4001).
