Volume 25, Issue 5

BAM: A Block-Based Bayesian Method for Detecting Genome-Wide Associations with Multiple Diseases

Guanying Wu, Xuan Guo, Baohua Xu
Dental Center of China-Japan Friendship Hospital, Beijing 100029, China.
Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, USA.

Abstract

Many human diseases involve multiple genes in complex interactions. Large Genome-Wide Association Studies (GWASs) are considered to hold promise for unraveling such interactions. However, statistical tests for high-order epistatic interactions (≥ 2 Single Nucleotide Polymorphisms (SNPs)) raise enormous computational and analytical challenges. It is well known that a block-wise structure exists in the human genome due to Linkage Disequilibrium (LD) between adjacent SNPs. In this paper, we propose a novel Bayesian method, named BAM, that simultaneously partitions SNPs into LD-blocks and detects genome-wide multi-locus epistatic interactions associated with multiple diseases. Experimental results on simulated datasets demonstrate that BAM is powerful and efficient. We also applied BAM to two GWAS datasets from the WTCCC, namely Rheumatoid Arthritis and Type 1 Diabetes, and accurately recovered the LD-block structure. We therefore believe that BAM is suitable and efficient for the full-scale analysis of multi-disease-related interactions in GWASs.
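
To give a rough sense of the two ideas the abstract relies on, the sketch below (i) counts how many SNP sets an exhaustive scan of a given interaction order would have to test, and (ii) groups consecutive SNPs into blocks when adjacent markers are strongly correlated. This is not BAM's Bayesian partitioning: the greedy threshold rule, the function names, the 0/1/2 genotype coding, and the 0.8 cutoff are all illustrative assumptions.

```python
from math import comb


def count_snp_combinations(num_snps, order):
    """Number of distinct SNP sets of size `order` among `num_snps` markers."""
    return comb(num_snps, order)


def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (coded 0/1/2).

    Used here as a simple proxy for pairwise LD; returns 0.0 when either
    SNP has no variation, since the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def greedy_ld_blocks(genotypes, threshold=0.8):
    """Partition consecutive SNPs into blocks (hypothetical heuristic).

    `genotypes[j]` holds the genotypes of SNP j across all individuals.
    A new block starts whenever r^2 between neighbouring SNPs drops
    below `threshold`. Returns a list of lists of SNP indices.
    """
    if not genotypes:
        return []
    blocks = [[0]]
    for j in range(1, len(genotypes)):
        if r_squared(genotypes[j - 1], genotypes[j]) >= threshold:
            blocks[-1].append(j)
        else:
            blocks.append([j])
    return blocks


if __name__ == "__main__":
    # Search-space size for pairwise and three-way scans over 500,000 SNPs.
    for k in (2, 3):
        print(k, count_snp_combinations(500_000, k))

    # Toy data: four SNPs genotyped in six individuals.
    toy = [
        [0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 2],
        [0, 0, 1, 2, 2, 1],
        [0, 0, 1, 2, 2, 1],
    ]
    print(greedy_ld_blocks(toy))
```

The combination count grows very quickly with the interaction order, which is why treating correlated neighbouring SNPs as a single block is attractive: it shrinks the number of units that have to be combined.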

Keywords: disease association study, epistasis, Linkage Disequilibrium (LD) block, Bayesian methods


Publication history

Received: 29 October 2019
Revised: 05 December 2019
Accepted: 02 January 2020
Published: 16 March 2020
Issue date: October 2020

Copyright

© The author(s) 2020

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
