Open Access

A Survey of SNP Data Analysis

School of Computer Science and Engineering, Yulin Normal University, Yulin 537000, and School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China.
Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203-5017, USA.

Abstract

Every person differs from every other in physical appearance, susceptibility to disease, response to medications, and so on, yet 99.9 percent of human DNA is the same. The differences among human genomes are therefore well worth studying. Single-Nucleotide Polymorphisms (SNPs) are the simplest form and the most common source of genetic polymorphism. SNPs have been used successfully to identify defective genes that cause Mendelian diseases. However, most common human diseases are complex and are caused by multiple SNPs, each of which explains only a small fraction of the genetic causes. Experiments on individual SNPs may therefore show no detectable effect on complex diseases. Pathogenesis is complicated, and it is difficult to correctly predict the multiple SNPs involved. As such, the analysis of SNP data is a critical task in the study of genetic diseases. In this paper, we divide the methods for genome-wide SNP data analysis into two categories: single-trait Genome-Wide Association Studies (GWAS), which mine pathology from the data of a single phenotype, and multiple-trait GWAS, which identify cross-phenotype associations. For single-trait GWAS, we review methods ranging from the simple to the complex, including TEAM, BOOST, AntEpiSeeker, SNPRuler, EDCF, HiSeeker, ORF, MLR-tagging, MSCD, and MIC. For multiple-trait GWAS, we describe methods in terms of the regression models, dimension-reduction methods, and meta-analysis methods they employ. We also list the advantages and disadvantages of these methods. Finally, we discuss future directions of SNP data analysis for genome-wide association.

Big Data Mining and Analytics
Pages 173-190
Cite this article:
Ding X, Guo X. A Survey of SNP Data Analysis. Big Data Mining and Analytics, 2018, 1(3): 173-190. https://doi.org/10.26599/BDMA.2018.9020015


Received: 12 January 2018
Accepted: 17 January 2018
Published: 24 May 2018
© The author(s) 2018