Open Access

Methods for Population-Based eQTL Analysis in Human Genetics

Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, NC 28223, USA.

Abstract

Gene expression is a critical process in biological systems that is influenced and modulated by many factors, including genetic variation. Expression Quantitative Trait Loci (eQTL) analysis provides a powerful way to understand how genetic variants affect gene expression. In genome-wide eQTL analysis, both the number of genetic variants and the number of genes are large, so the search space of variant-gene pairs is enormous, which poses computational and statistical challenges. In this paper, we provide a comprehensive review of recent advances in methods for eQTL analysis in population-based studies. We first present traditional pairwise association methods, which are widely used in human genetics. We then examine methods that correct for confounding factors in order to account for expression heterogeneity. Next, we discuss newly developed statistical learning methods, including Lasso-based models. We conclude with an overview of future directions for method development in eQTL association analysis. Although we focus on human genetics in this review, the methods are applicable to many other organisms.
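To make the pairwise association setting concrete, the following is a minimal sketch, not the procedure of any specific tool covered in the review. It tests every SNP-gene pair with a simple linear-regression t statistic and applies a Bonferroni threshold, which shows how the number of tests grows as the product of the two dimensions. The data are simulated, and all names (`pairwise_scan`, `approx_two_sided_p`, the planted SNP 3 to gene 7 effect) are hypothetical; the p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only for large sample sizes.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: no association can be estimated
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the slope of a simple linear regression (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def approx_two_sided_p(t):
    """Two-sided p-value using a normal approximation (assumption: large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def pairwise_scan(genotypes, expression, alpha=0.05):
    """Test every SNP-gene pair and keep pairs below a Bonferroni threshold."""
    n_tests = len(genotypes) * len(expression)
    threshold = alpha / n_tests
    hits = []
    for s, g in enumerate(genotypes):
        for e, y in enumerate(expression):
            r = pearson_r(g, y)
            r = max(min(r, 0.999999), -0.999999)  # guard against |r| == 1
            p = approx_two_sided_p(t_statistic(r, len(g)))
            if p < threshold:
                hits.append((s, e, p))
    return hits, n_tests, threshold


# Simulated toy data: genotypes coded as 0/1/2 minor-allele counts.
rng = random.Random(0)
n_samples, n_snps, n_genes = 200, 20, 10
genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_samples)]
             for _ in range(n_snps)]
expression = [[rng.gauss(0, 1) for _ in range(n_samples)]
              for _ in range(n_genes)]
# Plant one true association (hypothetical): SNP 3 shifts gene 7.
expression[7] = [1.0 * g + rng.gauss(0, 1) for g in genotypes[3]]

hits, n_tests, threshold = pairwise_scan(genotypes, expression)
print(n_tests, threshold, [(s, e) for s, e, _ in hits])
```

Even in this toy setting, 20 SNPs and 10 genes already yield 200 tests. The nested loop scales with the product of the number of variants and the number of genes, which is why genome-wide scans need both efficient computation and multiple-testing correction.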

Tsinghua Science and Technology
Pages 624-634
Cite this article:
Tian L, Quitadamo A, Lin F, et al. Methods for Population-Based eQTL Analysis in Human Genetics. Tsinghua Science and Technology, 2014, 19(6): 624-634. https://doi.org/10.1109/TST.2014.6961031

Received: 17 June 2014
Accepted: 24 June 2014
Published: 20 November 2014
The Author(s)