Volume 18, Issue 5

Patterns of Chromatin-Modifications Discriminate Different Genomic Features in Arabidopsis

Anuj Srivastava, Xiaoyu Zhang, Sal LaMarca, Liming Cai, Russell L. Malmberg
Institute of Bioinformatics, University of Georgia, Athens, GA 30602-7229, USA
Department of Plant Biology, University of Georgia, Athens, GA 30602-7404, USA
Department of Computer Science, University of Georgia, Athens, GA 30602, USA
The Jackson Laboratory, Bar Harbor, Maine 04609, USA

Abstract

The dynamic regulation and packaging of genetic information are achieved by the organization of DNA into chromatin. Nucleosomal core histones, which form the basic repeating unit of chromatin, are subject to various post-translational modifications such as acetylation, methylation, phosphorylation, and ubiquitinylation. These modifications affect chromatin structure and, along with DNA methylation, regulate gene transcription. The goal of this study was to determine whether patterns of these modifications are related to different categories of genomic features and, if so, whether the patterns have predictive value. We used publicly available ChIP-chip data for different types of histone modifications (methylation and acetylation) and for DNA methylation in Arabidopsis thaliana, and applied a machine learning approach (a support vector machine) to demonstrate that the patterns of these modifications differ markedly among categories of genomic features (protein, RNA, pseudogene, and transposon elements). These patterns can be used to distinguish the types of genomic features. DNA methylation and H3K4me3 emerged as the features with the most discriminative power. From our analysis of Arabidopsis, we predicted 33 novel genomic features, whose existence was also supported by analysis of RNA-seq experiments. In summary, we present a novel approach that can be used to discriminate and detect different categories of genomic features based on their patterns of chromatin modification and DNA methylation.

Keywords: machine learning, chromatin modification, DNA methylation, support vector machine, Arabidopsis
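
The classification idea summarized in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the feature columns, the toy centroids, the synthetic data, and the choice of a linear support vector machine trained with the Pegasos sub-gradient method in a one-vs-rest scheme are all assumptions made for illustration. Each genomic feature is represented here by a hypothetical vector giving the fraction of its length covered by each mark.

```python
import random

# Hypothetical feature columns: fraction of a genomic feature covered by each mark.
MARKS = ("dna_methylation", "histone_methylation", "histone_acetylation")

# Arbitrary toy centroids, chosen only so that the four classes are separable.
# They are NOT measured values and make no biological claim.
TOY_CENTROIDS = {
    "protein": (0.1, 0.9, 0.9),
    "RNA": (0.1, 0.1, 0.1),
    "pseudogene": (0.9, 0.9, 0.1),
    "transposon": (0.9, 0.1, 0.9),
}


def make_toy_data(n_per_class, rng, noise=0.05):
    """Draw synthetic (vector, label) pairs around each toy centroid."""
    samples = []
    for label, centre in TOY_CENTROIDS.items():
        for _ in range(n_per_class):
            vec = tuple(min(1.0, max(0.0, c + rng.gauss(0.0, noise))) for c in centre)
            samples.append((vec, label))
    return samples


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_pegasos(rows, targets, rng, lam=0.01, n_steps=10000):
    """Train a linear SVM (hinge loss, L2 penalty) by Pegasos sub-gradient steps."""
    w = [0.0] * len(rows[0])
    for t in range(1, n_steps + 1):
        i = rng.randrange(len(rows))
        x, y = rows[i], targets[i]
        eta = 1.0 / (lam * t)      # decaying step size
        shrink = 1.0 - eta * lam   # regularization shrinkage, equals 1 - 1/t
        if y * dot(w, x) < 1.0:    # sample violates the margin: move toward it
            w = [shrink * wj + eta * y * xj for wj, xj in zip(w, x)]
        else:
            w = [shrink * wj for wj in w]
    return w


def fit_one_vs_rest(samples, rng):
    """Fit one binary SVM per category; a trailing constant 1.0 acts as the bias."""
    rows = [x + (1.0,) for x, _ in samples]
    models = {}
    for label in TOY_CENTROIDS:
        targets = [1.0 if lab == label else -1.0 for _, lab in samples]
        models[label] = train_pegasos(rows, targets, rng)
    return models


def predict(models, x):
    """Return the category whose classifier gives the highest decision value."""
    return max(models, key=lambda label: dot(models[label], tuple(x) + (1.0,)))


rng = random.Random(0)
train = make_toy_data(50, rng)
held_out = make_toy_data(20, rng)
models = fit_one_vs_rest(train, rng)
accuracy = sum(predict(models, x) == label for x, label in held_out) / len(held_out)
print(f"held-out accuracy on synthetic toy data: {accuracy:.2f}")
```

In a real analysis the vectors would come from measured ChIP-chip and DNA methylation signals over annotated features, and the kernel and hyperparameters would be chosen by cross-validation; the toy accuracy printed here says nothing about performance on real data.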

Publication history

Received: 02 August 2013
Revised: 01 September 2013
Accepted: 02 September 2013
Published: 03 October 2013
Issue date: October 2013

Copyright

© The author(s) 2013

Acknowledgements

This work was supported by the National Science Foundation of the USA (No. IIS 0916250) and the University of Georgia Franklin College of Arts & Sciences research fund.
