Hierarchically Clustered HMM for Protein Sequence Motif Extraction with Variable Length

Cody Hudson; Bernard Chen; Dongsheng Che

doi:10.1109/TST.2014.6961032

Open Access

Department of Computer Science, University of Central Arkansas, Conway, AR 72034, USA.

Department of Computer Science, East Stroudsburg University, East Stroudsburg, PA 18301, USA.

Abstract

Protein sequence motif extraction is an important field of bioinformatics because of its relevance to structural analysis. Two major problems affect this field: (1) motifs are searched for within the same protein family; and (2) a window size must be assumed for the motif search. This work proposes the Hierarchically Clustered Hidden Markov Model (HC-HMM) approach, which represents the behavior and structure of each protein as a Hidden Markov Model chain and hierarchically clusters the chains by minimizing the distance between the structure and behavior of two given chains. It is well known that HMMs can be used for clustering; however, methods for clustering the Hidden Markov Models themselves are rarely studied. In this paper, we develop a hierarchical-clustering-based algorithm for HMMs to discover protein sequence motifs that transcend family boundaries, with no assumption about the length of the motif. The paper examines the effectiveness of this approach for motif extraction on 2593 proteins that share no more than 25% sequence identity. Many interesting motifs are generated. Three example motifs generated by the HC-HMM approach are analyzed and visualized with their tertiary structure. We believe the proposed method provides a unique protein sequence motif extraction strategy. Related data mining fields that use Hidden Markov Models may also benefit from this approach of clustering the HMMs themselves.
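
As a rough illustration of the idea of clustering sequence models by pairwise distance, the following Python sketch may help. It is not the authors' HC-HMM. It substitutes plain first-order Markov chains for HMMs, a toy three-letter alphabet for amino acids, Frobenius distance between transition matrices, and average linkage. All function names, the alphabet, and the distance measure are assumptions made for this example.

```python
from itertools import combinations

ALPHABET = "HEC"  # hypothetical toy alphabet, not the 20 amino acids


def transition_matrix(seq, alphabet=ALPHABET):
    """Estimate a row-normalised first-order transition matrix (add-one smoothing)."""
    idx = {s: i for i, s in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[1.0] * n for _ in range(n)]  # start every count at 1 to avoid zero rows
    for a, b in zip(seq, seq[1:]):
        counts[idx[a]][idx[b]] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def chain_distance(p, q):
    """Frobenius distance between two transition matrices (an assumed metric)."""
    return sum((x - y) ** 2 for rp, rq in zip(p, q) for x, y in zip(rp, rq)) ** 0.5


def cluster_distance(c1, c2, models):
    """Average-linkage distance between two clusters of model indices."""
    total = sum(chain_distance(models[i], models[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_cluster(seqs, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    models = [transition_matrix(s) for s in seqs]
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > n_clusters:
        # combinations yields a < b, so deleting index b leaves index a valid
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]], models),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    toy = ["HHHHHHHH", "HHHHHHHC", "ECECECEC", "CECECECE"]
    print(hierarchical_cluster(toy, n_clusters=2))
```

In this toy setting, sequences with similar transition behavior end up in the same cluster regardless of any family label, which is the intuition behind motifs that cross family boundaries. A real implementation would need emission probabilities, a distance suited to HMMs, and a stopping criterion, none of which are specified in the abstract.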

Keywords

Hidden Markov Model; hierarchical clustering; sequential motif; bioinformatics

Tsinghua Science and Technology

Volume 19, Issue 6, December 2014

Pages 635-647

DOI: 10.1109/TST.2014.6961032

Cite this article:

Hudson C, Chen B, Che D. Hierarchically Clustered HMM for Protein Sequence Motif Extraction with Variable Length. Tsinghua Science and Technology, 2014, 19(6): 635-647. https://doi.org/10.1109/TST.2014.6961032

Received: 23 June 2014

Accepted: 30 June 2014

Published: 20 November 2014

The Author(s)