Survey on Encoding Schemes for Genomic Data Representation and Feature Learning—From Signal Processing to Machine Learning

Ning Yu; Zhihua Li; Zeng Yu

doi:10.26599/BDMA.2018.9020018


Open Access


Ning Yu, Zhihua Li, Zeng Yu

∙ Department of Computing Sciences, College at Brockport, State University of New York, Brockport, NY 14422, USA.

∙ Department of Computer Science and Technology at Jiangnan University, Wuxi 214122, China.

∙ School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China.


Abstract

Data-driven machine learning, especially deep learning, is becoming an important tool for handling big data problems in bioinformatics. In machine learning, DNA sequences are often converted to numerical values for data representation and feature learning in various applications. A similar conversion occurs in Genomic Signal Processing (GSP), where genome sequences are transformed into numerical sequences for signal extraction and recognition. This kind of conversion is called an encoding scheme. The choice of encoding scheme can greatly affect the performance of GSP applications and machine learning models. This paper collects, analyzes, discusses, and summarizes the existing encoding schemes for genome sequences, particularly in GSP as well as in other genome analysis applications, to provide a comprehensive reference for genomic data representation and feature learning in machine learning.

Keywords

encoding scheme; data representation; feature learning; deep learning; genomic signal processing; machine learning; genome analysis
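To make the idea of an encoding scheme concrete, the sketch below converts a DNA string into numbers in two common ways: a one-hot (binary indicator) encoding and a plain integer labeling. This is an illustration only. The abstract does not single out any particular scheme, and the function names, the A/C/G/T column order, and the handling of unknown symbols are assumptions made for this example.

```python
# Illustrative sketch: two simple ways to turn a DNA string into numbers.
# The schemes, names, and base order here are assumptions for illustration,
# not the specific schemes surveyed in the paper.

BASES = "ACGT"  # assumed column order for the indicator vectors


def one_hot_encode(sequence):
    """Map each nucleotide to a 4-element indicator vector.

    Symbols outside A/C/G/T (for example N) become all-zero rows.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


def integer_encode(sequence):
    """Map each nucleotide to an integer label: A=0, C=1, G=2, T=3.

    Unknown symbols are mapped to -1. The label values are arbitrary.
    """
    lookup = {b: i for i, b in enumerate(BASES)}
    return [lookup.get(base, -1) for base in sequence.upper()]


if __name__ == "__main__":
    print(one_hot_encode("ACGTN"))
    print(integer_encode("ACGTN"))
```

The two outputs carry different implicit assumptions: the integer labels impose an ordering on the bases that has no biological meaning, while the indicator vectors treat all four bases as equally distant. Differences of this kind are one reason the choice of encoding can affect downstream model performance.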


Big Data Mining and Analytics

Volume 1, Issue 3, September 2018

Pages 191-210

DOI: 10.26599/BDMA.2018.9020018


Cite this article:

Yu N, Li Z, Yu Z. Survey on Encoding Schemes for Genomic Data Representation and Feature Learning—From Signal Processing to Machine Learning. Big Data Mining and Analytics, 2018, 1(3): 191-210. https://doi.org/10.26599/BDMA.2018.9020018


Received: 21 January 2018

Accepted: 24 January 2018

Published: 24 May 2018