Open Access | Just Accepted

CGABepi: A deep learning framework for linear B cell epitope prediction using physicochemical property encoding

Meng Wang¹, Min Zeng¹, Wei Fan², Tianrui Wu¹, Min Li¹

1 School of Computer Science and Engineering, Central South University, Changsha 410083, China

2 Nuffield Department of Women’s and Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK


Abstract

The prediction of linear B cell epitopes is crucial for understanding the mechanisms of B cell immunity, accelerating the screening of B cell epitopes, and expediting the development of related drugs. Most current prediction methods combine features such as amino acid composition and k-mers with conventional machine learning models. However, these methods usually ignore information hidden in linear B cell epitopes, such as the positions of amino acids within the sequence and their physicochemical properties, resulting in poor prediction performance. To address this limitation, we develop CGABepi, a deep learning framework based on amino acid and physicochemical feature encoding. CGABepi employs convolutional neural networks to capture local associations between amino acids and a BiGRU to capture contextual relationships across the sequence. To verify the superiority of the CGABepi architecture, we conduct extensive, fair comparative experiments. We train CGABepi on the datasets of two existing methods (epitope1D and NetBCE), and in both cases it performs significantly better than the original method. The ablation study confirms the importance of each module in CGABepi, demonstrating that the architecture is well suited to predicting linear B cell epitopes. Additionally, we compare results on four independent test sets, and CGABepi achieves the best results on all of them. Finally, we apply CGABepi to two epitope datasets, one for SARS-CoV-1 and one for SARS-CoV-2, and predict their epitopes successfully. Notably, 7 of the 10 SARS-CoV-1 epitopes are identified with ultra-high confidence, with predicted scores exceeding 99.9%. Taken together, these results demonstrate that CGABepi is currently the state-of-the-art method for linear B cell epitope prediction.

Big Data Mining and Analytics

Cite this article:
Wang M, Zeng M, Fan W, et al. CGABepi: A deep learning framework for linear B cell epitope prediction using physicochemical property encoding. Big Data Mining and Analytics, 2026, https://doi.org/10.26599/BDMA.2025.9020126

Views: 595; Downloads: 158; Crossref: 0; Web of Science: 0; Scopus: 0; CSCD: 0

Received: 14 May 2025
Revised: 17 November 2025
Accepted: 22 December 2025
Available online: 11 March 2026

© The author(s) 2026.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).