Regular Paper Issue
scMCG: Analyzing a Single-Cell Assay for Transposase-Accessible Chromatin Using Sequencing Data Based on Contrastive Learning and Generative Adversarial Network
Journal of Computer Science and Technology 2025, 40(6): 1639-1649
Published: 01 November 2025

The development of the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) has significantly advanced the study of cell heterogeneity in the epigenetic landscape. Numerous studies have leveraged scATAC-seq data to explore deeper gene regulatory relationships. However, scATAC-seq data often suffer from dropout events, which can result in data sparsity and noise. In this work, we propose a method (scMCG) for analyzing scATAC-seq data that employs contrastive learning and a generative adversarial network (GAN). First, the scMCG method uses two distinct encoders for contrastive learning to address the issues of feature redundancy and data sparsity in scATAC-seq data. Subsequently, a generator is used to reconstruct the latent embedding. Finally, a decoder is used to generate binary accessibility. We conduct experiments on multiple scATAC-seq datasets. The results demonstrate that the scMCG method achieves excellent performance in multiple tasks such as cell clustering and transcription factor activity influence.
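
The abstract does not state which contrastive objective the two encoders are trained with. As a rough illustration only, the sketch below computes an InfoNCE-style loss between two sets of cell embeddings, assuming row i of each view comes from the same cell. The function names, the cosine similarity, and the temperature value are assumptions for illustration and are not taken from the scMCG paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; row i of view_a is the positive for row i of view_b.

    Hypothetical helper: the temperature and similarity are illustrative choices.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy of picking the matching cell among all candidates.
        total += log_denominator - logits[i]
    return total / len(view_a)
```

The loss is small when each cell's two embeddings agree and large when a cell's embedding is closer to a different cell in the other view.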

Open Access Issue
CLTDA: Identifying Associations Between tsRNAs and Diseases Based on Contrastive Learning
Big Data Mining and Analytics 2025, 8(6): 1324-1334
Published: 19 September 2025

Increasing evidence has highlighted the significant association between tsRNAs and diseases. Predicting potential tsRNA-disease associations with computational methods can effectively reduce human and resource consumption. However, computational methods for predicting tsRNA-disease associations are scarce. Therefore, we propose Contrastive Learning-based prediction of tsRNA-Disease Associations (CLTDA). It reconstructs known associations between tsRNAs and diseases based on adaptive Singular Value Decomposition (SVD). Then, we employ Graph Convolutional Networks (GCNs) for feature extraction from both the original and reconstructed tsRNA-disease associations, and optimize the GCNs by using contrastive learning loss and Bayesian Personalized Ranking (BPR) loss. In addition, the Bayesian negative sampling method is used to select high-quality negative samples for learning the features of tsRNAs and diseases. Finally, a Multi-Layer Perceptron (MLP) is utilized to calculate the score of each potential association. We conduct five-fold cross-validation and de novo experiments on a manually collected tsRNA-disease association dataset, and the experimental results show that CLTDA outperforms six other state-of-the-art methods. We also perform a case study on lung cancer, and the experimental results show that CLTDA is an effective tool for predicting potential associations between tsRNAs and diseases.
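
For readers unfamiliar with the BPR loss mentioned above, here is a minimal sketch of its standard form, the mean of -log(sigmoid(s_pos - s_neg)) over paired scores. It assumes each known tsRNA-disease pair has already been matched with one sampled negative pair; the function name and inputs are hypothetical, and any regularization or weighting used in CLTDA is not shown.

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(pos - neg)) over paired association scores.

    Illustrative only: assumes one sampled negative per known association.
    """
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        diff = pos - neg
        # -log(sigmoid(d)) equals log(1 + exp(-d)); this form avoids overflow.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)
```

The loss shrinks as known associations are scored above their sampled negatives, which is why the quality of the negative samples matters for training.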

Open Access Issue
Transformer-Based Single-Cell Language Model: A Survey
Big Data Mining and Analytics 2024, 7(4): 1169-1186
Published: 04 December 2024

Transformers have achieved significant success in natural language processing owing to their outstanding parallel processing capabilities and highly flexible attention mechanism. In addition, a growing number of transformer-based studies have been proposed to model single-cell data. In this review, we attempt to systematically summarize the single-cell language models and applications based on transformers. First, we provide a detailed introduction to the structures and principles of transformers. Then, we review the single-cell language models and large language models for single-cell data analysis. Moreover, we explore the datasets and applications of single-cell language models in downstream tasks, such as batch correction, cell clustering, cell type annotation, gene regulatory network inference, and perturbation response. Further, we discuss the challenges of single-cell language models and provide promising research directions. We hope this review will serve as an up-to-date reference for researchers who are interested in single-cell language models.
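
As a small illustration of the attention mechanism the survey refers to, the sketch below implements single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, on plain Python lists. It is a teaching sketch under simplifying assumptions (no masking, no learned projections, no multiple heads) and is not drawn from any specific single-cell model.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; each argument is a list of row vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Because each query row is processed independently of the others, the computation parallelizes naturally, which is the property the abstract highlights.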
