Lung adenocarcinoma (LUAD) exhibits stage-specific molecular evolution and significant inter-patient heterogeneity. Many existing driver gene identification methods treat LUAD as homogeneous and ignore high-order biological associations. To overcome these limitations, a heterogeneity-aware hypergraph neural network framework was proposed, in which (1) a Deep & Cross Network (DCN)-based feature enhancement module was first employed to capture nonlinear cross-feature interactions, (2) an improved hypergraph neural network (HGNN) with a hyperedge smoothing loss was applied to precisely capture high-order gene-patient associations, and (3) an attention-guided dual-path residual fusion module was used to balance raw multi-omics features and hypergraph-learned latent features. Experimental results show that the proposed DPR-EHGNN framework achieves AUC values above 0.97 across four LUAD stages, significantly outperforming traditional machine learning methods, GNNs, and state-of-the-art tools. Its predicted pathways (e.g., MAPK signaling) and driver genes (HDAC1, TRAF6, TTN, ANK2) are strongly related to LUAD, providing a robust framework to decode LUAD’s dynamic evolution and support personalized therapy in precision oncology.
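As a rough illustration of the cross-feature interactions mentioned in step (1), the sketch below implements the cross layer from the standard Deep & Cross Network formulation, x_{l+1} = x_0 (x_l · w_l) + b_l + x_l. It is not the DPR-EHGNN module itself: the function names `cross_layer` and `cross_network` and the toy vectors are assumptions made for this example, and the abstract does not state how the authors configure their DCN-based module.

```python
from typing import List, Sequence


def cross_layer(x0: Sequence[float], xl: Sequence[float],
                w: Sequence[float], b: Sequence[float]) -> List[float]:
    """One DCN-style cross layer: x_{l+1} = x0 * (xl . w) + b + xl.

    x0 is the original input feature vector, xl the output of the
    previous cross layer, w and b the layer's weight and bias vectors.
    (Hypothetical helper for illustration only.)
    """
    if not (len(x0) == len(xl) == len(w) == len(b)):
        raise ValueError("x0, xl, w and b must have the same length")
    # Scalar projection of the current features onto the weight vector.
    scalar = sum(xi * wi for xi, wi in zip(xl, w))
    # Re-scale the original input, then add the bias and a residual term.
    return [x0_i * scalar + b_i + xl_i
            for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[Sequence[float]]) -> List[float]:
    """Stack several cross layers; each layer adds one degree of interaction."""
    x = list(x0)
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


if __name__ == "__main__":
    features = [1.0, 2.0]  # toy feature vector, not real omics data
    print(cross_network(features, [[0.5, 0.5]] * 2, [[0.0, 0.0]] * 2))
```

Because every layer multiplies by the original input `x0` again, stacking L layers yields feature interactions up to degree L + 1, while the residual term keeps the lower-order terms.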
Open Access
Issue
The interactions between circular RNAs (circRNAs) and microRNAs are one of the key mechanisms determining the functions of non-coding RNAs (ncRNAs) in biological processes such as DNA methylation and RNA-induced silencing. Studying these relationships can deepen our understanding of these RNAs’ roles in developing cancer vaccines and designing treatments. Therefore, we propose a knowledge-graph-enhanced pre-trained Large Language Model (LLM) for predicting circRNA-microRNA interactions. Our approach employs graph contrastive learning to represent, from multiple views, a knowledge graph consisting of circRNA and microRNA entities. The features of these entities are derived by fine-tuning a sequence LLM on the two types of ncRNAs separately. Finally, the embeddings are fed into a classifier for prediction. We use an independent test set to evaluate the model’s performance and compare our model with recently reported models on two datasets. Our model achieves an improvement of approximately 3% in Area Under the Receiver Operating Characteristic Curve (AUROC), reaching 93.77% and 93.07%, respectively. The stability of our model is tested by 10-fold cross-validation on the remaining training set, where our model is the most stable. In the ablation study, we comprehensively compare strategies for sequence processing and the effectiveness of each independent module. Finally, on a case study dataset derived from real-world scenarios, the model assigns scores to all candidates and ranks them accordingly. Among the 10 highest-scoring results, 7 have been validated by wet-lab experiments, highlighting the model’s strong generalization capability.
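The case-study evaluation described above (score all candidates, rank them, count how many of the top 10 are experimentally validated) can be sketched as follows. This is a minimal illustration under stated assumptions: the function `top_k_hits`, the candidate identifiers, and the scores are all hypothetical and are not taken from the paper's data.

```python
from typing import Dict, List, Set, Tuple


def top_k_hits(scores: Dict[str, float], validated: Set[str],
               k: int = 10) -> Tuple[List[str], int]:
    """Rank candidates by predicted score and count validated ones in the top k.

    scores maps a candidate identifier (e.g. a circRNA-microRNA pair) to
    the model's predicted score; validated holds identifiers confirmed by
    experiment. (Hypothetical helper for illustration only.)
    """
    # Highest score first; ties keep their insertion order (sorted is stable).
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    top = ranked[:k]
    hits = sum(1 for name in top if name in validated)
    return top, hits


if __name__ == "__main__":
    # Made-up candidate names and scores, used only to exercise the function.
    toy_scores = {"pair_a": 0.91, "pair_b": 0.15, "pair_c": 0.62, "pair_d": 0.40}
    toy_validated = {"pair_a", "pair_b"}
    top, hits = top_k_hits(toy_scores, toy_validated, k=2)
    print(top, hits)
```

With `k=10` and the paper's case-study candidates, the second returned value would correspond to the "7 of the top 10" figure reported above.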
Open Access
Issue
Identifying cancer driver genes is of paramount significance in elucidating the intricate mechanisms underlying cancer development, progression, and therapeutic interventions. Abundant omics data and interactome networks provided by numerous extensive databases enable the application of graph deep learning techniques that incorporate network structures into the deep learning framework. However, most existing models focus primarily on an individual network, inevitably neglecting the incompleteness and noise of interactions. Moreover, class imbalance among samples in driver gene identification hampers model performance. To address these issues, we propose a novel deep learning framework, MMGN, which integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN not only enhances gene feature learning based on mutual information and a consensus regularizer, but also achieves balanced classes of positive and negative samples for model training. The reliability of MMGN has been verified by the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). We believe MMGN has the potential to provide new prospects in precision oncology and may find broader applications in predicting biomarkers for other intricate diseases. Implementations of MMGN can be found at https://github.com/xingyili/MMGN.
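For readers unfamiliar with the two metrics named above, the following sketch computes AUROC as the fraction of positive-negative pairs that are ranked correctly, and approximates AUPRC by average precision, a common estimator. It is an illustration only and not MMGN's evaluation code; the function names and the toy labels and scores are assumptions, and the average-precision routine does not give tied scores any special treatment.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision values at each positive's rank."""
    order: List[int] = sorted(range(len(scores)),
                              key=lambda i: scores[i], reverse=True)
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    found = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            found += 1
            total += found / rank  # precision at this positive's rank
    return total / n_pos


if __name__ == "__main__":
    y = [1, 0, 1, 0]           # toy labels: 1 = driver gene, 0 = non-driver
    s = [0.9, 0.8, 0.7, 0.1]   # toy predicted scores
    print(auroc(y, s), average_precision(y, s))
```

AUPRC is usually the more informative of the two when positives are rare, which is the imbalanced setting the abstract describes for driver gene identification.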