Multi-class cardiac segmentation is important in medical imaging because it provides accurate information about cardiac structures and assists clinical diagnosis. However, when multi-class semantic segmentation models are trained on high-resolution cardiac images, repeated downsampling loses deep features, which leads to discontinuous organs and incorrectly segmented edges in the segmented heart. To address this, this paper proposes 3DCSNet, a network based on self-attention and 3D convolution for multi-class cardiac segmentation. Specifically, the proposed network introduces a 3D feature fusion module and a 3D spatial perception module into the segmentation network. The 3D feature fusion module integrates self-attention and 3D convolution for parallel feature extraction, efficiently allocating attention weights within and between channels at the same feature-map dimension. The 3D spatial perception module uses self-attention to capture positional correlations between different dimensions, which avoids losing important information during downsampling and retains key deep features. Experimental results show that the proposed 3DCSNet outperforms several existing models on a publicly available 3D computed tomography image dataset (ImageCHD).
Open Access
Issue
Coarctation of the aorta (CoA) is a congenital malformation of the aortic arch with a poor natural prognosis that requires early intervention and sometimes emergency surgery. Postoperative aortic re-coarctation nevertheless remains possible. At present, re-coarctation is predicted mainly through doctors' risk-factor analysis of patients' clinical characteristics combined with echocardiography (ultrasound cardiogram) data, an approach that is prone to misdiagnosis. This paper proposes a multimodal detection framework based on the Swin-Unet network that uses computed tomography (CT) images of the patient's heart together with the patient's clinical data. The framework performs multimodal feature fusion analysis by combining the Swin-Unet network with machine learning models, aiming at early detection of aortic re-coarctation. Experimental results on a clinical dataset show that the proposed method predicts aortic re-coarctation more effectively than traditional prediction methods that use clinical data. In particular, we verify the risk factors related to re-coarctation, and the results provide a reference for clinical medicine.
Accurate segmentation of the left ventricular endocardium from cardiac magnetic resonance imaging, which yields the left ventricular region, is an important step in the analysis of cardiac function. Reinforcement learning methods that segment the endocardium by locating its edges are prone to localization deviation, which degrades segmentation performance. To address this, this paper proposes a direction-constrained reinforcement learning method for left ventricular endocardium segmentation that divides the task into two stages. In the first stage, the method extracts global edge features of the endocardium; in the second stage, reinforcement learning iteratively locates endocardial edge points to trace the edge and thereby obtain the segmented left ventricular endocardium. The method constrains the direction in which the agent moves during positioning, which reduces localization deviation and overlap and thus improves segmentation accuracy. Experimental results on two public datasets, the Automated Cardiac Diagnosis Challenge (ACDC) and the Sunnybrook Cardiac MR Left Ventricle Segmentation Challenge (Sunnybrook), show that the proposed method is more accurate than the compared methods. Specifically, the F1-scores of the proposed method are 0.9482 and 0.9387, and the average perpendicular distances (APD) are 3.5863 and 4.9447, indicating that it segments the left ventricular endocardium effectively.
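The contour-distance metric mentioned above can be approximated with a short sketch. This is not the paper's evaluation code: true APD measures the distance along the perpendicular to the contour, whereas the hypothetical `mean_contour_distance` below uses the simpler nearest-point distance, and the contours are made-up toy points with no stated unit.

```python
import math


def mean_contour_distance(pred_contour, ref_contour):
    """Average distance from each predicted contour point to its nearest
    reference contour point (a nearest-point stand-in for APD)."""
    if not pred_contour or not ref_contour:
        raise ValueError("both contours must contain at least one point")
    total = 0.0
    for px, py in pred_contour:
        # Distance from this predicted point to the closest reference point.
        total += min(math.hypot(px - rx, py - ry) for rx, ry in ref_contour)
    return total / len(pred_contour)


# Toy example: the predicted edge is the reference edge shifted by one unit.
reference = [(0, 0), (0, 1), (0, 2)]
predicted = [(1, 0), (1, 1), (1, 2)]
distance = mean_contour_distance(predicted, reference)
```

A lower value means the predicted edge points lie closer to the reference contour, which is the sense in which a smaller APD indicates better localization.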
When hypergraphs for Alzheimer’s disease (AD) classification are constructed from average blood oxygen level dependent (BOLD) sequences, building them from a limited number of time points loses critical details in the regions of interest (ROI) of the subjects’ brains. To address this problem, a multi-hypergraph fusion optimization model for AD classification is proposed. The model applies a sliding window to the BOLD sequences and, within each window, extracts nonlinear high-order relationships between brain regions to construct multiple hypergraphs. Taking into account the subtle differences in hyperedge feature vectors across windows, it extracts and fuses hypergraph features based on the functional connectivity and similarity relationships between hyperedges. It then builds an fMRI hypergraph attention neural network (FHyperGAT) that incorporates attention mechanisms to identify functional connectivity features between brain regions in the fused hypergraph data. Experimental results show that the proposed method improves classification performance on the AD/normal control (NC) classification task by 10 percentage points compared with the hypergraph convolutional network model (HyperGCN), which demonstrates the effectiveness of the model.
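The sliding-window step can be illustrated with a minimal sketch. The window slicing follows the description above, but the hyperedge rule is an assumption: the abstract does not say how the nonlinear high-order relationships are extracted, so the hypothetical `knn_hyperedges` below substitutes a simple linear stand-in (each ROI grouped with its `k` most strongly Pearson-correlated ROIs). All names and the toy data are illustrative.

```python
import math


def sliding_windows(bold, window, stride):
    """bold: one time series (list of floats) per ROI, all of equal length.
    Returns one sub-matrix per window, keeping the ROI order."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    n_time = len(bold[0])
    starts = range(0, n_time - window + 1, stride)
    return [[roi[s:s + window] for roi in bold] for s in starts]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def knn_hyperedges(window_data, k):
    """Assumed stand-in rule: one hyperedge per ROI, containing that ROI and
    the k ROIs with the largest absolute correlation to it."""
    edges = []
    for i, series_i in enumerate(window_data):
        others = [j for j in range(len(window_data)) if j != i]
        others.sort(key=lambda j: abs(pearson(series_i, window_data[j])),
                    reverse=True)
        edges.append(sorted([i] + others[:k]))
    return edges


# Toy data: 3 ROIs, 6 time points; ROI 1 is a scaled copy of ROI 0.
bold = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
windows = sliding_windows(bold, window=4, stride=2)
hypergraphs = [knn_hyperedges(w, k=1) for w in windows]  # one per window
```

Each window yields its own hypergraph, so a longer sequence or a smaller stride produces more hypergraphs to fuse.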
Alzheimer’s disease (AD), as a progressive neurodegenerative disorder, presents significant challenges in early diagnosis and clinical intervention. In medical imaging, structural magnetic resonance imaging (sMRI) captures brain atrophy and structural alterations through high-resolution anatomical imaging, while fluorodeoxyglucose positron emission tomography (FDG-PET) effectively reflects functional changes by monitoring cerebral glucose metabolism. These two modalities hold complementary value in detecting AD-related pathological brain changes. However, existing multimodal AD classification models are limited by suboptimal feature fusion, insufficient inter-modal information interaction, and feature distribution discrepancies, hindering their diagnostic utility. To address these issues, a bimodal iterative cross-attention fusion ensemble framework (BICAFEF) is proposed. This framework comprises base classifiers and a meta-classifier. The base classifiers employ ResNet modules to extract features from sMRI and FDG-PET image patches. A spatial feature shrinking (SFS) module, integrating convolutional operations and adaptive aggregation pooling, is designed to reduce inter-modal redundancy and emphasize discriminative features. Additionally, an iterative cross-attention mechanism is constructed to dynamically capture and reinforce global dependencies and complementary information across modalities through multi-round iterations, thereby resolving the challenge of insufficiently exploiting inter-modal synergies and enhancing AD classification performance. To further improve whole-brain classification accuracy, the framework incorporates a meta-classifier to screen and ensemble base classifiers by discarding those with accuracy below 75%, retaining high-performance classifiers to boost robustness and precision.
Visualization analyses validate the framework’s focus on critical brain regions, demonstrating its capability to effectively identify AD-related pathological areas in sMRI and PET modalities. Experimental results show that the framework achieves a five-fold classification accuracy (ACC) of 94.3%, sensitivity (SEN) of 92.6%, specificity (SPE) of 96.3%, AUC of 97.5%, and Matthews correlation coefficient (MCC) of 88.7% in AD vs. healthy control (HC) classification, outperforming state-of-the-art multimodal frameworks.
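The screening step of the ensemble can be sketched as follows. The 75% accuracy cut-off comes from the abstract; everything else is an assumption for illustration. In particular, the abstract does not describe how the meta-classifier combines the retained base classifiers, so the hypothetical `screen_and_vote` below uses plain majority voting, and the classifier names, accuracies, and predictions are invented toy values.

```python
def screen_and_vote(predictions, accuracy, threshold=0.75):
    """Discard base classifiers whose accuracy is below `threshold`, then
    fuse the remaining binary predictions (1 = AD, 0 = HC) by majority vote.

    predictions: dict mapping classifier name -> list of 0/1 labels
    accuracy:    dict mapping classifier name -> accuracy in [0, 1]
    """
    kept = [name for name, acc in accuracy.items() if acc >= threshold]
    if not kept:
        raise ValueError("no base classifier reaches the threshold")
    n_subjects = len(predictions[kept[0]])
    fused = []
    for i in range(n_subjects):
        votes = sum(predictions[name][i] for name in kept)
        # Strict majority is required for the positive class; ties give 0.
        fused.append(1 if 2 * votes > len(kept) else 0)
    return kept, fused


# Toy values: four patch-level classifiers, three subjects.
accuracy = {"patch_a": 0.81, "patch_b": 0.70, "patch_c": 0.78, "patch_d": 0.90}
predictions = {
    "patch_a": [1, 0, 1],
    "patch_b": [0, 1, 0],
    "patch_c": [1, 0, 0],
    "patch_d": [1, 1, 1],
}
kept, fused = screen_and_vote(predictions, accuracy)
```

In this toy run `patch_b` is dropped, so its dissenting votes no longer influence the fused prediction.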
The accurate segmentation of lung tumors plays a crucial role in tumor diagnosis and treatment. However, lung tumor segmentation is often challenged by issues such as low contrast between lesions and surrounding tissues, adhesion between tumors and normal tissue, and high background noise. To address these issues, this study introduces a lung tumor segmentation method based on Transformer and attention mechanisms. In the Transformer encoder stage, both global and local attention mechanisms are incorporated so that the network attends to global and local contextual information simultaneously. In the skip connection stage, a channel-prior convolutional attention mechanism is used to enhance spatial perception of complex lesions and reduce redundancy in the channel dimension, thereby improving tumor segmentation accuracy. Experimental results on the private GDPH and public LUNG1 datasets show that the proposed method outperforms eight comparative methods in terms of the Dice metric, achieving approximately 90.96% and 88.18% on the two datasets, respectively. The proposed method can provide reliable assistance for clinical diagnosis and treatment.
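The difference between global and local attention can be shown with a minimal sketch. It is not the paper's encoder: the hypothetical `self_attention` below uses the tokens directly as queries, keys, and values (no learned projections), works on a 1D token sequence, and treats "local" as a fixed neighbourhood of `window` positions, all of which are simplifying assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, window=None):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    window=None: every token attends to all tokens (global attention).
    window=w:    each token attends only to tokens at most w positions
                 away (local attention).
    """
    dim = len(tokens[0])
    outputs = []
    for i, query in enumerate(tokens):
        allowed = [j for j in range(len(tokens))
                   if window is None or abs(i - j) <= window]
        scores = [sum(a * b for a, b in zip(query, tokens[j])) / math.sqrt(dim)
                  for j in allowed]
        weights = softmax(scores)
        # Output is the attention-weighted average of the allowed tokens.
        outputs.append([sum(w * tokens[j][d] for w, j in zip(weights, allowed))
                        for d in range(dim)])
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_out = self_attention(tokens)            # mixes all positions
local_out = self_attention(tokens, window=1)   # mixes neighbours only
```

With `window=0` each token sees only itself and is returned unchanged, which makes the restriction easy to check; how the two branches are combined in the actual network is not described in the abstract.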
Deep learning plays an essential role in the segmentation of pathological images. However, most existing deep learning methods still suffer from poor segmentation performance and limited generalization on multi-scale pathological image segmentation tasks. To address these issues, we propose a pathological image segmentation network based on multi-scale convolution and attention mechanisms. We design a multi-scale convolution attention module that extracts features at different scales and spatially captures global contextual correlations, effectively filtering redundant noise and improving the network's generalization on multi-scale pathological image data. Additionally, we design a multi-scale feature fusion module that integrates features from different scales, enhancing edge and fine-grained information in the feature maps and improving segmentation results. Experiments on the GlaS, MoNuSeg and Lizard datasets show that the proposed method achieves Dice scores of 91.07%, 81.00% and 79.87%, respectively, and IoU scores of 84.13%, 68.22% and 67.26%, respectively. This demonstrates that the proposed method can effectively segment pathological images, improve segmentation accuracy, and provide a reliable basis for clinical diagnosis.
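For readers unfamiliar with the two reported metrics, the following sketch computes Dice and IoU for a pair of binary masks. The helper name `dice_and_iou` and the toy masks are illustrative, and returning 1.0 for two empty masks is a convention chosen here, not something stated in the abstract.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two binary masks given as 2D lists of 0/1 values."""
    intersection = sum(p & t
                       for pred_row, target_row in zip(pred, target)
                       for p, t in zip(pred_row, target_row))
    pred_area = sum(map(sum, pred))
    target_area = sum(map(sum, target))
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treated here as a perfect match.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)  # intersection 2, each area 3, union 4
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU). Scores averaged over a whole dataset need not satisfy this identity exactly, since the average of a nonlinear function differs from the function of the average.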