Liver metastases are a common complication of many cancers, and accurate image segmentation is crucial for diagnosis and treatment. Manual segmentation of the liver and tumors from CT images is time-consuming and subjective. While computer-aided segmentation has been widely adopted, segmenting liver metastases remains challenging due to their variability in shape, size, and contrast. In this study, we propose a 2D network model (RL-RCUNet) that enhances the UNet architecture with an LSK module to address the issue of incomplete receptive fields. Additionally, an improved RF-CBAM module is added to the skip connections to optimize parameter sharing. Trained and tested on a dataset from collaborating hospitals, our model accurately segments the liver and liver metastases from CT images.
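The abstract does not spell out the LSK module's configuration; a rough sketch in the spirit of published large-selective-kernel designs (our assumption, not the paper's exact RL-RCUNet module) would widen the receptive field with stacked depthwise convolutions and select between scales spatially:

```python
# Hypothetical LSK-style block: two depthwise convs with growing receptive
# fields, mixed by a spatial selection gate. Kernel sizes and the gating
# scheme are illustrative assumptions, not RL-RCUNet's published design.
import torch
import torch.nn as nn

class LSKBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.dw5 = nn.Conv2d(ch, ch, 5, padding=2, groups=ch)
        # Dilated 7x7 on top of 5x5 yields a large effective receptive field.
        self.dw7 = nn.Conv2d(ch, ch, 7, padding=9, dilation=3, groups=ch)
        self.squeeze = nn.Conv2d(4, 2, 7, padding=3)   # spatial scale selection
        self.proj = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        a = self.dw5(x)                                # medium receptive field
        b = self.dw7(a)                                # large receptive field
        s = torch.cat([a.mean(1, True), a.amax(1, True),
                       b.mean(1, True), b.amax(1, True)], dim=1)
        w = torch.sigmoid(self.squeeze(s))             # (N, 2, H, W) gates
        y = a * w[:, :1] + b * w[:, 1:]                # pick scales per pixel
        return x * self.proj(y)

x = torch.randn(1, 64, 56, 56)
print(LSKBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```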
Protecting the privacy and security of healthcare data, which encompasses patient records and clinical trial data, is crucial in advanced manufacturing involving medical devices. Federated learning has emerged as a solution that enables model training across institutions without compromising data privacy or security. However, existing frameworks are often biased toward clients with larger data volumes and neglect the connection between global and local model performance. This can result in suboptimal aggregation of the global model, reducing the effectiveness and efficiency of the overall process. To address these limitations, we propose a performance-evaluation-driven federated learning framework (PedFed). The primary objective of PedFed is to enhance global model aggregation and improve communication efficiency. Our approach involves a client selection strategy based on performance evaluation of local and global models. Specifically, we introduce the concept of local model improvement (LMI), measured with Intersection over Union (IoU), for client selection in medical image segmentation scenarios. Moreover, we introduce a dynamic aggregation framework that incorporates validation IoU as a weighting factor to mitigate model divergence caused by non-independent and identically distributed (non-IID) data. We focus on image segmentation tasks to simulate the analysis of sensitive data in the healthcare domain. Experimental results on brain tumor and heart segmentation datasets demonstrate the superiority of PedFed over the baseline framework and confirm its benefits in communication efficiency.
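A minimal sketch of the two ideas, assuming each selected client reports its model state and a validation IoU (PedFed's exact selection threshold and weighting rule may differ):

```python
# LMI-based selection keeps clients whose local model improved on the
# received global model; aggregation weights clients by validation IoU.
# Both rules here are illustrative assumptions.
import copy

def select_clients(lmi_scores, threshold=0.0):
    # lmi_scores[i]: client i's local IoU minus the global model's IoU there.
    return [i for i, s in enumerate(lmi_scores) if s > threshold]

def aggregate_by_val_iou(client_states, client_ious):
    weights = [iou / sum(client_ious) for iou in client_ious]
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        # Float-cast also covers integer buffers such as BatchNorm counters.
        global_state[key] = sum(w * s[key].float()
                                for w, s in zip(weights, client_states))
    return global_state
```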
Semi-supervised learning reduces overfitting and facilitates medical image segmentation by regularizing the learning of limited well-annotated data with the knowledge provided by a large amount of unlabeled data. However, conventional semi-supervised methods often misuse and underutilize data. On the one hand, training on numerous unlabeled data can make the model deviate from the empirical distribution. On the other hand, such methods treat labeled and unlabeled data differently and ignore inter-data information. In this paper, a semi-supervised method is proposed that exploits unlabeled data to further narrow the gap between the semi-supervised model and its fully supervised counterpart. Specifically, the architecture of the proposed method is based on the mean-teacher framework, and the uncertainty estimation module is improved to impose consistency constraints and guide the selection of feature representation vectors. Notably, a voxel-level supervised contrastive learning module is devised to establish contrastive relationships among feature representation vectors, whether from labeled or unlabeled data. The supervised manner ensures that the network learns correct knowledge, and the dense contrastive relationships further extract information from unlabeled data. Together, these components overcome data misuse and underutilization in semi-supervised frameworks. Moreover, they encourage feature representations with intra-class compactness and inter-class separability, yielding additional performance gains. Extensive experiments on the left atrium dataset from the Atrial Segmentation Challenge demonstrate that the proposed method outperforms state-of-the-art methods.
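A sketch of a voxel-level supervised contrastive loss of the kind described, where embeddings sharing a (pseudo-)label attract and all others repel; the voxel sampling and exact formulation here are illustrative, not the paper's:

```python
import torch
import torch.nn.functional as F

def voxel_supcon_loss(feats, labels, temperature=0.1):
    """feats: (N, D) voxel embeddings; labels: (N,) int class labels."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature                  # cosine similarities
    eye = torch.eye(len(feats), dtype=torch.bool, device=feats.device)
    sim = sim.masked_fill(eye, float('-inf'))              # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~eye      # positive-pair mask
    n_pos = pos.sum(1)
    valid = n_pos > 0                                      # anchors with positives
    sum_pos = log_prob.masked_fill(~pos, 0.0).sum(1)
    return -(sum_pos[valid] / n_pos[valid]).mean()

# Example: 64 voxel features of dimension 32 with binary labels.
loss = voxel_supcon_loss(torch.randn(64, 32), torch.randint(0, 2, (64,)))
```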
A practical problem in supervised deep learning for medical image segmentation is the lack of labeled data, which is expensive and time-consuming to acquire. In contrast, a considerable amount of unlabeled data is available in the clinic. To make better use of the unlabeled data and improve generalization from limited labeled data, a novel semi-supervised segmentation method via multi-task curriculum learning is presented in this paper. Here, curriculum learning means that when training the network, simpler knowledge is learned first to assist the learning of more difficult knowledge. Concretely, our framework consists of a main segmentation task and two auxiliary tasks, i.e. a feature regression task and a target detection task. The two auxiliary tasks predict relatively simpler image-level attributes and bounding boxes as pseudo labels for the main segmentation task, enforcing the pixel-level segmentation result to match the distribution of these pseudo labels. In addition, to address class imbalance in the images, a bounding-box-based attention (BBA) module is embedded, enabling the segmentation network to focus more on the target region rather than the background. Furthermore, to alleviate the adverse effects of possible deviations in the pseudo labels, error tolerance mechanisms are adopted in the auxiliary tasks, including an inequality constraint and bounding-box amplification. Our method is validated on the ACDC2017 and PROMISE12 datasets. Experimental results demonstrate that, compared with the fully supervised method and state-of-the-art semi-supervised methods, our method yields much better segmentation performance on a small labeled dataset. Code is available at https://github.com/DeepMedLab/MTCL.
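A hypothetical sketch of the BBA idea: a detected box becomes a soft mask that re-weights features toward the target region, with box amplification as the error-tolerance margin. Parameter names are illustrative, not taken from the paper:

```python
import torch

def bba_attend(feat, box, amplify=0.1, floor=0.2):
    # feat: (N, C, H, W); box: (x1, y1, x2, y2) in pixel coordinates.
    n, c, h, w = feat.shape
    x1, y1, x2, y2 = box
    dw, dh = amplify * (x2 - x1), amplify * (y2 - y1)   # amplify the box
    x1, y1 = max(0, int(x1 - dw)), max(0, int(y1 - dh))
    x2, y2 = min(w, int(x2 + dw)), min(h, int(y2 + dh))
    mask = torch.full((1, 1, h, w), floor, device=feat.device)
    mask[..., y1:y2, x1:x2] = 1.0    # `floor` keeps some background signal
    return feat * mask
```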
Three-dimensional (3D) medical image segmentation plays a crucial role in medical care applications. Although various two-dimensional (2D) and 3D neural network models have been applied to 3D medical image segmentation and achieved impressive results, a trade-off remains between efficiency and accuracy. To address this issue, a novel mixture convolutional network (MixConvNet) is proposed, in which traditional 2D/3D convolutional blocks are replaced with novel MixConv blocks. In the MixConv block, 3D convolution is decomposed into a mixture of 2D convolutions from different views. Therefore, the MixConv block fully utilizes the advantages of 2D convolution while maintaining the learning ability of 3D convolution. It acts as a 3D convolution and thus can process volumetric input directly and learn inter-slice features, which are absent in the traditional 2D convolutional block. By contrast, the proposed MixConv block contains only 2D convolutions; hence, it has significantly fewer trainable parameters and a smaller computation budget than a block containing 3D convolutions. Furthermore, the proposed MixConvNet is pre-trained with small input patches and fine-tuned with large input patches to further improve segmentation performance. In experiments on the Decathlon Heart dataset and Sliver07 dataset, the proposed MixConvNet outperformed state-of-the-art methods such as UNet3D, VNet, and nnUNet.
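One way to read the MixConv decomposition is as 2D convolutions in the three orthogonal planes of the volume, implementable as anisotropic 3D kernels; the branch structure and mixing below are our illustrative assumptions:

```python
# Sketch of a MixConv-style block: "2D" convolutions in the axial, coronal,
# and sagittal planes, summed into one mixed response.
import torch
import torch.nn as nn

class MixConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Each branch convolves within one plane of the volume only.
        self.axial    = nn.Conv3d(in_ch, out_ch, (1, 3, 3), padding=(0, 1, 1))
        self.coronal  = nn.Conv3d(in_ch, out_ch, (3, 1, 3), padding=(1, 0, 1))
        self.sagittal = nn.Conv3d(in_ch, out_ch, (3, 3, 1), padding=(1, 1, 0))
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (N, C, D, H, W)
        y = self.axial(x) + self.coronal(x) + self.sagittal(x)
        return self.act(self.bn(y))

v = torch.randn(1, 1, 32, 64, 64)
print(MixConvBlock(1, 16)(v).shape)  # torch.Size([1, 16, 32, 64, 64])
```

Each anisotropic kernel has 9 weights per channel pair versus 27 for a full 3x3x3 kernel, which is where the parameter savings come from.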
Deep neural networks (DNNs) have emerged as a prominent model in medical image segmentation, achieving remarkable advancements in clinical practice. Despite the promising results reported in the literature, the effectiveness of DNNs necessitates substantial quantities of high-quality annotated training data. During experiments, we observe a significant decline in the performance of DNNs on the test set when the labels of the training dataset are corrupted, revealing inherent limitations in the robustness of DNNs. In this paper, we find that the neural memory ordinary differential equation (nmODE), a recently proposed model based on ordinary differential equations (ODEs), not only addresses this robustness limitation but also enhances performance when trained on a clean training dataset. However, ODE-based models tend to be less computationally efficient than conventional discrete models due to the multiple function evaluations required by the ODE solver. Recognizing this efficiency limitation, we propose a novel approach called nmODE-based knowledge distillation (nmODE-KD). The proposed method transfers knowledge from the continuous nmODE to a discrete layer, simultaneously enhancing the model's robustness and efficiency. The core concept of nmODE-KD is to enforce the discrete layer to mimic the continuous nmODE by minimizing the KL divergence between them. Experimental results on 18 organs-at-risk segmentation tasks demonstrate that nmODE-KD exhibits improved robustness compared to ODE-based models while also mitigating the efficiency limitation.
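A sketch of the distillation setup, using the sin-squared nmODE dynamics reported in the nmODE literature and a plain fixed-step Euler solve; the solver, temperature, and shapes are illustrative assumptions:

```python
# Teacher: an nmODE block solved with forward Euler. Student: a discrete
# layer pulled toward the teacher with a temperature-softened KL loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def nmode_teacher(gx, steps=20, dt=0.05):
    y = torch.zeros_like(gx)
    for _ in range(steps):                        # dy/dt = -y + sin^2(y + g(x))
        y = y + dt * (-y + torch.sin(y + gx) ** 2)
    return y

def nmode_kd_loss(student_logits, teacher_logits, T=2.0):
    p = F.log_softmax(student_logits / T, dim=1)  # student log-probs
    q = F.softmax(teacher_logits / T, dim=1)      # teacher probs
    return F.kl_div(p, q, reduction='batchmean') * T * T

g = nn.Linear(64, 64)        # learnable input mapping gamma(x)
student = nn.Linear(64, 64)  # discrete layer being distilled into
x = torch.randn(8, 64)
loss = nmode_kd_loss(student(x), nmode_teacher(g(x)).detach())
```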
With the rapid advancement of deep learning, computer-aided diagnosis and treatment have become crucial in medicine. UNet is a widely used architecture for medical image segmentation, and various methods for improving UNet have been extensively explored. One popular approach is incorporating transformers, though their quadratic computational complexity poses challenges. Recently, State-Space Models (SSMs), exemplified by Mamba, have gained significant attention as a promising alternative due to their linear computational complexity. Another approach, neural memory Ordinary Differential Equations (nmODEs), follows similar principles and achieves good results. In this paper, we explore the respective strengths and weaknesses of nmODEs and SSMs and propose a novel architecture, the nmSSM decoder, which combines the advantages of both approaches. This architecture possesses powerful nonlinear representation capabilities while retaining the ability to preserve input and process global information. We construct nmSSM-UNet using the nmSSM decoder and conduct comprehensive experiments on the PH2, ISIC2018, and BU-COCO datasets to validate its effectiveness in medical image segmentation. The results demonstrate the promising application value of nmSSM-UNet. Additionally, we conduct ablation experiments to verify the effectiveness of our proposed improvements to SSMs and nmODEs.
This paper presents an application of fuzzy information granulation (fuzzy IG) to medical image segmentation. Fuzzy IG derives fuzzy granules from information. In medical image segmentation, the information corresponds to an image taken from a medical scanner, and the fuzzy granules correspond to anatomical parts, namely regions of interest (ROIs). The proposed granulation method is composed of volume quantization and fuzzy merging. Volume quantization gathers similar neighboring voxels; the generated quanta are then selectively merged according to their degrees of membership in pre-defined fuzzy models that represent anatomical knowledge of medical images. The proposed method was applied to blood vessel extraction from three-dimensional time-of-flight (TOF) magnetic resonance angiography (MRA) images of the brain. The volume data studied in this work comprise about 100 contiguous volumetric MRA images. In fuzzy IG terms, the information corresponds to the volume data, and the fuzzy granules correspond to the blood vessels and fat. A physician qualitatively evaluated two- and three-dimensional images generated from the extracted blood vessels. The evaluation shows that the method can segment MRA volume data, and that fuzzy IG is applicable to, and suitable for, medical image segmentation.
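A rough NumPy sketch of the two stages, with a generic trapezoidal membership standing in for the paper's pre-defined anatomical fuzzy models (all thresholds and the quantization rule are our assumptions):

```python
# Volume quantization: connected components of intensity-binned voxels.
# Fuzzy merging: accept quanta whose mean intensity has high membership.
import numpy as np
from skimage.measure import label

def quantize(volume, n_levels=16):
    bins = np.digitize(volume, np.linspace(volume.min(), volume.max(), n_levels))
    return label(bins, connectivity=1)   # same-bin neighbors form one quantum

def trapezoid(x, a, b, c, d):
    return np.clip(np.minimum((x - a) / (b - a + 1e-9),
                              (d - x) / (d - c + 1e-9)), 0.0, 1.0)

def fuzzy_merge(volume, quanta, a, b, c, d, cut=0.5):
    out = np.zeros(volume.shape, dtype=bool)
    for q in range(1, quanta.max() + 1):
        m = quanta == q
        if trapezoid(volume[m].mean(), a, b, c, d) >= cut:
            out |= m                     # quantum joins the granule (vessel)
    return out
```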
In this paper, the standard hard C-means (HCM) clustering approach to image segmentation is modified by incorporating a weighted membership Kullback–Leibler (KL) divergence and local data information into the HCM objective function. The membership KL divergence, used for fuzzification, measures the proximity between each cluster membership function of a pixel and the locally smoothed value of the membership in the pixel's vicinity. The fuzzification weight is a function of the pixel-to-cluster-center distances. The pixel-to-cluster-center distance is composed of the original pixel-data distance plus a fraction of the distance computed from the locally smoothed pixel data. It is shown that the resulting membership function of a pixel is proportional to the locally smoothed membership function of that pixel multiplied by an exponential function of the negative pixel distance relative to the minimum distance provided by the nearest cluster center. Therefore, by incorporating the locally smoothed membership and data information in addition to the relative distance, which is more tolerant to additive noise than the absolute distance, the proposed algorithm gains a threefold noise-handling process. The presented algorithm, named local data and membership KL divergence based fuzzy C-means (LDMKLFCM), is tested on synthetic and real-world noisy images, and its results are compared with those of several FCM-based clustering algorithms.
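One plausible reading of the update described above, written out under assumed notation (our symbols, not the paper's: $u_{ik}$ is the membership of pixel $i$ in cluster $k$, $v_k$ a cluster center, overbars denote local smoothing, $\alpha$ the fraction weighting the smoothed-data distance, and $\lambda_i$ the pixel-dependent fuzzification weight):

```latex
\[
  D_{ik} \;=\; \lVert x_i - v_k \rVert^2 \;+\; \alpha\,\lVert \bar{x}_i - v_k \rVert^2,
  \qquad
  u_{ik} \;\propto\; \bar{u}_{ik}\,
  \exp\!\left(-\,\frac{D_{ik} - \min_j D_{ij}}{\lambda_i}\right),
  \qquad
  \sum_k u_{ik} \;=\; 1 .
\]
```

The subtraction of $\min_j D_{ij}$ is the "relative distance" the abstract credits with noise tolerance: only how much farther a cluster is than the nearest one matters, not the absolute distance.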
Automatic and accurate segmentation of the tumor area from rectal CT images plays a key role in the diagnosis and treatment of rectal cancer. This paper proposes the MR-U-Net network model, which extends the U-shaped structure longitudinally with an additional encoder-decoder pair forming a fifth layer, and adds a residual module horizontally to the encoder and decoder of each layer. This model is used for targeted research on automatic segmentation of rectal cancer. [H. Gao et al., Rectal tumor segmentation method based on U-Net improved model, J. Comput. Appl. 40(8) (2020) 2392–2397] also improved U-Net and used the same dataset as this paper, but achieved a Dice coefficient of only 83.15% on all targets and 87.17% on small targets. This paper evaluates the improved MR-U-Net network model with three indicators, precision, recall, and the Dice coefficient, and finds that, compared with that former work, precision reaches 95.13% (2.29% higher), recall reaches 94.28% (0.34% higher), the all-target Dice coefficient reaches 88.45% (5.3% higher), and the small-target Dice coefficient improves by 1.28%, the best result achieved in this paper. Experiments show that for datasets with extremely skewed positive and negative samples, the MR-U-Net structure, after tuning the optimizer hyperparameters, can more accurately segment rectal CT tumor lesion areas.
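An illustrative residual U-Net block of the kind the abstract adds to each layer; the exact MR-U-Net channel configuration and fifth-level layout are not reproduced here:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection so the identity path matches when channels differ.
        self.skip = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))
```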
Purpose: Accurate segmentation of medical images is critical for disease diagnosis, surgical planning and prognostic assessment. TransUNet, a hybrid CNN-Transformer-based method, extracts local features using CNN and compensates for the lack of long-range dependencies through a self-attention mechanism. However, the initial focus on extracting local features from specific regions impacts the generation of subsequent global features, thus constraining the model’s capacity to effectively capture a broader range of semantic information. Effective integration of local and global features plays a pivotal role in achieving precise and dense prediction. Therefore, we propose a novel hybrid CNN-Transformer-based method aimed at enhancing medical image segmentation.
Approach: In this study, a dual-encoder parallel structure is used to enhance the feature representation of the input image. By introducing a multi-scale adaptive feature fusion module, a fine fusion of local features across perceptual domains is realized in the decoding process. The generalized convolutional block attention module helps to increase cross-channel interactions in layers with more channels, thus enabling the fusion of local features and global representations at different resolutions during the decoding process.
Results: The proposed method achieves average DSC scores of 79.98%, 84.83% and 85.78% on the Synapse, ISIC2017 and Pediatric Pyelonephritis datasets, respectively. These scores are 2.5%, 0.56% and 0.42% higher than those of TransUNet. The best performance of 91.66% is observed on the ACDC dataset, representing improvements of 2.46% and 7.24% compared to HiFormer and DAE-Former, respectively.
Conclusions: The experimental results show that the proposed model is highly competitive in medical image segmentation, with its strongest performance on the ACDC dataset.
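A sketch of the adaptive fusion idea behind the dual-encoder design described in the Approach above: local CNN features and global transformer features at the same resolution are fused with a learned, input-dependent gate. The module name and gating form are hypothetical:

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, local_feat, global_feat):
        # g in (0,1) decides, per position and channel, which branch dominates.
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))
        return g * local_feat + (1 - g) * global_feat
```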
Efficient detection of multiple inter-related surfaces representing the boundaries of objects of interest in d-D images (d ≥ 3) is important and remains challenging in many medical image analysis applications. In this paper, we study several layered net surface (LNS) problems captured by an interesting type of geometric graphs called ordered multi-column graphs in the d-D discrete space (d ≥ 3 is any constant integer). The LNS problems model the simultaneous detection of multiple mutually related surfaces in three- or higher-dimensional medical images. Although we prove that the d-D LNS problem (d ≥ 3) on a general ordered multi-column graph is NP-hard, the (special) ordered multi-column graphs that model medical image segmentation have self-closure structures and thus admit polynomial-time exact algorithms for solving the LNS problems. Our techniques also solve the related net surface volume (NSV) problems of computing well-shaped geometric regions of an optimal total volume in a d-D weighted voxel grid. The NSV problems find applications in medical image segmentation and data mining. Our techniques yield the first polynomial-time exact algorithms for several high-dimensional medical image segmentation problems. Experiments and comparisons based on real medical data showed that our LNS algorithms and software are computationally efficient and produce highly accurate and consistent segmentation results.
Nowadays, image segmentation techniques are used in many medical applications, such as tissue culture monitoring, cell counting, and automatic measurement of organs, to assist doctors. However, high-quality segmentation results cannot be obtained without manual annotation or prior knowledge because of the high variability, noise, and other imaging artifacts in medical images. Furthermore, the unstable and continuously changing characteristics of human cells, tissues, and organs confound training-based segmentation methods. Detecting the proper contour of a region of interest, and separating single cells under overlapping conditions, are extremely challenging. In this paper, we aim for a model that can detect biological structures (e.g. cell nuclei and lung contours) with their proper morphology, even under overlapping or occluded conditions, without manual annotation or prior knowledge. We introduce a new optimal approach for automatic segmentation of medical image regions. The method first localizes the boundaries of all object regions in a microscopy image and then detects the areas by following their contours. Our model can detect and segment object regions from medical images with less computational effort. Our experimental results show that our model provides better detection on several datasets of different types of medical data and achieves a segmentation rate of more than 98% for densely connected regions.
Brain tumor segmentation from magnetic resonance (MR) images is vital for both the diagnosis and treatment of brain cancers. To alleviate noise sensitivity and improve segmentation stability, an effective hybrid clustering algorithm combined with a fast guided filter is proposed for brain tumor segmentation in this paper. Preprocessing is performed using adaptive Wiener filtering combined with a fast guided filter. Simple linear iterative clustering (SLIC) is then utilized for pre-segmentation to effectively remove scatter. During clustering, the K-means++ and Gaussian kernel-based fuzzy C-means (K++GKFCM) algorithms are combined for segmentation, and the fast guided filter is introduced into the clustering. The proposed algorithm improves both robustness to noise and the stability of the segmentation. In addition, the proposed algorithm is compared with other current segmentation algorithms. The results show that it performs better in terms of accuracy, sensitivity, specificity, and recall.
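A sketch of this pipeline using standard libraries: Wiener denoising, SLIC pre-segmentation, then K-means++ on per-superpixel mean intensity. The Gaussian-kernel FCM and fast guided filter steps are omitted here for brevity:

```python
import numpy as np
from scipy.signal import wiener
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def hybrid_segment(image, n_superpixels=400, n_clusters=4):
    denoised = wiener(image.astype(float), mysize=5)     # adaptive Wiener
    sp = slic(denoised, n_segments=n_superpixels,
              compactness=0.1, channel_axis=None)        # grayscale SLIC
    ids = np.unique(sp)
    means = np.array([[denoised[sp == i].mean()] for i in ids])
    labels = KMeans(n_clusters=n_clusters, init='k-means++',
                    n_init=10).fit_predict(means)
    return labels[np.searchsorted(ids, sp)]              # map back to pixels
```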
In current medical image segmentation networks, combining CNNs and Transformers has become a mainstream trend. However, the inherent limitations of the convolution operation in CNNs and insufficient information interaction in Transformers affect segmentation performance. To solve these problems, an integrated self-attention and convolution medical image segmentation network (ISC-TransUNet) is proposed in this paper. The network consists of an encoder, a decoder, and skip connections. First, the encoder uses a hybrid structure of BoTNet and Transformer to capture more comprehensive image information while reducing additional computational overhead. Then, the decoder uses an upsampler built from cascaded DUpsampling blocks to accurately recover pixel-level predictions. Finally, feature fusion between the encoder and decoder at different resolutions is realized by ResPath skip connections, which reduce the semantic gap between encoder and decoder. In experiments on the Synapse multi-organ segmentation dataset, compared with the baseline model TransUNet, ISC-TransUNet improved the Dice similarity coefficient by 1.13% and reduced the Hausdorff distance by 2.38% while maintaining the model size. The experimental results show that the network can effectively segment tissues and organs in medical images, which has theoretical significance and application value for intelligent clinical diagnosis and treatment.
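A sketch of a DUpsampling-style block in the spirit of the original "Decoders Matter" formulation: a learned linear reconstruction realized as a 1x1 convolution followed by a pixel shuffle, in place of bilinear upsampling. Channel sizes are illustrative:

```python
import torch
import torch.nn as nn

class DUpsampling(nn.Module):
    def __init__(self, in_ch, out_ch, scale):
        super().__init__()
        # Linearly predict scale*scale sub-pixel blocks, then rearrange them.
        self.proj = nn.Conv2d(in_ch, out_ch * scale * scale, 1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.proj(x))

x = torch.randn(1, 256, 14, 14)
print(DUpsampling(256, 64, 2)(x).shape)  # torch.Size([1, 64, 28, 28])
```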
Medical image segmentation is a key study area for image processing and analysis throughout the medical sector. Accurate and effective segmentation gives doctors a solid foundation for their diagnosis and treatment strategies. Conventional approaches in this field rely on manual feature extraction, which makes segmentation complex, costs doctors time and energy, and involves subjective evaluation that is readily susceptible to diagnostic errors. Following the impressive advances of convolutional neural networks in computer vision, researchers have applied deep learning to the segmentation of medical images. The research described here exploits the U-Net network's outstanding feature learning capability and end-to-end, fully convolutional network (FCN) processing for lung CT image segmentation. However, it is challenging for the plain U-Net to focus on the valuable, crucial information. This study therefore employs multilevel attention mechanisms on top of the U-Net to enhance the model's accuracy in lung CT image segmentation. The new model embeds a self-attention module in front of each upsampling layer of the U-Net: the module provides more detailed information by stitching in self-attention over the original image, and irrelevant and redundant information is then suppressed through the feature extraction of the upsampling layer. Comparative experiments were conducted on the 2019nCoVR dataset. The outcomes demonstrate the efficacy of the optimized model and its improved segmentation of lung CT images, with distinct advantages over existing approaches typical of medical image segmentation.
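A sketch of the placement described above: a generic self-attention block ahead of a decoder upsampling layer. It is a plain non-local style block, not the paper's exact module:

```python
import torch
import torch.nn as nn

class PreUpsampleSelfAttention(nn.Module):
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)
        self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                              align_corners=False)

    def forward(self, x):                           # x: (N, C, H, W)
        n, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)          # (N, H*W, C) tokens
        q = self.norm(seq)
        seq = seq + self.attn(q, q, q, need_weights=False)[0]
        x = seq.transpose(1, 2).reshape(n, c, h, w)
        return self.up(x)                           # attend first, then upsample
```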
In the field of medical image segmentation, deep convolutional neural networks have achieved satisfying performance over the past decade or so. However, there are shortcomings. First, convolutional neural network models cannot capture long-range dependencies in images well. Second, medical imaging datasets are typically small, which leads to a much higher risk of overfitting during training. To address these limitations, we innovatively designed the skipped features enhancer (SFE) to enhance the impact of preserved details. To capture long-range dependencies in images, the model (SFE-TransUNet) is based on the Transformer. Additionally, convolutional layers of different scales (an additional information capturer) are placed before and after the Transformer encoder to fuse features and retain more information from the original data. A gate mechanism is also introduced into the multi-head self-attention (MHSA), and finally an attention block with residual connections is employed. SFE-TransUNet was evaluated on two public medical image segmentation datasets. Experimental results show that it achieves better performance than other related Transformer-based architectures. Code available at https://github.com/xackz/SFE-TransUNet.
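A hypothetical sketch of a gated MHSA of the kind mentioned above: a sigmoid gate computed from the input modulates the attention output before the residual connection. The paper's exact gating is not reproduced here:

```python
import torch
import torch.nn as nn

class GatedMHSA(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):                 # x: (N, L, dim)
        h = self.norm(x)
        a, _ = self.attn(h, h, h, need_weights=False)
        g = torch.sigmoid(self.gate(h))   # per-token, per-channel gate
        return x + g * a                  # gated residual attention
```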
Coronary artery disease is a prominent contributor to cardiovascular death. Automatic segmentation of Coronary Angiography (CAG) images is crucial for early diagnosis and treatment and holds significant importance in clinical diagnosis, surgical planning, and treatment evaluation. However, poor image quality, complex backgrounds, and fine vessel structures have always made automatic segmentation difficult. This study aims to improve the accuracy and robustness of automatic CAG segmentation. We therefore propose a UNet model improved with the Convolutional Block Attention Module (CBAM), which enhances the model's ability to capture important feature areas by combining the Channel Attention Module (CAM) and the Spatial Attention Module (SAM), making the identified vessels more complete and accurate. Specifically, the CBAM module is integrated into each convolutional layer to enhance the feature representation of both channel and space. We evaluate CBAM-UNet on the X-ray Angiography Coronary Artery Disease (XCAD) dataset. The experimental results indicate that the improved model's accuracy increases to 97% and its precision reaches 89.95%. The improved model's overall performance surpasses traditional methods on various metrics, with significant improvements in resisting background noise interference and handling complex vascular structures.
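A sketch of the integration pattern described above: a conv-BN-ReLU layer followed by channel then spatial attention, mirroring the standard CBAM design (the UNet wiring around it is omitted):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, ch, r=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Conv2d(ch, ch // r, 1),
                                 nn.ReLU(inplace=True),
                                 nn.Conv2d(ch // r, ch, 1))
    def forward(self, x):  # shared MLP over avg- and max-pooled descriptors
        return torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True)) +
                             self.mlp(x.amax((2, 3), keepdim=True)))

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)
    def forward(self, x):  # pool across channels, then a 7x7 conv
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return torch.sigmoid(self.conv(s))

class CBAMConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                                  nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.ca, self.sa = ChannelAttention(out_ch), SpatialAttention()
    def forward(self, x):
        x = self.conv(x)
        x = x * self.ca(x)     # CAM: re-weight channels
        return x * self.sa(x)  # SAM: re-weight spatial locations
```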
This paper proposes a robust and efficient model for multi-label abdominal organ segmentation with a substantially reduced number of parameters. The model focuses on the effectiveness of edge guidance in segmentation and leverages a 3D-Unet architecture with deep supervision, incorporating the robust deep thinking gate (DTG) architecture. Our DTG-incorporated model excels in both efficiency and effectiveness, demonstrating notable enhancements in multi-label abdominal organ segmentation performance. The model was comprehensively evaluated on the BTCV and FLARE 2022 scan datasets, comparing its performance against state-of-the-art counterparts. The outcomes revealed that the proposed model achieved the highest Dice scores for the esophagus (0.795), gallbladder (0.945), and pancreas (0.87) while maintaining a significantly reduced parameter count (13.3 million parameters). This achievement underscores the model's efficiency and its suitability for seamless integration into real-world applications, offering promising prospects for enhanced medical image analysis.
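A sketch of the deep supervision scheme the model above relies on: auxiliary 1x1x1 heads at intermediate decoder scales, upsampled to full resolution and added to the loss with decaying weights. The DTG module itself is not reproduced here, and all shapes are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def deep_supervision_loss(decoder_feats, heads, target):
    # decoder_feats: list of (N, C_i, D_i, H_i, W_i) tensors, fine to coarse;
    # heads[i]: nn.Conv3d(C_i, n_classes, 1); target: (N, D, H, W) int labels.
    loss, weight = 0.0, 1.0
    for feat, head in zip(decoder_feats, heads):
        logits = F.interpolate(head(feat), size=target.shape[1:],
                               mode='trilinear', align_corners=False)
        loss = loss + weight * F.cross_entropy(logits, target)
        weight *= 0.5                     # halve the weight per coarser scale
    return loss

# Example with two decoder scales and 3 classes (shapes illustrative).
feats = [torch.randn(1, 32, 16, 32, 32), torch.randn(1, 64, 8, 16, 16)]
heads = nn.ModuleList([nn.Conv3d(32, 3, 1), nn.Conv3d(64, 3, 1)])
y = torch.randint(0, 3, (1, 16, 32, 32))
print(deep_supervision_loss(feats, heads, y))
```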