
  • Article (No Access)

    Comparison of Convolutional Neural Network for Classifying Lung Diseases from Chest CT Images

    This paper proposes a convolutional neural network for diagnosing various lung illnesses from chest CT images, based on a customized Medical Image Analysis and Detection network (MIDNet-18). With a simplified model-building process, minimal complexity, and high accuracy, the MIDNet-18 CNN architecture classifies both binary and multiclass medical images. The architecture comprises 14 convolutional layers, 7 pooling layers, 4 dense layers, and 1 classification layer. The medical image classification process involves training, validating, and testing the MIDNet-18 model. In the binary-class lung CT dataset, 2214 training images, 1800 validation images, and 831 test images are used to classify COVID-19 versus normal lung images. In the multiclass dataset, 6720 training images belonging to 3 classes, 3360 validation images, and 601 test images are used to classify COVID-19, cancer, and normal images. The independent sample size calculated for binary classification is 26 samples per group; similarly, a sample size of 10 is calculated for the multiclass classification, keeping statistical power (G*Power) at 80%. To validate the performance of the MIDNet-18 architecture, the two datasets are also classified with existing models: LeNet-5, VGG-16, VGG-19, and ResNet-50. In multiclass classification, MIDNet-18 gives better training and test accuracy, while LeNet-5 obtained 92.6% and 95.9%, respectively; VGG-16, 89.3% and 77.2%; VGG-19, 85.8% and 85.4%; and ResNet-50, 90.6% and 99%. For binary classification, MIDNet-18 again gives better training and test accuracy, while LeNet-5 obtained 52.3% and 54.3%, respectively; VGG-16, 50.5% and 45.6%; VGG-19, 50.6% and 45.6%; and ResNet-50, 96.1% and 98.4%.
    The classified images are further processed with a Detectron2 model, which identifies abnormalities (cancer, COVID-19) with 99% accuracy. MIDNet-18 is significantly more accurate than the LeNet-5, VGG-19, and VGG-16 algorithms and marginally better than the ResNet-50 algorithm on the given binary lung dataset (one-way ANOVA with Bonferroni pairwise comparison of MIDNet-18, LeNet-5, VGG-19, VGG-16, and ResNet-50; p>0.05). The proposed MIDNet-18 model is also significantly more accurate than the LeNet-5, VGG-19, VGG-16, and ResNet-50 algorithms in classifying the diseases in the given multiclass lung dataset (one-way ANOVA with Bonferroni pairwise comparison of MIDNet-18, LeNet-5, VGG-19, VGG-16, and ResNet-50; p>0.05).
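    The abstract specifies only the layer counts of MIDNet-18 (14 convolutional, 7 pooling, 4 dense, 1 classification), not kernel sizes, strides, or channel widths. A minimal sketch of how such a stack could be arranged, assuming seven (conv, conv, pool) blocks with 3×3 stride-1 padded convolutions and 2×2 stride-2 pooling — all illustrative assumptions, not the paper's actual configuration — and tracking the resulting feature-map side length:

    ```python
    # Hypothetical MIDNet-18-style layer stack: 7 blocks of (conv, conv, pool)
    # account for the stated 14 conv + 7 pool layers. Kernel sizes, strides,
    # padding, and the 224x224 input size are assumptions for illustration.

    def conv_out(size, kernel=3, stride=1, pad=1):
        """Spatial side length after a convolution layer."""
        return (size + 2 * pad - kernel) // stride + 1

    def pool_out(size, kernel=2, stride=2):
        """Spatial side length after a pooling layer."""
        return (size - kernel) // stride + 1

    def midnet18_spatial_trace(input_size=224):
        """Feature-map side length after each (conv, conv, pool) block."""
        size = input_size
        trace = [size]
        for _ in range(7):
            size = conv_out(size)   # 3x3 conv, stride 1, pad 1: size unchanged
            size = conv_out(size)
            size = pool_out(size)   # 2x2 pool, stride 2: size roughly halves
            trace.append(size)
        return trace

    print(midnet18_spatial_trace())  # → [224, 112, 56, 28, 14, 7, 3, 1]
    ```

    The final 1×1 map would then feed the 4 dense layers and the classification layer.
    
    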

  • Article (No Access)

    Multi-Modal Fusion Sign Language Recognition Based on Residual Network and Attention Mechanism

    Sign language recognition (SLR) is a useful tool for deaf-mute people to communicate with the outside world. Although many SLR methods have been proposed and demonstrate good performance, continuous SLR (CSLR) remains challenging. Meanwhile, heavy occlusions and closely interacting motions place a higher demand on the real-time efficiency of CSLR, so its performance needs further improvement. The highlights include: (1) to overcome these challenges, this paper proposes a novel video-based CSLR framework consisting of three components: an OpenPose-based skeleton-stream extraction module, an RGB-stream extraction module, and a combination of a BiLSTM network and a conditional hidden Markov model (CHMM) for CSLR; (2) a new residual network with Squeeze-and-Excitation blocks (SEResNet50) is introduced for video-sequence feature extraction; (3) the SEResNet50 module is combined with the BiLSTM network to extract feature information from video streams of different modalities. To evaluate the effectiveness of the proposed framework, experiments are conducted on two CSL datasets. The experimental results indicate that the method is superior to existing methods in the literature.
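    The SEResNet50 backbone mentioned in highlight (2) augments residual blocks with Squeeze-and-Excitation channel recalibration. A minimal NumPy sketch of the SE operation itself — squeeze by global average pooling, excite through a bottleneck of two fully connected layers, then rescale each channel — with the reduction ratio and weight shapes as assumptions (the paper does not give these details):

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def se_block(feature_map, w1, w2):
        """Squeeze-and-Excitation channel recalibration (illustrative sketch).

        feature_map: (C, H, W) activation tensor
        w1: (C, C//r) squeeze FC weights, w2: (C//r, C) excite FC weights,
        where r is an assumed reduction ratio.
        """
        # Squeeze: global average pooling over spatial dims -> (C,)
        z = feature_map.mean(axis=(1, 2))
        # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid gate in (0, 1)
        s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)
        # Scale: reweight each channel by its learned importance
        return feature_map * s[:, None, None]

    # Usage: 64 channels, reduction ratio r = 16 (assumed values)
    rng = np.random.default_rng(0)
    x = rng.standard_normal((64, 8, 8))
    w1 = rng.standard_normal((64, 4)) * 0.1
    w2 = rng.standard_normal((4, 64)) * 0.1
    y = se_block(x, w1, w2)
    assert y.shape == x.shape
    ```

    In a SEResNet50-style block this gating would be applied to the residual branch output before the skip connection is added.
    
    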