This paper proposes a convolutional neural network for diagnosing lung illnesses from chest CT images, based on a customized Medical Image Analysis and Detection network (MIDNet-18). With a simplified model-building process, low complexity, and high accuracy, the MIDNet-18 CNN architecture classifies both binary and multiclass medical images. The MIDNet-18 architecture comprises 14 convolutional layers, 7 pooling layers, 4 dense layers, and 1 classification layer. The medical image classification process involves training, validating, and testing the MIDNet-18 model. For the binary lung CT dataset, 2214 images are used for training, 1800 for validation, and 831 for testing to classify COVID and normal lung images. For the multiclass dataset, 6720 training images belonging to 3 classes, 3360 validation images, and 601 test images are used to classify COVID, cancer, and normal images. The independent sample size calculated for binary classification is 26 samples per group; similarly, a sample size of 10 per group is calculated for multiclass classification, keeping the statistical power (GPower) at 80%. To validate the performance of the MIDNet-18 CNN architecture, the medical images of the two datasets are also classified with existing models: LeNet-5, VGG-16, VGG-19, and ResNet-50. In multiclass classification, the MIDNet-18 architecture gives better training and test accuracy, whereas LeNet-5 obtained 92.6% and 95.9%, respectively; VGG-16, 89.3% and 77.2%; VGG-19, 85.8% and 85.4%; and ResNet-50, 90.6% and 99%. In binary classification, the MIDNet-18 architecture likewise gives better training and test accuracy, whereas LeNet-5 obtained 52.3% and 54.3%, respectively; VGG-16, 50.5% and 45.6%; VGG-19, 50.6% and 45.6%; and ResNet-50, 96.1% and 98.4%. The classified images are further processed with a Detectron2 model, and the results identify abnormalities (cancer, COVID-19) with 99% accuracy. On the given binary lung dataset, MIDNet-18 is significantly more accurate than the LeNet-5, VGG-19, and VGG-16 algorithms and marginally better than the ResNet-50 algorithm (one-way ANOVA with Bonferroni pairwise comparison of MIDNet-18, LeNet-5, VGG-19, VGG-16, and ResNet-50; p > 0.05). On the given multiclass lung dataset, the proposed MIDNet-18 model is significantly more accurate than the LeNet-5, VGG-19, VGG-16, and ResNet-50 algorithms in classifying the diseases (one-way ANOVA with Bonferroni pairwise comparison of MIDNet-18, LeNet-5, VGG-19, VGG-16, and ResNet-50; p > 0.05).
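To make the stated layer budget concrete, below is a minimal PyTorch sketch of a network with 14 convolutional, 7 pooling, 4 dense, and 1 classification layer, matching the MIDNet-18 description above. The filter widths, kernel sizes, block grouping, and the names `MIDNet18Sketch` and `conv_block` are assumptions for illustration, not the authors' published configuration.

```python
# Hypothetical sketch of a MIDNet-18-style CNN.
# The abstract specifies 14 convolutional, 7 pooling, 4 dense and 1 classification
# layer; filter counts, kernel sizes and layer ordering below are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two convolutions followed by one max-pooling layer
    # (7 such blocks give 14 conv + 7 pool layers).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )

class MIDNet18Sketch(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        channels = [3, 32, 64, 64, 128, 128, 256, 256]  # assumed widths
        self.features = nn.Sequential(
            *[conv_block(channels[i], channels[i + 1]) for i in range(7)]
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dense = nn.Sequential(                     # 4 dense layers
            nn.Linear(256, 512), nn.ReLU(inplace=True),
            nn.Linear(512, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 64), nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(64, num_classes)    # 1 classification layer

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(self.dense(x))

# Example: binary COVID vs. normal classification on 224x224 CT slices.
model = MIDNet18Sketch(num_classes=2)
logits = model(torch.randn(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 2])
```

Setting `num_classes=3` would cover the multiclass (COVID, cancer, normal) setting described above; the training, validation, and test splits themselves are as reported in the abstract.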
Sign language recognition (SLR) is a useful tool for deaf-mute people to communicate with the outside world. Although many SLR methods have been proposed and have demonstrated good performance, continuous SLR (CSLR) remains challenging. Moreover, because of heavy occlusions and closely interacting motions, CSLR also faces stringent real-time efficiency requirements, so its performance needs further improvement. The highlights are: (1) to overcome these challenges, this paper proposes a novel video-based CSLR framework consisting of three components: an OpenPose-based skeleton stream extraction module, an RGB stream extraction module, and a combination of a BiLSTM network and a conditional hidden Markov model (CHMM) for CSLR; (2) a new residual network with Squeeze-and-Excitation blocks (SEResNet50) for video sequence feature extraction; (3) the SEResNet50 module is combined with the BiLSTM network to extract feature information from video streams of different modalities. To evaluate the effectiveness of the proposed framework, experiments are conducted on two CSL datasets. The experimental results indicate that the method is superior to existing methods in the literature.
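The following is a minimal PyTorch sketch of the idea behind combining an SE-augmented CNN backbone with a BiLSTM for per-frame video features. It is a simplified stand-in, not the authors' network: in a full SEResNet50 the SE modules sit inside each residual block, whereas here a single SE block after a plain ResNet-50 trunk only illustrates the mechanism, and the OpenPose skeleton stream and CHMM decoding stage are omitted. The names `SEBlock` and `FrameToSequenceModel` and all sizes are assumptions.

```python
# Sketch: SE-recalibrated CNN features per frame, then a BiLSTM over the frame axis.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: channel-wise recalibration of feature maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))             # squeeze: global average pool
        return x * w.unsqueeze(-1).unsqueeze(-1)    # excite: rescale channels

class FrameToSequenceModel(nn.Module):
    def __init__(self, num_glosses, hidden=256):
        super().__init__()
        backbone = resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # conv trunk
        self.se = SEBlock(2048)                     # SE recalibration (illustrative placement)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.bilstm = nn.LSTM(2048, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_glosses)

    def forward(self, frames):                      # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        x = self.cnn(frames.flatten(0, 1))          # per-frame CNN features
        x = self.pool(self.se(x)).flatten(1)        # (B*T, 2048)
        x = x.view(b, t, -1)                        # back to a frame sequence
        seq, _ = self.bilstm(x)                     # temporal context in both directions
        return self.head(seq)                       # per-frame gloss scores (B, T, num_glosses)

# Example: a batch of 2 clips, 16 frames each, 112x112 RGB.
model = FrameToSequenceModel(num_glosses=100)
scores = model(torch.randn(2, 16, 3, 112, 112))
print(scores.shape)  # torch.Size([2, 16, 100])
```

In the framework described above, per-frame scores of this kind from the RGB and skeleton streams would then be decoded into a gloss sequence by the CHMM, which is not shown here.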