Typically, deep learning models for image segmentation tasks are trained using large datasets of images annotated at the pixel level, which can be expensive and highly time-consuming. A way to reduce the amount of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, concretely Generative Adversarial Networks (GANs), have been adapted to semi-supervised training of segmentation tasks. This work proposes MaskGDM, a deep learning architecture combining some ideas from EditGAN, a GAN that jointly models images and their segmentations, together with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN performance results in multiple segmentation datasets, both multi-class and with binary labels. According to the quantitative results obtained, the proposed model improves multi-class image segmentation when compared to the EditGAN and DatasetGAN models, respectively, by 4.5% and 5.0%. Moreover, using the ISIC dataset, our proposal improves the results from other models by up to 11% for the binary image segmentation approach.
Almost all existing approaches for community detection use only network topology information and completely ignore background information about the network. However, in many real-world applications, some prior information is available that could be useful in detecting community structure. Specifically, the true community assignments of certain nodes are known in advance. In this paper, a novel semi-supervised community detection approach based on label propagation is proposed, which uses this prior information to guide the discovery of community structure. Our algorithm propagates the labels from the labeled nodes to all other nodes in the network. Evaluation on several artificial and real-world networks shows that the algorithm is highly effective in recovering communities.
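The idea of propagating known community labels from a few nodes to the rest of a network can be illustrated with a minimal Python sketch. It is not the update rule from the paper, which the abstract does not specify: the majority-vote rule, the tie-breaking, and the names `propagate_labels`, `adjacency`, and `seeds` are all assumptions made for illustration.

```python
from collections import Counter


def propagate_labels(adjacency, seeds, max_iters=100):
    """Spread known community labels from seed nodes to the rest of a graph.

    adjacency: dict mapping node -> list of neighbour nodes (undirected graph).
    seeds: dict mapping node -> known community label (the prior information).
    Nodes and labels must be sortable. Returns a dict node -> label for every
    node that a label could reach.
    """
    labels = dict(seeds)
    for _ in range(max_iters):
        changed = False
        for node in sorted(adjacency):
            if node in seeds:
                continue  # prior information is never overwritten
            # Count the labels currently held by this node's neighbours.
            votes = Counter(labels[n] for n in adjacency[node] if n in labels)
            if not votes:
                continue  # no labelled neighbour yet
            top = max(votes.values())
            candidates = sorted(l for l, c in votes.items() if c == top)
            # On a tie, keep the current label if it is among the winners
            # (avoids oscillation); otherwise take the smallest label.
            new = labels.get(node)
            if new not in candidates:
                new = candidates[0]
            if labels.get(node) != new:
                labels[node] = new
                changed = True
        if not changed:
            break  # converged: a full pass changed nothing
    return labels
```

On a toy graph of two triangles joined by a single edge, with two labeled nodes in each triangle, this sketch assigns each remaining node the label of its own triangle.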
In this paper, we propose two Laplacian nonparallel hyperplane proximal classifiers (LapNPPCs), for semi-supervised and fully supervised classification problems respectively, by adding manifold regularization terms. Thanks to the manifold regularization terms, our LapNPPCs are able to exploit the intrinsic structure of the patterns in the training set. Furthermore, our classifiers only need to solve two systems of linear equations rather than the two quadratic programming (QP) problems required by the Laplacian twin support vector machine (LapTSVM) (Z. Qi, Y. Tian and Y. Shi, Neural Netw. 35 (2012) 46–53). Numerical experiments on toy and UCI benchmark datasets show that the accuracy of our LapNPPCs is comparable with that of other classifiers, such as the standard SVM, TWSVM and LapTSVM. Moreover, based on our LapNPPCs, other TWSVM-type classifiers with manifold regularization can be constructed by choosing different norms and loss functions to deal with semi-supervised binary and multi-class classification problems.
The number of jellyfish outbreaks is on the rise around the world, and they are considered a serious ecological disaster. As part of the emergency response plan for jellyfish disasters, in-situ detection research that can distinguish jellyfish species and quantities is urgently required to support accurate data collection. Counting is typically treated as a fully supervised regression task, and conventional counting methods usually require a large amount of labeled data. This paper presents a novel deep-learning-based adaptation strategy that instead treats counting as a semi-supervised few-shot regression task. The method combines the test image with several example objects taken from that image and takes advantage of the strong similarity between the test image and the example objects it contains. Effective counting can be achieved without training on the target object. The objective at test time is to predict the density map of the objects of interest in the test image. The method has been shown to be more robust than the detect-first, count-later approach, and its accuracy can exceed 95%.
In this paper, we propose a clustering-based semi-supervised cross-modal retrieval method to relieve the problem of insufficient annotation in cross-modal datasets. First, we reconstruct cross-modal data as a scene graph structure to filter out meaningless information. Second, we extract embedding representations of images and texts and project them into a common space. Finally, we propose a clustering-based classification method with a modality-independent constraint to discriminate samples. Experimental results on three widely used cross-modal datasets show a significant improvement in accuracy over state-of-the-art methods.
This paper proposes an improved image classification model to reduce the cost of model construction. To address the problem that network training usually requires a large number of labeled samples, an image classification model based on semi-supervised deep learning is proposed, which uses labeled samples to guide the network in learning from unlabeled samples. A convolutional neural network model for simultaneous processing of labeled and unlabeled data is constructed. The labeled data are used to train the Softmax classifier and to provide the initial K-means cluster centers for the unlabeled data. A nonsubsampled contourlet layer replaces the first convolutional layer of the fully convolutional neural network to extract multi-scale deep features, yielding the nonsubsampled contourlet fully convolutional neural network. The network can extract multi-scale information from the images to be classified and extract more discriminative deep image features. In addition, the parameters of the nonsubsampled contourlet layer are preset and do not require training. On polarimetric SAR images, the proposed nonsubsampled contourlet fully convolutional neural network achieves higher classification accuracy than the comparison methods.
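The step in which labeled data supply the initial K-means cluster centers for the unlabeled data can be sketched as follows. This is a minimal illustration on plain feature vectors, not the paper's network: the CNN and contourlet feature extraction are omitted, and the function names `mean_point` and `seeded_kmeans` and the use of class means as initial centers are assumptions.

```python
import math


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def seeded_kmeans(labeled, unlabeled, n_iters=20):
    """K-means on unlabeled vectors, initialised from labeled class means.

    labeled: list of (vector, class_label) pairs.
    unlabeled: list of vectors.
    Returns (pseudo_labels, centers), where centers maps class_label -> vector.
    """
    classes = sorted({label for _, label in labeled})
    # One initial center per class: the mean of that class's labeled vectors.
    centers = {c: mean_point([v for v, l in labeled if l == c]) for c in classes}
    pseudo = []
    for _ in range(n_iters):
        # Assignment step: nearest center by Euclidean distance.
        pseudo = [min(classes, key=lambda c: math.dist(v, centers[c]))
                  for v in unlabeled]
        # Update step: recompute each center from its assigned points,
        # keeping the old center if no point was assigned to it.
        updated = {}
        for c in classes:
            members = [v for v, p in zip(unlabeled, pseudo) if p == c]
            updated[c] = mean_point(members) if members else centers[c]
        if updated == centers:
            break  # converged: centers stopped moving
        centers = updated
    return pseudo, centers
```

Because each cluster starts at a class mean, the cluster index carries a class label from the outset, so the resulting assignments can be read directly as pseudo-labels for the unlabeled samples.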
Intrusion detection is one of the most important technologies today. Researchers have developed various intrusion detection systems using machine learning techniques. However, the robustness of these models is affected by two factors: first, network traffic is highly imbalanced across categories; second, the training set and the test set are not identically distributed in feature space. This paper presents a multi-level intrusion detection model that combines an artificial neural network (ANN) with a semi-supervised hierarchical k-means method (HSK-means). Because accurate training of the ANN reduces the intrusion detection error rate, the Grasshopper Optimization Algorithm (GOA), which is analysed in this paper, is used to train it. The main objective of the proposed system is to reduce the error rate of the intrusion detection system by using GOA to select important and useful parameters, namely the weights and biases. A cluster-based method is used in the pattern discovery module to find unknown patterns. Here, each test sample is treated as either an unlabelled unknown pattern or a known pattern. The performance of the proposed approach is evaluated on the KDDCUP99 dataset. The experimental findings show that the proposed GOA-based semi-supervised learning approach outperforms previously existing intrusion detection systems in terms of sensitivity, specificity and overall accuracy.
Sleep staging with supervised learning requires a large amount of labeled data, which are time-consuming and expensive to collect. Semi-supervised learning is widely used to improve classification performance by combining a small amount of labeled data with a large amount of unlabeled data. The accuracy of pseudo-labels in semi-supervised learning may influence the performance of the classifier. Based on semi-supervised sparse representation classification, this study proposed an improved sparse concentration index that estimates the confidence of pseudo-labeled data for sleep EEG recognition by considering both interclass differences and intraclass concentration. To address class imbalance in sleep EEG data, the synthetic minority oversampling technique was also improved to remove mixed samples at the boundary between minority and majority classes. The results showed that the proposed method achieved better classification performance, and that classification accuracy after class balancing was clearly higher than before class balancing. The findings of this study will be beneficial for applications in sleep monitoring devices and sleep-related diseases.
Traditional supervised dimensionality reduction methods can usually establish a good model only when a large number of samples is available. However, in real-world applications where labeled data are scarce, traditional methods tend to perform poorly because of overfitting. In such cases, unlabeled samples can be useful in improving performance. In this paper, we propose a semi-supervised dimensionality reduction method based on partial least squares (PLS), which we call semi-supervised partial least squares (S2PLS). To combine the labeled and unlabeled samples into an S2PLS model, we first apply the PLS algorithm to unsupervised dimensionality reduction. Then, the final S2PLS model is established by ensembling the supervised PLS model and the unsupervised PLS model, using the basic idea of the principal model analysis (PMA) method. Compared with unsupervised or supervised dimensionality reduction algorithms, S2PLS not only improves the prediction accuracy on the samples but also enhances the generalization ability of the model. Moreover, it can obtain better results even when there are only a few or no labeled samples. Experimental results on five UCI data sets confirm these properties of the S2PLS algorithm.
Segmentation of natural textures has been investigated by developing a novel semi-supervised support vector machine (S3VM) algorithm with multiple constraints. Unlike conventional segmentation algorithms, the proposed method does not classify the textures but instead classifies the uniform-texture regions and the boundary regions. Also, the overall algorithm does not use any training set of the kind used by all other learning algorithms such as conventional SVMs. During the process, the images are first restored from high-spatial-frequency noise. Then, statistics of various orders are measured for the textures within a sliding two-dimensional window. The K-means algorithm is used to initialise the clustering procedure by labelling part of the class members and setting the classifier parameters. Therefore, at this stage both the training and the working sets are available. A non-linear S3VM is then developed to exploit both sets to classify all the regions. The convex algorithm maximises a defined cost function by incorporating a number of constraints. The algorithm has been applied to combinations of a number of natural textures. It is demonstrated that the algorithm is robust, with negligible misclassification error. However, for complex textures there may be a minor misplacement of the edges.
The lack of diversity in genomic datasets, currently skewed towards individuals of European ancestry, presents a challenge in developing inclusive biomedical models. The scarcity of such data is particularly evident in labeled datasets that include genomic data linked to electronic health records. To address this gap, this paper presents PopGenAdapt, a genotype-to-phenotype prediction model which adopts semi-supervised domain adaptation (SSDA) techniques originally proposed for computer vision. PopGenAdapt is designed to leverage the substantial labeled data available from individuals of European ancestry, as well as the limited labeled and the larger amount of unlabeled data from currently underrepresented populations. The method is evaluated in underrepresented populations from Nigeria, Sri Lanka, and Hawaii for the prediction of several disease outcomes. The results suggest a significant improvement in the performance of genotype-to-phenotype models for these populations over state-of-the-art supervised learning methods, setting SSDA as a promising strategy for creating more inclusive machine learning models in biomedical research.
Our code is available at https://github.com/AI-sandbox/PopGenAdapt.
Cancer is a complex collection of diseases that are to some degree unique to each patient. Precision oncology aims to identify the best drug treatment regime using molecular data on tumor samples. While omics-level data is becoming more widely available for tumor specimens, the datasets upon which computational learning methods can be trained vary in coverage from sample to sample and from data type to data type. Methods that can ‘connect the dots’ to leverage more of the information provided by these studies could offer major advantages for maximizing predictive potential. We introduce a multi-view machine learning strategy called PLATYPUS that builds ‘views’ from multiple data sources that are all used as features for predicting patient outcomes. We show that a learning strategy that finds agreement across the views on unlabeled data increases the performance of the learning methods over any single view. We illustrate the power of the approach by deriving signatures for drug sensitivity in a large cancer cell line database. Code and additional information are available from the PLATYPUS website https://sysbiowiki.soe.ucsc.edu/platypus.
Compared to the dimension of face image samples, the number of face images is relatively small. The face recognition problem is therefore essentially a small-sample learning problem. To address the small-sample problem, in this paper we propose a self-training method for large margin neighbor. The margin is used to represent decision confidence, and spatial adjacency is used to represent the gradual change of the face manifold. Through self-training iterations, samples of the same class are kept as close together as possible, while samples of different classes are kept separated by a large distance. Within the neighborhood, unlabeled samples with high credibility are labeled repeatedly as the iterations proceed. Experiments show that, compared to other methods, self-training for large margin neighbor achieves relatively better recognition on small face sample sets.
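The general pattern of self-training with a margin as the confidence score can be illustrated with a small Python sketch. It is a simplified stand-in, not the method of the paper: no distance metric is learned, the margin is assumed here to be the gap between the distances to the nearest labeled sample of the closest class and of the runner-up class, and the names `self_train` and `min_margin` are hypothetical.

```python
import math


def self_train(labeled, unlabeled, min_margin=0.0):
    """Self-training with a nearest-neighbour margin as the confidence score.

    labeled: list of (vector, label) pairs with at least two distinct labels.
    unlabeled: list of vectors.
    Returns the enlarged labeled list and the vectors left unlabeled.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        best = None  # (margin, index into remaining, predicted label)
        for i, v in enumerate(remaining):
            # Distance from v to the closest labeled sample of each class.
            nearest = {}
            for u, y in labeled:
                d = math.dist(v, u)
                if y not in nearest or d < nearest[y]:
                    nearest[y] = d
            ranked = sorted(nearest.items(), key=lambda item: item[1])
            # Margin: gap between the closest class and the runner-up.
            margin = ranked[1][1] - ranked[0][1]
            if best is None or margin > best[0]:
                best = (margin, i, ranked[0][0])
        if best[0] <= min_margin:
            break  # nothing left that is confident enough
        _, i, label = best
        # Accept only the single most confident sample, then re-score the
        # rest against the enlarged labeled set.
        labeled.append((remaining.pop(i), label))
    return labeled, remaining
```

Accepting one high-confidence sample per iteration lets labels spread gradually through neighboring samples: a point that is ambiguous at the start can become confidently labeled once nearer points have been accepted.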