World Scientific

SSTSA: A Self-Supervised Topic Sentiment Analysis Using Semantic Similarity Measures and Transformers

    https://doi.org/10.1142/S0219622023500736
    Cited by: 4 (Source: Crossref)

    The exponentially increasing amount of data generated by the public on social media platforms is a valuable source of information that can be used to discover topics and analyze the sentiment of comments. Some researchers have extended the Latent Dirichlet Allocation (LDA) method with a sentiment layer so that topics and their related sentiments are found simultaneously. However, most of these approaches do not achieve satisfactory accuracy in Topic Sentiment Analysis (TSA), particularly when training data is insufficient or the texts are complex, ambiguous, and short. In this paper, a novel self-supervised approach called SSTSA is proposed for TSA; it extracts the hidden topics and analyzes the overall sentiment related to each topic. SSTSA introduces a new component called the Pseudo-label Generator. This component first employs semantic similarity and Word Mover’s Distance (WMD) measures. It then uses a document embedding technique to semantically estimate the sentiment orientation of samples and generate pseudo-labels (positive or negative). Afterward, a hybrid classifier composed of a pre-trained Robustly Optimized BERT (RoBERTa) model and a Long Short-Term Memory (LSTM) model is trained on the pseudo-labeled samples to predict the sentiment of unseen data. Evaluation results on datasets from various domains demonstrate that SSTSA outperforms similar unsupervised and self-supervised methods.
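The pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word vectors, the seed words, the `margin` threshold, and the function names below are all hypothetical, and plain cosine similarity over averaged word vectors stands in for the semantic similarity, WMD, and document embedding steps that the paper combines. The sketch only shows the general shape of the technique: embed a document, compare it with positive and negative sentiment anchors, and emit a pseudo-label only when the evidence is clear enough.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A real system would load pre-trained embeddings instead.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "love": [0.85, 0.05, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-0.8, 0.2, 0.1],
    "hate": [-0.85, 0.05, 0.1],
    "battery": [0.0, 0.9, 0.2],
    "screen": [0.0, 0.8, 0.3],
}

# Assumed seed words that anchor each sentiment pole.
POSITIVE_SEEDS = ["good", "great"]
NEGATIVE_SEEDS = ["bad", "awful"]


def mean_vector(words):
    """Average the vectors of known words; return None if none are known."""
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pseudo_label(text, margin=0.05):
    """Return 'positive', 'negative', or None when the signal is too weak."""
    doc = mean_vector(text.lower().split())
    if doc is None:
        return None  # no known words, so no basis for a label
    pos_sim = cosine(doc, mean_vector(POSITIVE_SEEDS))
    neg_sim = cosine(doc, mean_vector(NEGATIVE_SEEDS))
    if abs(pos_sim - neg_sim) < margin:
        return None  # ambiguous sample: leave it unlabeled
    return "positive" if pos_sim > neg_sim else "negative"


if __name__ == "__main__":
    for sample in ["love the screen", "hate the battery", "battery screen"]:
        print(sample, "->", pseudo_label(sample))
```

Samples that receive a pseudo-label would then serve as training data for the downstream classifier, while ambiguous samples (those returning `None`) are left out. Whether SSTSA discards ambiguous samples in this way is an assumption of the sketch, not something the abstract states.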