
  • Article (No Access)

    Combining Language Modeling and LSA on Greek Song “Words” for Mood Classification

    This work presents a novel approach to song mood classification. Two language models, one absolute and one relative, are evaluated. Two distinct audio feature sets are compared against each other, and the significance of including text stylistic features is established. Furthermore, Latent Semantic Analysis (LSA) is innovatively combined with language modeling, demonstrating the discriminative power of the latter. Finally, song “words” are defined in a broader sense that includes lyrics words as well as audio words, and LSA is applied to this augmented vocabulary with highly promising results. The methodology is applied to Greek songs, which are classified into one of four valence and one of four arousal categories.
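    The core LSA step in this abstract can be sketched as a truncated SVD over a term-document matrix whose rows mix lyrics words and audio words. The matrix below and its counts are invented for illustration; only the mechanism (SVD, then similarity in the latent space) reflects the abstract.

    ```python
    import numpy as np

    # Hypothetical toy term-document matrix: rows are song "words"
    # (lyrics terms and quantized audio words), columns are songs.
    # All counts are invented for illustration.
    X = np.array([
        [3.0, 0.0, 1.0],   # lyrics word "love"
        [0.0, 2.0, 0.0],   # lyrics word "rain"
        [2.0, 0.0, 2.0],   # audio word a17 (e.g. a timbre cluster)
        [0.0, 3.0, 1.0],   # audio word a42
    ])

    # LSA: truncated SVD keeps the k strongest latent dimensions.
    k = 2
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    song_vectors = (np.diag(s[:k]) @ Vt[:k]).T   # one k-dim vector per song

    # Mood classification can then compare songs by cosine similarity
    # in the latent space instead of by raw word overlap.
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    sim_12 = cos(song_vectors[0], song_vectors[1])  # songs sharing no raw words
    sim_13 = cos(song_vectors[0], song_vectors[2])  # songs sharing lyrics and audio words
    ```

    With this toy data, songs 1 and 3 share both lyrics and audio words and end up close in the latent space, while songs 1 and 2 remain nearly orthogonal.
    
    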

  • Article (Free Access)

    Optimized Feature Selection Approach with Elicit Conditional Generative Adversarial Network Based Class Balancing Approach for Multimodal Sentiment Analysis in Car Reviews

    Multimodal Sentiment Analysis (MSA) is a growing area of affective computing that involves analyzing data from three different modalities. Gathering data for Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) is challenging due to data imbalance across modalities. To address this, an effective data augmentation approach is proposed that combines dynamic synthetic minority oversampling with a multimodal elicit conditional generative adversarial network for emotion recognition using audio, text, and visual data. The balanced data are then fed into a granular elastic-net regression with a hybrid feature selection method based on dandelion Fick's-law optimization to analyze sentiments. The selected features are input into a multilabel wavelet convolutional neural network to classify emotional states accurately. The proposed approach, implemented in Python, outperforms existing methods in terms of trustworthiness (0.695), arousal (0.723), and valence (0.6245) on the car review dataset. Additionally, the feature selection method achieves high accuracy (99.65%), recall (99.45%), and precision (99.66%). These results demonstrate the effectiveness of the proposed MSA approach, even with three modalities of data.
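    The class-balancing idea in this abstract rests on synthetic minority oversampling. The paper's method additionally uses a conditional GAN; the sketch below shows only the SMOTE-style interpolation half, with invented feature vectors and a hypothetical helper name.

    ```python
    import random

    def smote_like_oversample(minority, n_new, rng=random.Random(0)):
        """Generate synthetic minority samples by interpolating between
        random pairs of real minority feature vectors (SMOTE-style).
        This is only a sketch of the oversampling idea; the paper's full
        approach also involves a conditional GAN."""
        synthetic = []
        for _ in range(n_new):
            a, b = rng.sample(minority, 2)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([x + lam * (y - x) for x, y in zip(a, b)])
        return synthetic

    # Toy imbalanced class (invented 2-D feature vectors): only two
    # "anger" samples versus, say, six samples of every other emotion.
    anger = [[0.9, 0.1], [0.8, 0.3]]
    new_samples = smote_like_oversample(anger, n_new=4)
    balanced = anger + new_samples   # now six "anger" samples
    ```

    Each synthetic sample lies on the line segment between two real samples, so the oversampled class stays inside the region the real minority data occupies.
    
    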

  • Chapter (No Access)

    Exploration into Curriculum Teaching Reform of Digital Audio Production

    The digital audio production course in the modern college curriculum is mainly oriented toward students majoring in digital media and animation. However, the existing teaching pattern, dominated by movie and TV appreciation, is outdated and ill-suited to the full-media teaching development of the new era. In this study of the auditory language of movies, we probed into the expressive modes and patterns of auditory language under modern production techniques and conditions, discussed the important influence of auditory language on the overall effect of a movie, and explored new teaching patterns for digital audio production. Appreciation from the aspects of human voice, sound, and music was taught using relevant teaching cases as examples. Advances in modern production technology have brought new techniques, methods, and patterns for the production and innovation of auditory language. A novel teaching pattern of “movie and TV teaching, case production,” progressing from theoretical understanding to practical application, was then established and used to uncover the relationships among the production techniques, methods, and effects of auditory language needed by any successful movie. This new teaching pattern offers theoretical guidance for the teaching practice of digital audio production.

  • Chapter (No Access)

    A NEW CAPABILITY DESCRIPTION FOR AUDIO INFORMATION HIDING

    At present, capacity is the prevailing paradigm for covert channels, and most current research is based on Shannon's channel theory. With respect to steganography, however, capacity is insufficient. Ira S. Moskowitz et al. proposed the “capability” paradigm in [1], but they did not give a feasible algorithm for describing capability. In this paper, a new capability description for audio information hiding is proposed, and a feasible mathematical method is given to determine how much information can be hidden. The proposed capability description can be used to provide a prediction framework for steganographic communication in our speech steganography system.
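    The capacity-versus-capability distinction the abstract draws can be illustrated with a toy LSB-embedding calculation. Everything here is invented for illustration: the paper's capability measure is not specified in the abstract, and the 10% detectability budget below is a made-up stand-in for whatever constraint a real steganalysis model would impose.

    ```python
    def lsb_capacity_bits(n_samples, bits_per_sample=1):
        """Naive Shannon-style capacity of an LSB covert channel:
        a fixed number of hidden bits per audio sample."""
        return n_samples * bits_per_sample

    def usable_capability_bits(n_samples, bits_per_sample=1,
                               detectability_budget=0.1):
        """Toy 'capability' estimate: only a fraction of samples may be
        modified before a steganalyst's detector fires. The 10% budget
        is an invented illustration, not the paper's model."""
        return int(n_samples * detectability_budget) * bits_per_sample

    # One second of 8 kHz speech:
    cap = lsb_capacity_bits(8000)          # raw capacity in bits
    usable = usable_capability_bits(8000)  # far fewer bits once detectability matters
    ```

    The point of the contrast is exactly the abstract's: raw capacity overstates how much can safely be hidden, so a capability-style measure must fold in detectability.
    
    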