Many problems in NLP, such as language translation and sentiment analysis, have seen substantial improvement in recent years. As simpler language problems are solved or better understood, the focus shifts to more complex problems such as semantic analysis and understanding. Unfortunately, many studies in the literature suffer from excessive specificity: the algorithms and datasets are too domain-specific. In this study, we analyze and elaborate on this notion of generality. Instead of selecting a highly specialized dataset for semantic analysis, we take a generic, and possibly dry, dataset and study how a plain vanilla Transformer performs in learning higher-level semantic patterns beyond what was obvious or expected. We tune our Transformer model on a classic language task to ensure correct performance. Once the model is tuned, the goal is to select sentences with specific keywords and study whether higher-level semantic patterns may have been learned. We believe we obtained promising results: the average BLEU score for sentences of fewer than 25 words is 39.79, and our initial qualitative analysis of possible semantic content of interest shows a 17 percent rate of finding interesting semantic patterns. We discuss data-driven results that use unexpectedness as a measure of semantic learning.
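For the length-restricted BLEU figure reported above, the following is a minimal sketch of how such a score might be computed, assuming NLTK, whitespace tokenization, and hypothetical variable names; it is not the authors' evaluation code.

```python
# Hedged sketch: average BLEU restricted to sentences of fewer than 25 words.
# Assumes NLTK and simple whitespace tokenization (illustrative choices only).
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def bleu_under_25(references, hypotheses):
    """references: reference sentences (str); hypotheses: model outputs (str)."""
    refs, hyps = [], []
    for ref, hyp in zip(references, hypotheses):
        if len(ref.split()) < 25:                      # keep only short sentences
            refs.append([ref.split()])                 # corpus_bleu expects a list of references per sentence
            hyps.append(hyp.split())
    score = corpus_bleu(refs, hyps, smoothing_function=SmoothingFunction().method1)
    return 100.0 * score                               # report on the usual 0-100 scale

# Example: bleu_under_25(["the cat sat on the mat"], ["the cat sat on a mat"])
```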
In recent years, the development of deep learning has contributed to various areas of machine learning. However, deep learning requires a huge amount of data to train a model, and data collection techniques such as web crawling can easily generate incorrect labels. If a training dataset has noisy labels, the generalization performance of deep learning decreases significantly. Some recent works have successfully divided datasets into samples with clean labels and samples with noisy labels. In light of these studies, we propose a novel data expansion framework to robustly train models on noisy labels using attention mechanisms. First, our method trains a deep learning model with a sample selection approach and saves the samples selected as clean at the end of training. The original noisy dataset is then extended with the selected samples, and the model is trained on the extended dataset again. To prevent overfitting and allow the model to learn different patterns from the selected samples, we leverage the attention mechanism to modify the representation of the selected samples. We evaluated our method with synthetic noisy labels on CIFAR-10 and CUB-200-2011 and on the real-world dataset Clothing1M. Our method obtained results comparable to baseline CNNs and state-of-the-art methods.
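A hedged sketch of the kind of sample-selection step described above, using the common small-loss criterion; the paper's exact selection rule is not specified here, and `model`, `loader`, and `keep_ratio` are illustrative assumptions.

```python
# Sketch of small-loss sample selection: samples with the lowest per-sample loss
# are treated as "clean" and can later be appended to the dataset for retraining.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_clean_indices(model, loader, keep_ratio=0.7, device="cpu"):
    """Return dataset indices whose per-sample loss is smallest."""
    model.eval()
    losses, indices = [], []
    for x, y, idx in loader:                       # loader is assumed to also yield sample indices
        logits = model(x.to(device))
        loss = F.cross_entropy(logits, y.to(device), reduction="none")
        losses.append(loss.cpu())
        indices.append(idx)
    losses = torch.cat(losses)
    indices = torch.cat(indices)
    k = int(keep_ratio * len(losses))
    keep = torch.argsort(losses)[:k]               # smallest-loss samples
    return indices[keep].tolist()
```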
In recent years, heterogeneous network/graph representation learning/embedding (HNE) has drawn tremendous attention from research communities in multiple disciplines. HNE has shown outstanding performance in various networked data analysis and mining tasks. In fact, most real-world information networks in multiple fields can be modelled as heterogeneous information networks (HINs). HNE-based techniques can therefore capture rich structural and semantic latent features from a given information network to facilitate different task-driven learning tasks. This is considered a fundamental advantage of HNE-based approaches compared with traditional homogeneous network/graph embedding techniques. However, recent studies have also demonstrated that heterogeneous network/graph modelling and embedding through graph neural networks (GNNs) is not always reliable. This challenge originates from the fact that most real-world heterogeneous networks are incomplete and normally contain a large amount of feature noise. Therefore, multiple approaches have recently been proposed to overcome this limitation. In this line of work, meta-path-based heterogeneous graph-structured latent features and GNN-based parameters are jointly learnt and optimized during the embedding process. However, this integrated GNN and heterogeneous graph structure (HGS) learning approach still faces the challenge of effectively parameterizing and fusing different graph-structured latent features from the GNN and HGS sides into a task-friendly, noise-reduced embedding space. Therefore, in this paper we propose a novel attention-supplemented heterogeneous graph structure embedding approach, called AGSE. Our AGSE model not only combines rich heterogeneous structural and GNN-based aggregated node representations but also transforms the resulting node embeddings into a noise-reduced, task-friendly embedding space. Extensive experiments on benchmark heterogeneous network datasets for the node classification task show the effectiveness of AGSE compared with state-of-the-art network embedding baselines.
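As a hedged illustration of the branch-fusion idea described above (not the AGSE implementation; the module name and tensor shapes are assumptions), a per-node attention weighting over a structure-based embedding and a GNN-based embedding might look like the following.

```python
# Illustrative per-node attention fusion of two embedding branches (PyTorch).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1, bias=False))

    def forward(self, z_hgs, z_gnn):
        """z_hgs, z_gnn: [num_nodes, dim] embeddings from the structure and GNN branches."""
        stacked = torch.stack([z_hgs, z_gnn], dim=1)          # [N, 2, dim]
        alpha = torch.softmax(self.score(stacked), dim=1)     # per-node attention over the two branches
        return (alpha * stacked).sum(dim=1)                   # fused [N, dim] node representation

# fused = AttentionFusion(64)(z_hgs, z_gnn)
```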
This article uses theoretical approaches from cognitive psychology to examine the basis for entrepreneurial alertness and to connect it to existing theories of attention in strategic management and decision-making. It thereby provides a theoretical basis for understanding how entrepreneurial alertness leads the individual to pay attention to new opportunities. A model is developed to show how attention and entrepreneurial alertness work together to support the recognition or creation of opportunities. Entrepreneurial alertness is believed to be a manifestation of differences in the schemata and cognitive frameworks that individuals use to make sense of changes in the environment. This suggests that entrepreneurial alertness mediates the impact of observed phenomena upon the situated attention of individual decision-makers.
Presently, a significant number of students enrolled in colleges and universities encounter mental health challenges, including academic stress and difficulties in interpersonal relationships. These factors contribute to a decline in mental well-being and necessitate prompt intervention and assistance. The automation of mental health identification in students greatly benefits from the implementation of intelligent emotion recognition systems. Through the analysis of students' emotional states, educational institutions can gain a deeper understanding of students' mental well-being, enabling them to promptly identify and address any issues and offer more complete care and support to students. Against this background, this paper proposes a data-driven method for automated emotion recognition. The proposed approach incorporates a feature fusion mechanism and an attention mechanism into the ResNet-34 model. This integration enhances the model's capability to analyze intricate details and subsequently improves its classification performance when applied to electroencephalography (EEG) signals. This paper introduces several key innovations. First, an optimization technique is applied to the input component of the model, enabling multifeature fusion. Second, an attention mechanism is incorporated after the residual network module, enabling the model to prioritize parts that contribute to classification and enhance feature extraction. Finally, the network parameters are optimized using both softmax loss and center loss functions. The findings from the analysis of the public emotion EEG dataset, the SJTU Emotion EEG Dataset (SEED), indicate that the proposed emotion recognition approach not only enhances the classification performance of the model on emotion EEG data but also improves the stability of the results. This paper presents a novel approach that enables automatic and efficient recognition of students' emotions on commonly used platforms. The findings of this study hold significant implications for mental health assessment and detection in real-life settings, offering substantial reference value in this domain.
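A minimal sketch of the combined softmax (cross-entropy) and center loss objective mentioned above, assuming PyTorch and a hypothetical weighting factor `lambda_c`; this is not the paper's code.

```python
# Sketch: cross-entropy plus center loss on the penultimate feature vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # squared distance between each feature vector and its class center
        return ((features - self.centers[labels]) ** 2).sum(dim=1).mean()

def total_loss(logits, features, labels, center_loss, lambda_c=0.01):
    # lambda_c is an assumed trade-off weight between the two terms
    return F.cross_entropy(logits, labels) + lambda_c * center_loss(features, labels)
```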
What makes Question & Answer (Q&A) communities productive? In this paper, we look into how the diversity of agents' behavioral types impacts the performance of Q&A communities under different performance metrics. We do this by developing an agent-based model informed by insights from previous studies on Q&A communities. By analyzing different strategies for selecting which questions to answer, we find mixtures of strategies that lead to the best outcomes under different performance conditions. In particular, Q&A communities that encourage participants to focus on answering new questions achieve the best performance in answering questions, creating long-term value, and improving the community's ability to solve difficult problems. In conclusion, we find that the current question-selection strategies on Stack Overflow are consistent with high performance in producing public benefit from the available collective attention.
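A toy sketch of the kind of question-selection strategy an agent might follow in such a model; the strategy names and question attributes are illustrative assumptions, not the authors' specification.

```python
# Toy agent strategy: choose which open question to answer.
import random

def select_question(questions, strategy):
    """questions: list of dicts with 'age' and 'views'; returns the chosen question."""
    if strategy == "newest":
        return min(questions, key=lambda q: q["age"])        # focus on newly posted questions
    if strategy == "popular":
        return max(questions, key=lambda q: q["views"])      # focus on highly viewed questions
    return random.choice(questions)                          # baseline: random selection
```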
It is desirable to prevent traffic accidents by focusing on elderly people's brain characteristics. The attention level during driving depends on the amount of information-processing resources. This study first aimed at investigating the effects of changes in attention level on electroencephalogram (EEG) waves during graded working memory tasks for a traffic situation. With increasing memory load, reaction times were more delayed in the elderly group than in the young group. The difficult tasks increased induced δ and θ power in the frontal midline area, primarily in the elderly, during the selective task for a target. The elderly could retain their attention level because of the activated slow EEG responses, regardless of task performance, although the increased δ wave may reflect drowsiness. Because an assistance system based on drivers' brain signals can prevent car accidents, this study also aimed at evaluating an analytical method to automatically discriminate the different attentional tasks from the EEG signals. Compared with k-nearest neighbors and artificial neural networks, support vector machines more accurately classified attention levels (i.e., task difficulty) during working memory tasks reflecting the change in the induced δ and θ waves. This result is relevant to a brain-computer interface system that judges task difficulty during driving and alerts the driver to danger. The experimental tasks for this study were limited because they involved only simulations in which participants recognized guided boards and removed irrelevant information. Real-time judgments should be investigated using EEG data to improve systems that can alert drivers to oncoming dangers.
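A hedged sketch of the classifier comparison described above using scikit-learn; `X` (band-power features such as induced δ/θ) and `y` (task-difficulty labels) are assumed to be prepared elsewhere, and the hyperparameters are illustrative.

```python
# Compare SVM, kNN, and a small ANN on EEG-derived features via cross-validation.
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def compare_classifiers(X, y):
    models = {
        "SVM": SVC(kernel="rbf"),
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000),
    }
    return {name: cross_val_score(clf, X, y, cv=5).mean() for name, clf in models.items()}
```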
Posttraumatic Stress Disorder (PTSD) is characterized by symptoms of hyperarousal, avoidance, and intrusive trauma-related memories, as well as deficits in everyday memory and attention. Separate studies in PTSD have found abnormalities in the electroencephalogram (EEG) and in event-related potential (ERP) and behavioral measures of working memory and attention. The present study seeks to determine whether these abnormalities are related and the extent to which they share this relationship with clinical symptoms. EEG data were collected during an eyes-open paradigm and a one-back working memory task. Behavioral and clinical data (CAPS) were also collected. The PTSD group showed signs of altered cortical arousal, as indexed by reduced alpha power and an increased theta/alpha ratio, and clinical and physiological measures of arousal were found to be related. The normal relationship between theta power and ERP indices of working memory was not affected in PTSD, with both sets of measures reduced in the disordered group. Medication appeared to underpin a number of abnormal parameters, including P3 amplitude to targets and the accuracy, though not the speed, of target detection. The present study helps to overcome a limitation of earlier studies that assessed such parameters independently in different groups of patients varying in factors such as comorbidity, medication status, gender, and symptom profile. The present study begins to shed light on the relationship between these measures and suggests that abnormalities in brain working memory may be linked to underlying abnormalities in brain stability.
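An illustrative sketch of how the theta/alpha ratio reported above could be computed from a single EEG channel using Welch's method in SciPy; the band limits and sampling rate are conventional assumptions, not the study's exact pipeline.

```python
# Band powers and theta/alpha ratio from a 1-D EEG signal.
import numpy as np
from scipy.signal import welch

def theta_alpha_ratio(eeg, fs=256.0):
    """eeg: 1-D EEG signal; fs: sampling rate in Hz."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    theta = psd[(freqs >= 4) & (freqs < 8)].mean()     # theta band: 4-8 Hz
    alpha = psd[(freqs >= 8) & (freqs < 13)].mean()    # alpha band: 8-13 Hz
    return theta / alpha
```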
Aims: To distinguish the most sensitive markers of methylphenidate (MPH) effects on behavior and underlying biology using an integrated cognitive and brain function test battery.
Methods: A randomized placebo-controlled trial with 32 healthy adult males. Subjects were tested on MPH doses across 18 sessions with subjective mood, objective behavioral, and biological endpoints. From a computerized battery of tests, behavioral measures were cognitive performance scores, while biological measures of brain function included electroencephalography (EEG) and event-related potentials (ERPs), with complementary measures of autonomic arousal. Using mixed modeling analyses, we determined which measures were most affected by MPH dose, and correlation analyses determined the associations among them.
Results: MPH dose had the most pronounced effect on cognitive performance (sustained attention/vigilance), baseline autonomic arousal (heart rate, blood pressure), and baseline brain activity (EEG theta power). Faster reaction times, reduced errors, increased autonomic arousal, and reductions in theta power showed moderate to strong inter-correlations. MPH had the least effect on subjective mood measures and early sensory ERP components.
Discussion: These findings suggest that MPH increases cortical and autonomic arousal, facilitating vigilance. The combination of behavioral and biological measures may provide an objective set of markers of MPH response.
Integrative Significance: This approach has provided additional insight into the mechanism of the stimulant medication MPH, which would not have been achieved by using such measures in isolation.
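A hedged sketch of the kind of mixed-effects analysis described in the Methods, assuming statsmodels and illustrative column names such as `subject`, `dose`, and `theta_power`; this is not the study's analysis code.

```python
# Mixed-effects model of a dose effect with repeated sessions nested within subjects.
import statsmodels.formula.api as smf

def fit_dose_model(df, outcome):
    """df: long-format data with columns 'subject', 'dose', and the outcome measure."""
    model = smf.mixedlm(f"{outcome} ~ dose", data=df, groups=df["subject"])
    return model.fit()

# result = fit_dose_model(df, "theta_power"); print(result.summary())
```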
This paper argues that the mechanisms underlying consciousness and qualia are likely to arise from the information processing that takes place within the detailed micro-structure of the cerebral cortex. It looks at two key issues: first, how any information-processing system can recognize its own activity; and second, how this behavior could lead to the subjective experience of qualia. In particular, it explores the pattern-processing capabilities of attractor networks and the way they can attribute meaning to their input patterns, and it goes on to show how these capabilities can lead to self-recognition. The paper suggests that although feedforward processing of information can be effective without attractor behavior, when such behavior is initiated it would lead to self-recognition in the networks involved. It also argues that attentional mechanisms are likely to play a key role in enabling attractor behavior to take place. The paper explores the ability of attractor networks to generate representations of the meaning they assign to input patterns and shows how the way they interpret representations of their own activity could give rise to qualia. The paper includes an examination of some limited neurobiological evidence that supports the theory outlined.
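As a generic illustration of the attractor dynamics discussed above (a textbook Hopfield-style network, not the paper's model), stored patterns act as attractors into which partially corrupted inputs settle.

```python
# Minimal Hopfield-style attractor network: store patterns, then recall from a noisy cue.
import numpy as np

def train_hopfield(patterns):
    """patterns: array of shape [num_patterns, n] with entries in {-1, +1}."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n      # Hebbian weight matrix
    np.fill_diagonal(W, 0.0)           # no self-connections
    return W

def recall(W, state, steps=20):
    """Iteratively update a state vector until it settles toward a stored attractor."""
    for _ in range(steps):
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state
```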
In order to investigate the search performance and strategies of nonhuman primates, two macaque monkeys were trained to search for a target template among differently oriented distractors in both free-gaze and fixed-gaze viewing conditions (overt and covert search). In free-gaze search, reaction times (RT) and eye movements revealed the theoretically predicted characteristics of exhaustive and self-terminating serial search, with certain exceptions that are also observed in humans. RT was linearly related to the number of fixations but not necessarily to the number of items on display. Animals scanned the scenes in a nonrandom manner spending notably more time on targets and items inspected last (just before reaction). The characteristics of free-gaze search were then compared with search performance under fixed gaze (covert search) and with the performance of four human subjects tested in similar experiments. By and large the performance characteristics of both groups were similar; monkeys were slightly faster, and humans more accurate. Both species produced shorter RT in fixed-gaze than in free-gaze search. But while RT slopes of the human subjects still showed the theoretically predicted difference between hits and rejections, slopes of the two monkeys appeared to collapse. Despite considerable priming and short-term learning when similar tests were continuously repeated, no substantial long-term training effects were seen when test conditions and set sizes were frequently varied. Altogether, the data reveal many similarities between human and monkey search behavior but indicate that search is not necessarily restricted to exclusively serial processes.
Emotional stimuli generally command more brain processing resources than non-emotional stimuli, but the magnitude of this effect is subject to voluntary control. Cognitive reappraisal represents one type of emotion regulation that can be voluntarily employed to modulate responses to emotional stimuli. Here, the late positive potential (LPP), a specific event-related brain potential (ERP) component, was measured in response to neutral, positive and negative images while participants performed an evaluative categorization task. One experimental group adopted a "negative frame" in which images were categorized as negative or not. The other adopted a "positive frame" in which the exact same images were categorized as positive or not. Behavioral performance confirmed compliance with random group assignment, and peak LPP amplitude to negative images was affected by group membership: brain responses to negative images were significantly reduced in the "positive frame" group. This suggests that adopting a more positive appraisal frame can modulate brain activity elicited by negative stimuli in the environment.
The ability to dynamically track moving objects in the environment is crucial for efficient interaction with the local surrounds. Here, we examined this ability in the context of the multi-object tracking (MOT) task. Several theories have been proposed to explain how people track moving objects; however, only one of these previous theories is implemented in a real-time process model, and there has been no direct contact between theories of object tracking and the growing neural literature using ERPs and fMRI. Here, we present a neural process model of object tracking that builds from a Dynamic Field Theory of spatial cognition. Simulations reveal that our dynamic field model captures recent behavioral data examining the impact of speed and tracking duration on MOT performance. Moreover, we show that the same model with the same trajectories and parameters can shed light on recent ERP results probing how people distribute attentional resources to targets vs. distractors. We conclude by comparing this new theory of object tracking to other recent accounts, and discuss how the neural grounding of the theory might be effectively explored in future work.
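For reference, the core building block of a Dynamic Field Theory model of this kind is the standard Amari-style neural field equation; the specific kernels and parameters used in the paper are not reproduced here.

```latex
\tau\,\dot{u}(x,t) = -u(x,t) + h + S(x,t) + \int w(x - x')\, f\bigl(u(x',t)\bigr)\, dx'
```

Here $u(x,t)$ is the field activation over a metric dimension $x$ (e.g., visual space), $h$ is the resting level, $S(x,t)$ is external input, $w$ is a local-excitation/lateral-inhibition interaction kernel, and $f$ is a sigmoidal output function.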
Simple geometric and organic shapes and their arrangements are used in various neuropsychological tests for the assessment of cognitive function and spatial memory, and for therapeutic purposes in different patient groups. Until now, there has been no electrophysiological evidence characterizing cognitive function for simple geometric and organic shapes and their arrangements. The main objective of this study is therefore to characterize the cortical processing, amplitude, and latency of visually induced N170 and P300 event-related potential (ERP) components for different geometric and organic shapes and their arrangements, and the influence of education on them, which is worthwhile for earlier and better treatment of those patient groups. While education has been shown to influence cognitive function in auditory oddball tasks, little is known about its influence on cognitive function in visual attention tasks involving the choice of geometric and organic shapes and their arrangements. Using a 128-electrode sensor net, we studied responses to the choice of randomly presented geometric and organic shapes in Experiment 1 and of their arrangements in Experiment 2 in high-, medium-, and low-education groups. In both experiments, subjects pressed button "1" or "2" to indicate like or dislike, respectively. A total of 45 healthy subjects (15 per group) were recruited. ERPs were measured from 11 electrode sites and analyzed for the evoked N170/N240 and P300 components. There were no differences between like and dislike responses in amplitude or latency for any stimulus in either experiment, so analyses were fixed on geometric-shape and organic-shape stimuli rather than on like/dislike responses. Across stimulus types, an N170 component (rather than N240) was found at occipito-temporal locations (T5, T6, O1, and O2), with the highest amplitude at O2, and the P300 was distributed over central (Cz and Pz) locations in both experiments in all groups. In Experiment 1, significantly lower N170 amplitude and non-significantly longer N170 latency were found at O1 for both stimulus types in the low-education group compared with the medium-education group; in Experiment 2, there were no significant group differences in amplitude or latency for either stimulus type. In both experiments, the P300 component was found at Cz and Pz, with higher amplitudes at Cz than at Pz. In Experiment 1, the medium-education group showed significantly higher P300 amplitude than the low-education group at Cz (geometric-shape stimuli, P = 0.05; organic-shape stimuli, P = 0.02), whereas in Experiment 2 there were no significant amplitude differences among groups across stimuli at Cz or Pz. Latencies did not differ significantly among groups in either experiment, although the low-education group showed longer (non-significant) latencies at Cz than the medium-education group. We conclude that simple geometric shapes, organic shapes, and their arrangements evoke a visual N170 component over temporo-occipital areas with right lateralization and a P300 component over centro-parietal areas. The significantly lower N170 and P300 amplitudes and longer latencies for shape stimuli in the low-education group indicate that low education significantly influences visual cognitive function.
The goal of this paper is to examine abstract, non-neuronal-level concepts and processes of cognition and to introduce a model of episode processing that includes the processing of perception and memory for ordered events, attentional processes, forgetting (including both constant and non-constant time-based decay), confusions and distinctiveness between items, and false memories and their suppression.
Domain adaptation is a special form of transfer learning in which the source domain and target domain generally have different data distributions but address the same task. There has been substantial research on domain adaptation for 2D images, but for 3D data processing, domain adaptation is still in its infancy. Therefore, we design a novel domain-adaptive network for the unsupervised point cloud classification task. Specifically, we propose a multi-scale transform module to improve the feature extractor. In addition, a spatial-aware attention module combined with channel attention is designed to assign weights to each node and represent hierarchically scaled features. We have validated the proposed method on the PointDA-10 dataset for domain adaptation classification tasks. Empirically, it performs on par with, or even better than, the state of the art.
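A hedged sketch of what a combined channel and spatial (per-point) attention weighting over point-cloud features might look like; the module name, shapes, and reduction factor are assumptions, not the paper's implementation.

```python
# Channel attention followed by per-point (spatial) attention over point-cloud features.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.channel_fc = nn.Sequential(nn.Linear(channels, channels // 4), nn.ReLU(),
                                        nn.Linear(channels // 4, channels), nn.Sigmoid())
        self.spatial_fc = nn.Sequential(nn.Linear(channels, 1), nn.Sigmoid())

    def forward(self, feats):
        """feats: [batch, num_points, channels] per-point features."""
        c_weight = self.channel_fc(feats.mean(dim=1, keepdim=True))   # [B, 1, C] channel weights
        feats = feats * c_weight
        s_weight = self.spatial_fc(feats)                              # [B, N, 1] per-point weights
        return feats * s_weight
```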
The outbreak of the global COVID-19 pandemic has become a public crisis and is threatening human life in every country. Recently, researchers have developed testing methods based on patients' cough recordings. To improve testing accuracy, in this paper we establish a novel COVID-19 sound-based diagnosis framework, TFA-CLSTMNN, which integrates time-frequency domain features of the recorded cough with an attention-convolution long short-term memory neural network. Specifically, we calculate Mel-frequency cepstral coefficients (MFCCs) of the cough data to extract time-frequency domain features. We then apply a convolutional neural network and an attention mechanism to the time-frequency features, followed by a long short-term memory neural network to analyze the MFCC features of the data. Recognition and classification are then carried out to determine whether the tested samples are positive or negative. Experimental results show that the proposed TFA-CLSTMNN framework outperforms the baseline neural networks in sound-based COVID-19 diagnosis and achieves an accuracy above 0.95 on public real-world datasets.
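A minimal sketch of the MFCC feature-extraction step described above, assuming librosa and a simple per-coefficient normalization (an illustrative preprocessing choice, not the TFA-CLSTMNN code).

```python
# Extract and normalize an MFCC time-frequency matrix from a cough recording.
import librosa
import numpy as np

def cough_mfcc(path, n_mfcc=40):
    """Load a cough recording and return a normalized [n_mfcc, frames] MFCC matrix."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return (mfcc - mfcc.mean(axis=1, keepdims=True)) / (mfcc.std(axis=1, keepdims=True) + 1e-8)
```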
An adaptive perception system enables humanoid robots to interact with humans and their surroundings in a meaningful context-dependent manner. An important foundation for visual perception is the selectivity of early vision processes that enables the system to filter out low-level unimportant information while attending to features indicated as important by higher-level processes by way of top-down modulation. We present a novel way to integrate top-down and bottom-up processing for achieving such attention-based filtering. We specifically consider the case where the top-down target is not the most salient in any of the used submodalities.
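A toy sketch of one way top-down and bottom-up cues might be fused into a single attention map; the weighting scheme and map names are assumptions, not the system's actual integration method.

```python
# Weighted fusion of a bottom-up saliency map and a top-down target-similarity map.
import numpy as np

def combine_saliency(bottom_up, top_down, top_down_weight=0.7):
    """bottom_up, top_down: 2-D maps normalized to [0, 1]; returns a fused attention map."""
    fused = (1.0 - top_down_weight) * bottom_up + top_down_weight * top_down
    return fused / (fused.max() + 1e-8)
```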
A distinct property of robot vision systems is that they are embodied. Visual information is extracted for the purpose of moving in and interacting with the environment. Thus, different types of perception-action cycles need to be implemented and evaluated.
In this paper, we study the problem of designing a vision system for the purpose of object grasping in everyday environments. This vision system is targeted first at interaction with the world through recognition and grasping of objects, and second at serving as an interface between the reasoning and planning module and the real world. The latter provides the vision system with a certain task that drives it and defines a specific context, i.e., search for or identify a certain object and analyze it for potential later manipulation. We deal with cases of: (i) known objects, (ii) objects similar to already known objects, and (iii) unknown objects. The perception-action cycle is connected to the reasoning system based on the idea of affordances. All three cases are also related to the state of the art and the terminology in the neuroscientific area.
Computational systems for human–robot interaction (HRI) could benefit from visual perception of social cues that are commonly employed in human–human interactions. However, existing systems focus on one or two cues for attention or intention estimation. This research investigates how social robots may exploit a wide spectrum of visual cues for multiparty interactions. It is proposed that the vision system for social cue perception should be supported by two dimensions of functionality, namely vision functionality and cognitive functionality. A vision-based system is proposed for a robot receptionist to embrace both functionalities for multiparty interactions. The vision functionality module consists of a suite of methods that computationally recognize potential visual cues related to social behavior understanding. The performance of these models is validated against a ground-truth annotation dataset. The cognitive functionality module consists of two computational models that (1) quantify users' attention saliency and engagement intentions, and (2) facilitate engagement-aware behaviors for the robot to adjust its direction of attention and manage the conversational floor. The performance of the robot's engagement-aware behaviors is evaluated in a multiparty dialog scenario. The results show that the robot's engagement-aware behavior based on visual perceptions significantly improves the effectiveness of communication and positively affects user experience.
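A toy sketch of how an attention-saliency or engagement score might be aggregated from weighted visual cues; the cue names and weights are hypothetical and are not the paper's model.

```python
# Weighted aggregation of visual cues into a single engagement score.
def engagement_score(cues, weights=None):
    """cues: dict of cue name -> value in [0, 1], e.g. {'gaze': 0.9, 'orientation': 0.6, 'proximity': 0.4}."""
    weights = weights or {"gaze": 0.5, "orientation": 0.3, "proximity": 0.2}
    return sum(weights.get(name, 0.0) * value for name, value in cues.items())
```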