
  • Article (No Access)

    Latent Representation Prediction Networks

    Modern model-based reinforcement learning methods for high-dimensional inputs often incorporate an unsupervised learning step for dimensionality reduction. The training objective of these unsupervised learning methods often leverages only static inputs, such as reconstructing observations. These representations are combined with predictor functions for simulating rollouts to navigate the environment. We advance this idea by taking advantage of the fact that we navigate dynamic environments using visual stimuli, and create a representation that is specifically designed with control and actions in mind. We propose to learn a feature map that is maximally predictable for a predictor function. This results in representations that are well suited for planning, where the predictor is used as a forward model. To this end, we introduce a new way of learning this representation along with the prediction function, a system we dub the Latent Representation Prediction Network (LARP). The prediction function is used as a forward model for graph search in a viewpoint-matching task, and the representation learned to maximize predictability is found to outperform other representations. The sample efficiency and overall performance of our approach rival standard reinforcement learning methods, and our learned representation transfers successfully to unseen environments.
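
    A minimal PyTorch sketch of the core idea stated above (not the authors' code): an encoder and a latent predictor are trained jointly so that the predicted next latent matches the encoded next observation, making the representation "maximally predictable". Network sizes, names, and the plain squared-error objective are illustrative assumptions; in practice some regularization is typically needed to prevent a collapsed (constant) representation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, obs_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))
    def forward(self, obs):
        return self.net(obs)

class LatentPredictor(nn.Module):
    def __init__(self, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))
    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

def predictability_loss(encoder, predictor, obs, action, next_obs):
    # Encoder and predictor share this loss: the representation is shaped to be predictable.
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = predictor(z, action)
    return ((z_pred - z_next) ** 2).mean()

# Toy usage with random tensors standing in for observations and actions.
enc, pred = Encoder(32, 8), LatentPredictor(8, 4)
obs, act, nxt = torch.randn(16, 32), torch.randn(16, 4), torch.randn(16, 32)
loss = predictability_loss(enc, pred, obs, act, nxt)
```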

  • Article (No Access)

    An Interpretable Time Series Clustering Neural Network Based on Shape Feature Extraction

    Time series are a common and important data type, and large volumes of time series data are generated in many professional research fields and in daily life. Although many models have been developed for time series, clustering methods for time series remain insufficient and need improvement. This paper focuses on time series clustering, using a deep learning approach to discover the shape characteristics of time series. We establish a new neural network model for time series clustering that jointly optimizes the representation learning and clustering tasks. Focusing on the shape features of time series, we build a Soft-DTW layer into the neural network to learn interpretable time series representations. Regularized mutual information maximization is used to jointly optimize the representation learning and clustering tasks. Experiments show that this model helps obtain an excellent representation of time series. In comparison with benchmark models, the proposed model achieves the best clustering performance on multiple datasets. The model has broad applicability to time series data.
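
    For reference, an illustrative NumPy sketch of the Soft-DTW recurrence that such a layer builds on: the hard minimum in the classic DTW recursion is replaced by a smooth soft-minimum, which makes the alignment cost differentiable. This is the standard formulation, not the paper's code; the squared-difference cost and gamma value are arbitrary choices.

```python
import numpy as np

def soft_min(values, gamma):
    # Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), computed stably.
    z = -np.asarray(values) / gamma
    m = z.max()
    return -gamma * (m + np.log(np.exp(z - m).sum()))

def soft_dtw(x, y, gamma=1.0):
    n, m = len(x), len(y)
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            R[i, j] = cost + soft_min([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma)
    return R[n, m]

print(soft_dtw(np.array([0., 1., 2.]), np.array([0., 1., 1., 2.])))
```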

  • Article (No Access)

    Deepfake Speech Recognition and Detection

    Deepfake technology, especially deep voice, which has emerged from artificial intelligence in recent years, is potentially harmful, and the public is not yet wary of it. Many speech synthesis models measure the fidelity of their output with the Mean Opinion Score (MOS), a subjective assessment by human listeners of the naturalness and quality of speech, and in the future it will be difficult to verify an interlocutor's identity through a screen. For this reason, this study addresses the threat posed by this new technology by combining representation learning and transfer learning in two sub-systems: a recognition system and a voiceprint system. The recognition system detects whether a voice is a fake generated by voice conversion or speech synthesis techniques, while the voiceprint system verifies the speaker's identity through acoustic features. In the speech recognition system, we use representation learning and transfer classification: we train on x-vector data, fine-tune the model on four types of labeled data to learn representation vectors of real and fake voices, and use a support vector machine in the back-end to classify real and fake voices, reducing the negative impact of this new technique.
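
    A minimal sketch of the back-end step described above (an assumed pipeline, not the authors' code): fixed-length embeddings such as x-vectors are classified as real vs. fake speech with a support vector machine. The embedding dimension and random placeholder data are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))     # placeholder embeddings (e.g. 512-dim x-vectors)
y = rng.integers(0, 2, size=200)    # 0 = genuine speech, 1 = synthesized/converted

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)          # SVM back-end classifier
print("back-end accuracy:", clf.score(X_te, y_te))
```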

  • Article (No Access)

    Representation Learning Method Based on Improved Random Walk for Influence Maximization

    The purpose of the influence maximization problem is to determine a subset of nodes that maximizes the number of affected users. This problem is crucial for information dissemination in social networks. Most traditional influence maximization methods focus too heavily on the information diffusion model and set influence parameters randomly, resulting in inaccurate final outcomes. Driven by recent criticisms of the diffusion model and the rapid development of representation learning, this paper proposes a representation learning method based on improved random walks for influence maximization (IRWIM) to maximize the influence spread. The IRWIM algorithm improves the traditional random walk and adopts a multi-task neural network architecture to predict the propagation ability of nodes more accurately. Moreover, a greedy strategy is utilized to continuously optimize the marginal gain while retaining the theoretical guarantee. IRWIM is tested on four real-world datasets. Experimental results show that the accuracy of the proposed algorithm is superior to various competitive algorithms in the field of influence maximization.
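
    A schematic sketch of the greedy seed-selection step mentioned above: repeatedly add the node with the largest marginal gain in estimated influence spread. The `estimate_spread` callable is a stand-in for the paper's learned spread predictor; the toy additive estimator below is only for illustration.

```python
def greedy_seed_selection(nodes, k, estimate_spread):
    seeds = set()
    for _ in range(k):
        best_node, best_gain = None, float("-inf")
        base = estimate_spread(seeds)
        for v in nodes:
            if v in seeds:
                continue
            gain = estimate_spread(seeds | {v}) - base   # marginal gain of adding v
            if gain > best_gain:
                best_node, best_gain = v, gain
        seeds.add(best_node)
    return seeds

# Toy usage with a made-up additive spread estimator.
scores = {"a": 3.0, "b": 1.0, "c": 2.5}
print(greedy_seed_selection(scores, 2, lambda S: sum(scores[v] for v in S)))
```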

  • Article (No Access)

    Representation Learning Based on Vision Transformer

    In recent years, with the rapid development of information technology, the volume of image data has grown exponentially. However, these datasets typically contain a large amount of redundant information. To extract effective features and reduce redundancy in images, a representation learning method based on the Vision Transformer (ViT) is proposed; to the best of our knowledge, this is the first application of the Transformer to zero-shot learning (ZSL). The method adopts a symmetric encoder-decoder structure, in which the encoder incorporates the Multi-Head Self-Attention (MSA) mechanism of ViT to reduce the dimensionality of image features, eliminate redundant information, and decrease the computational burden, thereby effectively extracting features, while the decoder reconstructs the image data. We evaluate the representation learning capability of the proposed method on various tasks, including data visualization, image reconstruction, face recognition, and ZSL. Comparisons with state-of-the-art representation learning methods validate the effectiveness of the method in the field of representation learning.
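
    A minimal PyTorch sketch of a symmetric Transformer autoencoder in the spirit of the description above; layer counts, dimensions, and the mirrored encoder-style decoder are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TransformerAutoencoder(nn.Module):
    def __init__(self, patch_dim=768, latent_dim=64, n_heads=8, n_layers=4):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model=patch_dim, nhead=n_heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.to_latent = nn.Linear(patch_dim, latent_dim)    # dimensionality reduction
        self.from_latent = nn.Linear(latent_dim, patch_dim)
        dec_layer = nn.TransformerEncoderLayer(d_model=patch_dim, nhead=n_heads,
                                               batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=n_layers)

    def forward(self, patches):                              # (batch, n_patches, patch_dim)
        z = self.to_latent(self.encoder(patches))            # compressed representation
        recon = self.decoder(self.from_latent(z))            # reconstructed patch embeddings
        return z, recon

model = TransformerAutoencoder()
x = torch.randn(2, 16, 768)                                  # placeholder patch embeddings
z, recon = model(x)
loss = nn.functional.mse_loss(recon, x)                      # reconstruction objective
```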

  • Article (No Access)

    Drug-target Interaction Prediction by Metapath2vec Node Embedding in Heterogeneous Network of Interactions

    Drug discovery is a complicated, time-consuming and expensive process. The cost of each new molecular entity (NME) is estimated at $1.8 billion. Furthermore, it often takes nearly a decade for a new drug to be approved by the US Food and Drug Administration (FDA), and only approximately 20 new drugs are approved by the FDA each year. Accurately predicting drug-target interactions (DTIs) by computational methods is an important area of drug research, opening a broad prospect for fast and low-risk drug development. Accurate prediction of drug-target interactions allows scientists to narrow the huge experimental search space, reduce costs, speed up drug development, and predict the side effects and potential functions of new drugs. Researchers have taken many approaches to the DTI problem and to improving the accuracy of existing methods. State-of-the-art approaches are based on various techniques, such as deep learning methods (e.g., stacked auto-encoders), matrix factorization, network inference, and ensemble methods. In this work, we take a new approach based on node embedding in a heterogeneous interaction network: we obtain a representation of each node in the interaction network and then use a binary classifier, such as logistic regression, to solve this prominent problem in the pharmaceutical industry. Most existing network-based methods use a homogeneous network of interactions as their input data, whereas in the real-world problem other informative networks exist that can enhance prediction; by considering only homogeneous networks, we lose valuable network information. Hence, in this work, we work on a heterogeneous network and improve accuracy in comparison to baseline methods.
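
    A hedged sketch of the kind of pipeline described above (not the authors' code): metapath-guided random walks over a toy heterogeneous drug/target graph, Word2Vec (metapath2vec-style) node embeddings, and a logistic-regression classifier on concatenated drug-target pairs. The graph, metapath, labels, and hyperparameters are all placeholders.

```python
import random
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Toy heterogeneous graph: node -> {neighbor type: [neighbors]}
graph = {"d1": {"target": ["t1", "t2"]}, "d2": {"target": ["t2"]},
         "t1": {"drug": ["d1"]}, "t2": {"drug": ["d1", "d2"]}}

def metapath_walk(start, metapath=("drug", "target"), length=6):
    walk, node = [start], start
    for i in range(length):
        next_type = metapath[(i + 1) % len(metapath)]   # alternate drug/target hops
        neighbors = graph[node].get(next_type, [])
        if not neighbors:
            break
        node = random.choice(neighbors)
        walk.append(node)
    return walk

walks = [metapath_walk(n) for n in graph for _ in range(10)]
emb = Word2Vec(walks, vector_size=16, window=3, min_count=0, sg=1).wv

# Binary DTI classification on concatenated pair embeddings (toy labels).
pairs = [("d1", "t1", 1), ("d2", "t1", 0), ("d2", "t2", 1), ("d1", "t2", 1)]
X = np.array([np.concatenate([emb[d], emb[t]]) for d, t, _ in pairs])
y = np.array([label for _, _, label in pairs])
clf = LogisticRegression(max_iter=1000).fit(X, y)
```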

  • Article (No Access)

    Enhancing Speech Assistive Systems Through a Sequence-to-Vector Representation Approach for Disordered Speech

    Building speech assistive systems for people with neurological disorders remains a highly challenging task. Any neurocognitive disability affects the speech production mechanism and leads to speech impairment. Representation learning methods have recently emerged to improve the outcome of machine learning algorithms. For complex recognition tasks such as disordered speech recognition, learning compact and efficient representations of disordered speech utterances is important. Recent deep learning-based architectures need sufficiently large amounts of impaired speech samples, which are tedious to collect from neurologically disabled people. In this work, we propose a representation learning approach that uses a traditional sequential model, the Hidden Markov Model (HMM), which works moderately well with small amounts of impaired speech data. We propose a novel sequence-to-vector HMM State Sequence (HMM-SS) approach that is very compact and proves to be an efficient representation for disordered speech utterances. The efficiency of the proposed HMM-SS approach is assessed using four datasets: 50 words of TORGO, the 100-common-words set of UA-SPEECH, and 50 help-seeking words and 100 common words of the Tamil impaired speech corpus. The proposed approach outperforms the baseline HMM, DNN-HMM and a recent state-of-the-art approach on all four datasets. The discriminative ability and compactness of the proposed representation prove effective for disordered speech recognition.
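
    A minimal sketch (an assumption, not the authors' implementation) of turning an utterance's acoustic features into a fixed-length vector via an HMM state sequence: decode the most likely state path with a fitted HMM, then summarize it, here as a normalized state-occupancy histogram. The MFCC placeholder data, state count, and the particular summary vector are illustrative choices.

```python
import numpy as np
from hmmlearn import hmm

n_states = 5
features = np.random.randn(120, 13)                  # placeholder MFCC frames (T x 13)

model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
model.fit(features)                                  # in practice: fit on training utterances
state_path = model.predict(features)                 # Viterbi state sequence for the utterance

# One simple fixed-length summary of the state sequence (illustrative only):
occupancy = np.bincount(state_path, minlength=n_states) / len(state_path)
print(occupancy)                                     # sums to 1, length = n_states
```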

  • Article (No Access)

    Company2Vec — German Company Embeddings Based on Corporate Websites

    With Company2Vec, the paper proposes a novel application in representation learning. The model analyzes business activities from unstructured company website data using Word2Vec and dimensionality reduction. Company2Vec preserves semantic language structures and thus creates efficient company embeddings for fine-grained industries. These semantic embeddings can be used for various applications in banking.

    Direct relations between companies and words allow semantic business analytics (e.g., top-n words for a company). Furthermore, industry prediction is presented as a supervised learning application and evaluation method. The vectorized structure of the embeddings allows measuring companies’ similarities with the cosine distance. Company2Vec hence offers a more fine-grained comparison of companies than the standard industry labels (NACE). This property is relevant for unsupervised learning tasks, such as clustering. An alternative industry segmentation is shown with k-means clustering on the company embeddings. Finally, this paper proposes three algorithms for (1) firm-centric, (2) industry-centric and (3) portfolio-centric peer-firm identification.
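
    An illustrative sketch of the two downstream uses mentioned above, run on placeholder company embeddings: cosine similarity between firms and an alternative industry segmentation via k-means. The firm names, vectors, and cluster count are made up for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

companies = ["firm_a", "firm_b", "firm_c", "firm_d"]
E = np.random.default_rng(0).normal(size=(4, 300))    # Company2Vec-style embeddings

sim = cosine_similarity(E)                            # pairwise company similarity
print("most similar to firm_a:", companies[1 + np.argmax(sim[0, 1:])])

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(E)
print(dict(zip(companies, clusters)))                 # alternative industry segmentation
```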

  • Article (No Access)

    Instance type completion in equipment knowledge graph based on translation model

    Knowledge graph completion is one of the steps of knowledge graph construction. It aims to complete the incomplete triples in an initially constructed knowledge graph and make its data more abundant and complete. At present, there are few studies on knowledge graph completion in the equipment field, and the attribute and relationship samples in equipment knowledge graphs tend to be unevenly distributed, a problem that traditional knowledge graph completion methods find difficult to handle. Therefore, this paper proposes an instance type completion method for equipment knowledge graphs based on EP2TP-TRT. First, the TransE model is used to embed the relationships and attributes of equipment instances. Then, the EP2TP model is used to map instance attributes, and the TRT model is used to map type relationships. Finally, the scores of the EP2TP and TRT models are integrated with different weights, and training and prediction are carried out to enhance the representation of instance information and type information. Compared with mainstream advanced models, the method improves the MRR and HITS@1 metrics by about 0.89% and 2.1%, respectively.
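
    For concreteness, a brief sketch of the standard TransE scoring that the method builds on: a triple (head, relation, tail) is plausible when the embeddings satisfy h + r ≈ t, so the score is the negative distance. The embeddings below are random placeholders, not trained equipment-graph vectors.

```python
import torch

dim = 50
h = torch.randn(dim)      # head entity embedding (e.g. an equipment instance)
r = torch.randn(dim)      # relation / attribute embedding
t = torch.randn(dim)      # tail entity embedding (e.g. a type or attribute value)

def transe_score(h, r, t, p=1):
    return -torch.norm(h + r - t, p=p)   # higher score = more plausible triple

print(transe_score(h, r, t))
```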

  • Article (Open Access)

    Real-Time Change Detection with Convolutional Density Approximation

    Background Subtraction (BgS) is a widely researched technique for developing online change detection algorithms for static video cameras. Many BgS methods have employed the unsupervised, adaptive approach of Gaussian Mixture Models (GMMs) to produce decent backgrounds, but they lack proper consideration of scene semantics to produce better foregrounds. On the other hand, with considerable computational expense, BgS with Deep Neural Networks (DNNs) is able to produce accurate background and foreground segments. In our research, we blend the best of both approaches. First, we formulate a network called Convolutional Density Approximation (CDA) for direct density estimation of background models. Then, we propose a self-supervised training strategy for CDA to adaptively capture high-frequency color distributions for the corresponding backgrounds. Finally, we show that background models can indeed assist foreground extraction through an efficient Neural Motion Subtraction (NeMos) network. Our experiments verify competitive results in the balance between effectiveness and efficiency.
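
    A plain NumPy sketch of the per-pixel density idea that CDA approximates with a convolutional network: model each background pixel's intensity with a small Gaussian mixture and flag low-likelihood pixels as foreground. The mixture parameters and threshold below are purely illustrative; this is not the CDA network itself.

```python
import numpy as np

def gmm_pixel_likelihood(pixel, weights, means, variances):
    # pixel: scalar intensity; mixture parameters are per-pixel arrays of length K.
    norm = np.exp(-0.5 * (pixel - means) ** 2 / variances) / np.sqrt(2 * np.pi * variances)
    return np.sum(weights * norm)

weights = np.array([0.7, 0.3])        # K = 2 background modes
means = np.array([100.0, 180.0])      # e.g. road vs. shadow intensity
variances = np.array([25.0, 64.0])

for value in (102.0, 250.0):
    lik = gmm_pixel_likelihood(value, weights, means, variances)
    print(value, "foreground" if lik < 1e-4 else "background", lik)
```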

  • Chapter (Open Access)

    Cross-modal representation alignment of molecular structure and perturbation-induced transcriptional profiles

    Modeling the relationship between chemical structure and molecular activity is a key goal in drug development. Many benchmark tasks have been proposed for molecular property prediction, but these tasks are generally aimed at specific, isolated biomedical properties. In this work, we propose a new cross-modal small molecule retrieval task, designed to force a model to learn to associate the structure of a small molecule with the transcriptional change it induces. We formalize this task as a multi-view alignment problem and present a coordinated deep learning approach that jointly optimizes representations of both chemical structure and perturbational gene expression profiles. We benchmark our results against oracle models and principled baselines, and find that cell line variability markedly influences performance in this domain. Our work establishes the feasibility of this new task, elucidates the limitations of current data and systems, and may serve to catalyze future research in small molecule representation learning.
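
    A hedged sketch of one common way to instantiate a multi-view alignment objective for retrieval, a symmetric InfoNCE-style contrastive loss over paired structure and expression embeddings; the paper's exact coordinated objective may differ. The encoders are stand-ins, and batch size, dimensionality, and temperature are arbitrary.

```python
import torch
import torch.nn.functional as F

def alignment_loss(z_struct, z_expr, temperature=0.1):
    # z_struct, z_expr: (batch, d) embeddings of the same molecules in two views.
    z_struct = F.normalize(z_struct, dim=-1)
    z_expr = F.normalize(z_expr, dim=-1)
    logits = z_struct @ z_expr.t() / temperature          # pairwise similarities
    targets = torch.arange(z_struct.size(0))              # matched pairs on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = alignment_loss(torch.randn(8, 128), torch.randn(8, 128))
```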

  • Chapter (Open Access)

    Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention

    The established approach to unsupervised protein contact prediction estimates coevolving positions using undirected graphical models, training a Potts model on a Multiple Sequence Alignment (MSA). Increasingly large Transformers are being pretrained on unlabeled, unaligned protein sequence databases and are showing competitive performance on protein contact prediction. We argue that attention is a principled model of protein interactions, grounded in real properties of protein family data. We introduce an energy-based attention layer, factored attention, which in a certain limit recovers a Potts model, and use it to contrast Potts models and Transformers. We show that the Transformer leverages hierarchical signal in protein family databases that is not captured by single-layer models. This raises the exciting possibility of developing powerful structured models of protein family databases.
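
    For concreteness, a tiny NumPy sketch of the Potts model energy referenced above: per-position fields h and pairwise couplings J over aligned positions, with coevolving positions corresponding to strong couplings. The parameters here are random placeholders; in practice they are fit to an MSA.

```python
import numpy as np

L, A = 6, 20                                  # alignment length, amino-acid alphabet size
rng = np.random.default_rng(0)
h = rng.normal(size=(L, A))                   # per-position fields
J = rng.normal(size=(L, L, A, A)) * 0.1       # pairwise couplings

def potts_energy(seq):                        # seq: length-L array of amino-acid indices
    e = -sum(h[i, seq[i]] for i in range(L))
    e -= sum(J[i, j, seq[i], seq[j]] for i in range(L) for j in range(i + 1, L))
    return e

print(potts_energy(rng.integers(0, A, size=L)))
```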

  • Chapter (Open Access)

    Contrastive learning of protein representations with graph neural networks for structural and functional annotations

    Although protein sequence data is growing at an ever-increasing rate, the protein universe is still only sparsely covered by functional and structural annotations. Computational approaches have become efficient solutions for inferring annotations for unlabeled proteins by transferring knowledge from proteins with experimental annotations. Despite the increasing availability of protein structure data and the high coverage of high-quality predicted structures, e.g., by AlphaFold, many existing computational tools still rely only on sequence data to predict structural or functional annotations, including alignment algorithms such as BLAST and several sequence-based deep learning models. Here, we develop PenLight, a general deep learning framework for protein structural and functional annotation. PenLight uses a graph neural network (GNN) to integrate 3D protein structure data and protein language model representations. In addition, PenLight applies a contrastive learning strategy to train the GNN to learn protein representations that reflect similarities beyond sequence identity, such as semantic similarities in the function or structure space. We benchmarked PenLight on a structural classification task and a functional annotation task, where it achieved higher prediction accuracy and coverage than state-of-the-art methods.
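
    A schematic sketch (not PenLight's actual objective) of contrastive training on protein embeddings: pull together proteins that share a structural or functional label and push apart those that do not, here with a simple triplet margin loss. The `gnn_embed` callable stands in for a structure-aware GNN encoder; the dummy encoder and random tensors are placeholders.

```python
import torch
import torch.nn.functional as F

def contrastive_step(gnn_embed, anchor_graph, positive_graph, negative_graph, margin=1.0):
    z_a = gnn_embed(anchor_graph)     # anchor protein
    z_p = gnn_embed(positive_graph)   # same fold / function class as the anchor
    z_n = gnn_embed(negative_graph)   # different class
    return F.triplet_margin_loss(z_a, z_p, z_n, margin=margin)

# Toy usage with a dummy "encoder" that averages node features.
dummy_embed = lambda graph: graph.mean(dim=0, keepdim=True)
loss = contrastive_step(dummy_embed, torch.randn(30, 64), torch.randn(25, 64), torch.randn(40, 64))
```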

  • Chapter (No Access)

    Temporal Knowledge Graph Embedding for Metro Flow Analysis

    The study of metro flow has become a hot topic and an important research element in urban computing. Metro flow is affected by the topology of the metro network and the POIs around metro stations, and it is a challenge to extract effective patterns of metro flow from such complex data. In this chapter, we construct a metro flow knowledge graph to depict the topological relations between metro stations, the relations between metro stations and POIs, and the relations between metro stations and flows. Since metro flow changes over time, we add a corresponding time constraint to each relation. In addition, due to the complexity of the relations, the time granularity of these constraints varies. Therefore, for effective knowledge graph representation learning, we propose a multi-temporal-granularity metro flow temporal knowledge graph (MGMF-TKG) embedding method. Specifically, we first construct a metro flow temporal knowledge graph from metro flow data and use a combination of sinusoidal waves to represent temporal information. Then, a multi-time-granularity knowledge graph representation learning framework is constructed to represent relations between entities under complex time information constraints. Finally, experiments are conducted using Chongqing metro flow data, and the results show that the method outperforms the benchmark methods.
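
    An illustrative sketch of representing a timestamp with a combination of sinusoidal waves at several time granularities, in the spirit of the multi-granularity temporal encoding described above. The choice of daily and weekly periods is an assumption for the example.

```python
import numpy as np

def temporal_encoding(timestamp_hours):
    periods = [24.0, 24.0 * 7]                       # daily and weekly granularities
    feats = []
    for p in periods:
        angle = 2 * np.pi * (timestamp_hours % p) / p
        feats.extend([np.sin(angle), np.cos(angle)])
    return np.array(feats)

print(temporal_encoding(8.0))       # hour 8 of the week
print(temporal_encoding(8.0 + 24))  # same hour next day: daily part repeats, weekly part shifts
```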