Story point estimation is a key practice in Agile project management that assigns effort values to user stories, helping teams manage workloads effectively. Inaccurate story point estimation can lead to project delays, resource misallocation and budget overruns. This study introduces Story Point Estimation using Reinforced Transformers (SPERT), a novel model that integrates transformer-based embeddings with reinforcement learning (RL) to improve the accuracy of story point estimation. SPERT utilizes Bidirectional Encoder Representations from Transformers (BERT) embeddings, which capture the deep semantic relationships within user stories, while the RL component refines predictions dynamically based on project feedback. We evaluate SPERT across multiple Agile projects and benchmark its performance against state-of-the-art models, including SBERT-XG, LHC-SE, Deep-SE and TF-IDF-SE. Results demonstrate that SPERT outperforms these models in terms of Mean Absolute Error (MAE), Median Absolute Error (MdAE) and Standardized Accuracy (SA). Statistical analysis using Wilcoxon tests and A12 effect size confirms the significance of SPERT’s performance, highlighting its ability to generalize across diverse projects and improve estimation accuracy in Agile environments.
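The three evaluation metrics named above can be made concrete with a small sketch. The abstract does not define them, so the definitions here are assumptions: MAE and MdAE are the mean and median of the absolute errors, and SA is taken to be the common form SA = (1 - MAE / MAE_random) × 100, where MAE_random is the expected error of guessing the actual value of another, randomly chosen story. The function names and sample values are hypothetical and do not come from SPERT.

```python
from statistics import mean, median


def mae(actual, predicted):
    """Mean Absolute Error between actual and predicted story points."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def mdae(actual, predicted):
    """Median Absolute Error between actual and predicted story points."""
    return median(abs(a - p) for a, p in zip(actual, predicted))


def random_guess_mae(actual):
    """Expected MAE of random guessing (assumed baseline).

    For every story i, the guess is the actual value of a different
    story j chosen uniformly at random; averaging |y_i - y_j| over all
    ordered pairs with i != j gives the exact expectation.
    """
    n = len(actual)
    if n < 2:
        raise ValueError("need at least two stories for the baseline")
    total = sum(
        abs(actual[i] - actual[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


def standardized_accuracy(actual, predicted):
    """SA in percent: improvement of the model over random guessing."""
    return (1 - mae(actual, predicted) / random_guess_mae(actual)) * 100


# Hypothetical story point values, for illustration only.
actual_points = [1, 2, 3, 5, 8]
predicted_points = [1, 3, 3, 5, 5]
print(mae(actual_points, predicted_points))
print(mdae(actual_points, predicted_points))
print(standardized_accuracy(actual_points, predicted_points))
```

Under this definition, an SA near zero means the model is no better than random guessing, and lower MAE and MdAE are better.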
This study presents a comparative analysis of transformer models for text classification, utilizing a hybrid approach that integrates rule-based regular expressions with fine-tuned neural network models. Initially, regular expressions are employed to annotate sentences in a cost-effective manner, providing an efficient alternative to manual labeling. The annotated dataset, comprising around 33,000 instances across three classes (Reminder, Scheduled Activity, General) and also restructured into two classes (Reminder and General) by merging “Scheduled Activity” with “Reminder”, is then used to fine-tune various transformer models, including DistilBERT, BERT, RoBERTa, ALBERT, Electra, Ernie 2.0, XLNet, and GPT-2. Our methodology involves freezing all layers except the final one during fine-tuning, allowing the models to learn nuanced linguistic patterns while mitigating overfitting. Results reveal that DistilBERT, despite its smaller size (66 million parameters), outperforms larger models such as BERT and GPT-2 in terms of accuracy, precision, recall, and F1-score. Likewise, we demonstrated that the proposed method can work better than generative AI large language models for both zero-shot and one-shot learning, namely GPT-3.5, GPT-4, GPT-4o and LLaMA-3 70B. This efficiency is attributed to the distillation process that retains essential features while reducing computational demands. Notably, DistilBERT achieved an overall accuracy of 0.86, significantly surpassing BERT’s 0.55, GPT-2’s 0.36, XLNet’s 0.51, Ernie 2.0’s 0.72, Electra’s 0.74, ALBERT’s 0.72, and RoBERTa’s 0.71. The study highlights the importance of model size and architecture in achieving optimal performance, especially in resource-constrained scenarios. 
This investigation underscores the efficacy of combining rule-based methods with advanced transformer models for text annotation, demonstrating that a balanced approach leveraging both handcrafted rules and learned representations can generalize better than relying solely on one technique. The proposed hybrid method offers a robust and adaptable solution for sentence annotation pipelines, enhancing performance in diverse natural language processing applications with limited labeled data. Code is available at https://github.com/arafet/Text-annotation-using-rule-based-method-and-Transformers.
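The rule-based annotation step described above can be sketched as follows. The patterns, the function name `annotate`, and the example sentences are hypothetical illustrations, not the rules used in the study; only the three class names and the option of merging "Scheduled Activity" into "Reminder" come from the abstract.

```python
import re

# Hypothetical rules: each label is paired with one illustrative pattern.
# The study's actual regular expressions are not given in the abstract.
RULES = [
    ("Reminder",
     re.compile(r"\b(remind me|don't forget|remember to)\b", re.IGNORECASE)),
    ("Scheduled Activity",
     re.compile(r"\b(meeting|appointment|class)\b.*\b(at|on)\s+\S+", re.IGNORECASE)),
]


def annotate(sentence, merge_scheduled=False):
    """Return the label of the first matching rule, else "General".

    With merge_scheduled=True, "Scheduled Activity" is folded into
    "Reminder", giving the two-class variant of the dataset.
    """
    for label, pattern in RULES:
        if pattern.search(sentence):
            if merge_scheduled and label == "Scheduled Activity":
                return "Reminder"
            return label
    return "General"


for text in [
    "Remind me to call the bank tomorrow",
    "The meeting is on Friday at 10am",
    "The weather is nice today",
]:
    print(text, "->", annotate(text), "/", annotate(text, merge_scheduled=True))
```

Labels produced this way are cheap but noisy, which is why the abstract pairs them with a fine-tuned model instead of using the rules alone.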
Traditionally, magnetic component design has been based on power frequency transformers with sinusoidal excitation. However, the movement towards higher density integrated circuits means that reductions in the size of magnetic components must be achieved by operating at higher frequencies, mainly through nonsinusoidal switching circuits. As this trend continues, computing tools are required to carry out designs of magnetic components that also allow evaluation of the high frequency losses in these components. A computer design package is described here that implements a robust transformer design methodology allowing customizable transformer geometries. The concept of a critical frequency is a vital part of this methodology. In addition, the winding choice at high frequencies is optimized to give the most accurate results for the best possible speed. This paper includes a description of the software design processes used and describes the main aspects that were incorporated into the system.
Many problems in NLP, such as language translation and sentiment analysis, have improved considerably in recent years. As simpler language problems are solved or better understood, the focus shifts to more complex problems such as semantic analysis and understanding. Unfortunately, many studies in the literature suffer from excessive specificity: the algorithms and datasets are too domain-specific. In this study, we analyze and elaborate on this notion of generality. Instead of selecting a highly specialized dataset for semantic analysis, we take a generic and possibly dry dataset, and we study how a plain vanilla Transformer performs in learning higher-level semantic patterns beyond what was obvious or expected. We tune our Transformer model on a classic language task to ensure correct performance. Once the model is tuned, the goal is to select sentences with specific keywords and study whether it may have learned higher-level semantic patterns. We believe that we obtained promising results. The average BLEU score for sentences shorter than 25 words is 39.79. Our initial qualitative analysis of possible semantic content of interest shows a 17 percent rate of finding interesting semantic patterns. We discuss data-driven results on unexpectedness as a measure of semantic learning.
The exponentially increasing amount of data generated by the public on social media platforms is a precious source of information. It can be used to find the topics and analyze the comments. Some researchers have extended the Latent Dirichlet Allocation (LDA) method by adding a sentiment layer to simultaneously find the topics and their related sentiments. However, most of these approaches do not achieve satisfactory accuracy in Topic Sentiment Analysis (TSA), particularly when there is insufficient training data or the texts are complex, ambiguous, and short. In this paper, a novel self-supervised approach called SSTSA is proposed for TSA that extracts the hidden topics and analyzes the total sentiment related to each topic. SSTSA introduces a new method called Pseudo-label Generator. First, it employs semantic similarity and Word Mover’s Distance (WMD) measures. Then, a document embedding technique is employed to semantically estimate the sentiment orientation of samples and generate the pseudo-labels (positive or negative). Afterward, a hybrid classifier composed of a pre-trained Robustly Optimized BERT (RoBERTa) and a Long Short-Term Memory (LSTM) model is trained to predict the sentiment of unseen data. The evaluation results on different datasets from various domains demonstrate that SSTSA outperforms similar unsupervised/self-supervised methods.
Visual saliency models mimic the human visual system to gaze towards fixed pixel positions and capture the most conspicuous regions in the scene. They have proved their efficacy in several computer vision applications. This paper provides a comprehensive review of recent advances in eye fixation prediction and salient object detection that harness deep learning. It also provides an overview of multi-modal saliency prediction that considers audio in dynamic scenes. The underlying network structure and loss function of each model are explored to clarify how saliency models work. The survey also investigates the inclusion of specific low-level priors in deep learning-based saliency models. The public datasets and evaluation metrics are succinctly introduced. The paper also discusses key issues in saliency modeling, along with some open problems and growing research directions in the field.
Creating curiosity and interest in a topic in online learning is a challenging task. A good preview that outlines the contents of a learning pathway could help learners get to know the topic and become interested in it. To this end, we propose a hierarchical title generation approach to generate semantically relevant titles for the learning resources in a learning pathway and a title for the pathway itself. Our approach to Automatic Title Generation for a given text is based on the pre-trained Transformer language model GPT-2. A pool of candidate titles is generated, and an appropriate title is selected from among them and then refined or de-noised to obtain the final title. The model is trained on research paper abstracts from arXiv and evaluated on three different test sets. We show that it generates semantically and syntactically relevant titles, as reflected in ROUGE and BLEU scores and human evaluations. We propose an optional abstractive Summarizer module based on the pre-trained Transformer model T5 to shorten medium-length documents. This module is also trained and evaluated on research papers from the arXiv dataset. Finally, we show that the proposed model of hierarchical title generation for learning pathways has promising results.
Medical image segmentation plays a crucial role in clinical diagnosis and therapy systems, yet still faces many challenges. Building on convolutional neural networks (CNNs), medical image segmentation has achieved tremendous progress. However, owing to the locality of convolution operations, CNNs have an inherent limitation in learning global context. To address this limitation, we propose LGNet, a semantic segmentation network that learns both local and global features for fast and accurate medical image segmentation. Specifically, we employ a two-branch architecture consisting of convolution layers in one branch to learn local features and transformer layers in the other branch to learn global features. LGNet has two key insights: (1) we bridge the two branches to learn local and global features in an interactive way; (2) we present a novel multi-feature fusion model (MSFFM) to leverage the global contextual information from the transformer and the local representational features from the convolutions. Our method achieves a state-of-the-art trade-off between accuracy and efficiency on several medical image segmentation benchmarks, including Synapse, ACDC and MOST. Specifically, LGNet achieves state-of-the-art performance with Dice scores of 80.15% on Synapse, 91.70% on ACDC, and 95.56% on MOST. Meanwhile, the inference speed reaches 172 frames per second at 224×224 input resolution. Extensive experiments demonstrate the effectiveness of the proposed LGNet for fast and accurate medical image segmentation.
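For reference, the Dice score reported above is commonly computed as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. The sketch below assumes binary masks flattened into lists of 0/1 values; the function name and the convention for two empty masks are assumptions, not details from LGNet.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flattened binary masks (values 0 or 1).

    Returns a value in [0, 1]; multiply by 100 for a percentage.
    Two empty masks are treated as a perfect match (an assumed
    convention; other choices exist).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x2 masks flattened row by row.
predicted_mask = [1, 1, 0, 0]
ground_truth = [1, 0, 1, 0]
print(dice_coefficient(predicted_mask, ground_truth))
```

Multi-organ benchmarks usually report this value per class and then average it, so a single headline figure may hide large differences between structures.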
The Bangla language ranks seventh among the most spoken languages, with 265 million native and non-native speakers around the world, and is the second most spoken Indo-Aryan language after Hindi. However, research on tasks such as sentiment analysis (SA) in Bangla has grown slowly compared to SA in English. This is because there are not enough high-quality, publicly available datasets for training language models for text classification tasks in Bangla. In this paper, we propose a Bangla annotated dataset for sentiment analysis on the ongoing Ukraine–Russia war. The dataset was developed by collecting Bangla comments from various videos of three prominent YouTube TV news channels of Bangladesh covering the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with three sentiment polarities, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on unlabeled Bangla corpora. The models were fine-tuned using our procured dataset. Hyperparameter optimization was performed on all five transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT and mBERT. Each model was evaluated and analyzed using several evaluation metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved an accuracy of 86% with an F1 score of 0.82. Based on accuracy, F1 score and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.
Traffic prediction is challenging due to the stochastic nonlinear dependencies in spatiotemporal traffic characteristics. We introduce a Graph Convolutional Gated Recurrent Unit Network (GC-GRU-N) to capture the critical spatiotemporal dynamics. Using 15-min aggregated Seattle loop detector data, we recontextualize the prediction challenge across space and time. We benchmark our model against Historical Average, LSTM, and Transformers. While Transformers outperformed other models, our GC-GRU-N came in a close second with notably faster inference time — six times quicker than Transformers. We offer a comprehensive comparison of all models based on training and inference times, MAPE, MAE, and RMSE. Furthermore, we delve into the spatial and temporal characteristics of each model’s performance.
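The three error metrics used in this comparison can be sketched in a few lines. The definitions below are the standard ones; the abstract does not say how the study handles zero-valued observations in MAPE, so raising an error in that case is an assumption. The function names and sample values are hypothetical.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error; penalizes large errors more than MAE."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when an actual value is zero, so that case raises
    ValueError here (an assumed convention).
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical 15-minute traffic volumes, for illustration only.
observed = [100, 200, 400]
forecast = [110, 180, 400]
print(mae(observed, forecast), rmse(observed, forecast), mape(observed, forecast))
```

Reporting all three together is useful because MAPE is scale-free while MAE and RMSE stay in the units of the measured traffic variable.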
Automatic Image Captioning (AIC) refers to the process of synthesizing semantically and syntactically correct descriptions for images. Existing research on AIC has predominantly focused on the English language. Comparatively few works have focused on developing captioning systems for low-resource Indian languages like Assamese. This paper investigates AIC for the Assamese language using two distinct approaches. The first approach uses a state-of-the-art AIC model pretrained on an English image-caption dataset to generate English captions for input images. These English captions are then translated into Assamese using a publicly available automatic translator. The second approach trains the AIC model exclusively on an Assamese image-caption dataset to predict captions directly in Assamese. The experiments are performed on two types of state-of-the-art models, one that uses an LSTM as the decoder and one that uses a transformer. Through extensive experimentation, the performance of these approaches is evaluated both quantitatively and qualitatively. The quantitative results are obtained using automatic metrics such as BLEU-n and CIDEr. For qualitative analysis, human evaluation is performed. The comparison between the two approaches reveals that models trained exclusively on Assamese image-caption datasets achieve superior results, in terms of both quantitative measures and qualitative assessment, compared to models pretrained on English whose captions are subsequently translated into Assamese.
In literary narratives, events often transcend the boundaries of reality, encompassing a blend of real-world occurrences and imaginative constructs. This combination poses a unique challenge. Unlike in domains such as biomedical text or news articles, events in literary narratives are not always based on facts. Distinguishing between these realis and non-realis events is crucial for a nuanced understanding of the narrative’s thematic underpinnings and the author’s stylistic choices. To address this challenge, we used the “Gatha-200” dataset, a collection of 200 short stories annotated for realis events. We employed a variety of neural models, including transformers, to detect realis events in the “Gatha-200” dataset. The transformer model achieved an F1-score of 93.9%, showcasing its proficiency in identifying real-world events within literary contexts. Furthermore, we investigated the utility of prompt-based learning techniques for detecting realis events in zero-shot and few-shot scenarios. This approach proved to be effective, demonstrating the adaptability of our models to varying degrees of training data availability.
The past few years have witnessed machine learning techniques take the limelight in multiple research domains. One such domain that has reaped the benefits of machine learning is computer-aided drug discovery, where the search space for candidate drug molecules is decreased using methods such as virtual screening. Current state-of-the-art sequential neural network models have shown promising results and we would like to replicate similar results with virtual screening using the encoded molecular information known as simplified molecular-input line-entry system (SMILES). Our work includes the use of attention-based sequential models — the long short-term memory with attention and an optimized version of the transformer network specifically designed to deal with SMILES (ChemBERTa). We also propose the “Overall Screening Efficacy”, an averaging metric that aggregates and encapsulates the model performance over multiple datasets. We found an overall improvement of about 27% over the benchmark model, which relied on parallelized random forests.
Recently, many machine learning models have been proposed to understand and analyze Programming Languages (PLs). While PLs share some similarities with the natural languages studied in Natural Language Processing (NLP), they pose unique challenges of their own. In this survey, we investigate current approaches to representation learning of code and the associated downstream tasks that can be solved with it. We present and compare the state-of-the-art models specifically designed for embedding PLs in low-dimensional space, and demonstrate how these embedding methods are related to representation learning approaches in NLP. We also compare benchmark experiments on multiple code-related tasks and evaluate the models for each specific application.
Predicting protein side-chains is important for both protein structure prediction and protein design. Among modeling approaches to side-chain prediction, SCWRL4 has become one of the most widely used tools of its type due to its fast and highly accurate predictions. Motivated by the recent success of AlphaFold2 in CASP14, our group adapted a 3D equivariant neural network architecture to predict protein side-chain conformations, specifically within a protein-protein interface, a problem that has not been fully addressed by AlphaFold2.