Background: The use of chatbots has increased considerably in recent years. They are used in different areas and by a wide variety of users, so it is essential to incorporate usability in their development. Aim: Our objective is to identify the state of the art in chatbot usability and applied human–computer interaction techniques, and to analyze how chatbot usability is evaluated. Method: We conducted a systematic mapping study by searching the main scientific databases. The search retrieved 170 references, of which 21 articles were retained as primary studies. Results: The works were categorized according to four criteria: usability techniques, usability characteristics, research methods and type of chatbots. Conclusions: Chatbot usability is still a very incipient field of research in which the published studies are mainly surveys, usability tests, and rather informal experimental studies. Hence, more formal experiments are needed to measure user experience, and their results should be used to provide usability-aware design guidelines.
Detecting intents and extracting necessary contextual information (aka named entities) in input utterances are two fundamental tasks in understanding what the users say in chatbot systems. While most work in this field has been dedicated to high-resource languages in popular domains like business and home automation, little research has been done for low-resource languages, especially in a less popular domain like education. To narrow this gap, this paper presents the first study on learning to detect student intents and to recognize named entities in the education domain for the Vietnamese language. Specifically, we first introduce a complete corpus consisting of 3690 student utterances. It was manually annotated with both named entities and intent information at two levels of granularity: the fine-grained and the coarse-grained levels. We then systematically investigate different approaches to the two tasks using not only independent but also joint learning architectures. The experimental results show that the joint architectures based on pre-trained language models are superior in boosting the performance of both tasks. They outperformed the conventional independent learning architectures, which treat the two tasks separately. Moreover, to further enhance the final performance, this paper proposes a technique to enrich the models with more useful linguistic features. Compared to the standard approaches, we achieve considerably better results on both tasks in both architectures. Overall, for the named entity recognition task, the best model yielded an F1 score of 88.61%. For the intent detection task, it yielded F1 scores of 94.36% and 91.62% at the coarse-grained and fine-grained levels, respectively.
This study explores the short- and long-term expectations about adoption of conversational agents in the organizational frontline. Drawing from in-depth interviews with managers and developers in organizations that have implemented these agents, it sheds light on how the deployment of and collaboration with technology-based autonomous agents influences service activities, and expectations of changes in organizational processes. The interviews revealed that implementations are done on a rather iterative and experimental basis, often balancing the current limits of technology while aiming at improved efficiency, augmenting the work of frontline employees, and meeting the ever-growing demands from consumers.
This study explored the criteria for evaluating human–chatbot collaboration in customer service using the socio-technical systems theory. Interviews with 28 customer service managers, conversational designers, and human agents revealed their evaluation criteria diverged. Managers prioritized the chatbot’s technical capabilities and preferred a standalone customer service chatbot, while conversational designers prioritized the social aspect, focusing on the chatbot as a support tool for human agents and customers. Human agents mainly evaluated the chatbot as a means to increase job satisfaction. These distinct criteria highlight the importance of an aligned approach in human–chatbot collaboration. To improve this collaboration, integrating the chatbot into customer data systems was suggested.
Most existing commercial goal-oriented chatbots are diagram-based; i.e. they follow a rigid dialog flow to fill the slot values needed to achieve a user’s goal. Diagram-based chatbots are predictable, thus their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, may violate the intended business constraints, and require large training datasets to produce a mature policy. We propose a framework that achieves a middle ground between the diagram-based and RL-based chatbots: we constrain the space of possible chatbot responses using a novel structure, the chatbot dependency graph, and use RL to dynamically select the best valid responses. Dependency graphs are directed graphs that conveniently express a chatbot’s logic by defining the dependencies among slots: all valid dialog flows are encapsulated in one dependency graph. Our experiments in both single-domain and multi-domain settings show that our framework quickly adapts to user characteristics and achieves up to 23.77% improved success rate compared to a state-of-the-art RL model.
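The idea of constraining responses with a dependency graph and then letting a learned policy choose among the valid ones can be illustrated with a small sketch. It is not the authors' implementation: the slot names, the `DEPENDENCIES` table, the `valid_slots` and `select_slot` helpers, and the epsilon-greedy choice over a table of Q-values are all assumptions made for illustration, standing in for whatever graph structure and RL model the paper actually uses.

```python
import random

# Hypothetical slot dependencies (illustrative only): each slot maps to the
# set of slots that must already be filled before the bot may ask for it.
DEPENDENCIES = {
    "city": set(),
    "date": set(),
    "hotel": {"city", "date"},
    "room_type": {"hotel"},
}


def valid_slots(dependencies, filled):
    """Return the unfilled slots whose prerequisites are all filled, sorted by name."""
    return sorted(
        slot
        for slot, prereqs in dependencies.items()
        if slot not in filled and prereqs <= filled
    )


def select_slot(q_values, dependencies, filled, epsilon=0.1, rng=random):
    """Epsilon-greedy choice restricted to the slots the graph currently allows.

    q_values is an assumed dict of learned slot values; slots missing from it
    are treated as having value 0.0. Returns None when nothing is left to ask.
    """
    candidates = valid_slots(dependencies, filled)
    if not candidates:
        return None
    if rng.random() < epsilon:
        return rng.choice(candidates)  # explore among valid slots only
    return max(candidates, key=lambda s: q_values.get(s, 0.0))  # exploit


if __name__ == "__main__":
    filled = {"city"}
    q = {"room_type": 10.0, "date": 1.0}
    # "room_type" has the highest value but is not yet valid, so it is never chosen here.
    print(select_slot(q, DEPENDENCIES, filled, epsilon=0.0))
```

In this sketch the graph acts as a mask over the action space: the policy can reorder questions freely, but it cannot ask for a slot before its prerequisites are known, which is one way to keep a learned policy within business constraints.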
Currently, Natural Language Processing (NLP) applications like chatbots come very close to mimicking human responses. This has been achieved via powerful and sophisticated models like Bidirectional Encoder Representations from Transformers (BERT). Although the capabilities that such models offer are superior to those of the technologies that preceded them, these models still possess bias. BERT and similar models are mostly trained on text corpora that deviate in important ways from the text encountered by a chatbot in a problem-specific context. Past research on NLP bias has focused heavily on measuring and mitigating bias with respect to protected attributes (stereotyping by gender, race, ethnicity, etc.), but model bias with respect to classification labels has yet to be explored. We investigate how a classification model can heavily favor one class over another. In this paper, we propose a bias evaluation technique called directional pairwise class confusion bias that highlights the bias of our chatbot intent classification models on pairs of classes. Lastly, we also demonstrate two bias mitigation strategies on a few example biased pairs.
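The abstract does not define the metric, so the following is only one plausible reading of a directional, pairwise view of class confusion: for each ordered pair of intents (a, b), measure how often examples of a are predicted as b, then compare that rate with the reverse direction. The function names, the intent labels, and the choice of a simple rate difference are assumptions for illustration, not the paper's definition.

```python
from collections import Counter


def directional_confusion(y_true, y_pred):
    """Map each (true, predicted) error pair to the share of `true` examples
    that were predicted as `predicted`. Correct predictions are ignored."""
    totals = Counter(y_true)
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return {pair: count / totals[pair[0]] for pair, count in errors.items()}


def pairwise_asymmetry(rates, a, b):
    """Rate of a->b confusion minus rate of b->a confusion.

    A positive value means the classifier leans towards b at a's expense;
    a value near zero means the confusion between the pair is symmetric.
    """
    return rates.get((a, b), 0.0) - rates.get((b, a), 0.0)


if __name__ == "__main__":
    # Hypothetical intent labels and predictions, for illustration only.
    y_true = ["refund"] * 4 + ["cancel"] * 4
    y_pred = ["cancel", "cancel", "refund", "refund",
              "cancel", "cancel", "cancel", "refund"]
    rates = directional_confusion(y_true, y_pred)
    print(rates)
    print(pairwise_asymmetry(rates, "refund", "cancel"))
```

Keeping the two directions separate is the point of the sketch: a symmetric confusion count would hide the case where one intent absorbs the other's examples far more often than the reverse.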
Emerging technologies (e.g., augmented reality, virtual reality) have changed the way that people interact. Artificial intelligence and machine learning can be used for personalised targeting and predicting what people “want” or “need” before they even know it themselves. With these tools comes the possibility of manipulation, deceit or the production of harmful results through errors and misuse. User participation with these technologies can also have unintended consequences, leading to questions of shared social responsibility between social marketers, users and technology creators. In the race to do exciting and helpful things with these technologies, ethical issues can be an afterthought. This chapter argues for nuanced analyses of social responsibility and ethics, taking seriously the role of technology throughout social marketing campaign conception, design and implementation.