
  • Article (No Access)

    PREDICTION IN HEALTH DOMAIN USING BAYESIAN NETWORKS OPTIMIZATION BASED ON INDUCTION LEARNING TECHNIQUES

    A Bayesian network is a directed acyclic graph in which each node represents a variable and each arc represents a probabilistic dependency. Such networks provide a compact representation of knowledge and flexible methods of reasoning. Obtaining a network from data is a learning process divided into two steps: structural learning and parametric learning. In this paper we define an automatic learning method that optimizes Bayesian networks applied to classification, using a hybrid learning method that combines the advantages of decision-tree induction techniques (TDIDT-C4.5) with those of Bayesian networks. The resulting method is applied to prediction in the health domain.
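As a minimal sketch of the definition above (not the paper's method), the following Python code represents a small Bayesian network as a parent map plus conditional probability tables and answers a query by enumeration. The variable names (`smoker`, `older`, `disease`) and every probability value are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical three-node network: smoker -> disease <- older.
# All probabilities are made-up illustrative values, not results from the paper.
parents = {
    "smoker": (),
    "older": (),
    "disease": ("smoker", "older"),
}

# Conditional probability tables: P(node = True | parent values).
cpt = {
    "smoker": {(): 0.3},
    "older": {(): 0.4},
    "disease": {
        (True, True): 0.5,
        (True, False): 0.25,
        (False, True): 0.2,
        (False, False): 0.05,
    },
}


def joint(assignment):
    """P(assignment) as the product of each node's probability given its parents."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


def posterior(query, evidence):
    """P(query = True | evidence), computed by enumerating all full assignments."""
    nodes = list(parents)
    num = den = 0.0
    for values in product([True, False], repeat=len(nodes)):
        a = dict(zip(nodes, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the evidence
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den
```

For instance, `posterior("disease", {"smoker": True})` would give the probability of the disease for a smoker under these assumed tables. Enumeration is exponential in the number of nodes, so it is only suitable for toy networks like this one.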

  • Article (No Access)

    Using Clustering Algorithms to Improve the Production of Symbolic-Neural Rule Bases from Empirical Data

    Neurules are a kind of integrated rule that combines neurocomputing (via the adaline unit) with production rules. A neurule base is modular and natural, in contrast to existing connectionist knowledge bases, a comparable type of integrated knowledge base. When producing neurules from an empirical training set, the inability of the adaline unit to classify non-separable training data must be addressed. The general approach is to split the training set consecutively into two subsets, according to a splitting strategy, until (sub)sets of separable data are obtained; as many neurules are then produced as there are resulting subsets. In this paper, we present and experimentally evaluate six splitting strategies applied to the production process of a neurule base, three of which are based on clustering algorithms suitable for categorical data (i.e., 2-medoids, 2-modes and COOLCAT). Experiments were performed using 18 different distance or similarity metrics suitable for categorical data. No such extensive comparison of distance/similarity metrics has been made so far. By applying alternative cluster center initialization methods, the strategy based on 2-modes generally performs better than the other strategies. Specific distance/similarity metrics also provide better results.
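The split-until-separable idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: a perceptron-style update stands in for adaline training (because it is guaranteed to converge on separable data), the Hamming distance stands in for the 18 metrics studied, and the function names (`train_linear_unit`, `two_modes`, `produce_units`) are hypothetical. Rows are assumed to be tuples of 0/1 values.

```python
import random


def train_linear_unit(X, y, epochs=200):
    """Perceptron-style training; returns weights if every example is classified, else None."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, y):
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0
            if out != t:
                errors += 1
                for i, xi in enumerate(x):
                    w[i] += (t - out) * xi
                w[-1] += t - out
        if errors == 0:
            return w
    return None


def hamming(a, b):
    """Number of attributes on which two categorical rows differ."""
    return sum(ai != bi for ai, bi in zip(a, b))


def mode(rows):
    """Attribute-wise most frequent value (ties go to the smallest value)."""
    return tuple(max(sorted(set(col)), key=col.count) for col in zip(*rows))


def two_modes(X, iters=10, seed=0):
    """Split row indices into two non-empty clusters around two modes."""
    centers = random.Random(seed).sample(sorted(set(X)), 2)
    best = None
    for _ in range(iters):
        groups = ([], [])
        for i, x in enumerate(X):
            nearer_first = hamming(x, centers[0]) <= hamming(x, centers[1])
            groups[0 if nearer_first else 1].append(i)
        if not groups[0] or not groups[1]:
            break  # keep the last split in which both clusters were non-empty
        best = groups
        centers = [mode([X[i] for i in g]) for g in groups]
    return best


def produce_units(X, y):
    """Recursively split (X, y) until each subset is classified by one linear unit."""
    w = train_linear_unit(X, y)
    if w is not None or len(set(X)) < 2:
        return [(X, y, w)]  # w is None only for contradictory duplicate rows
    units = []
    for g in two_modes(X):
        units += produce_units([X[i] for i in g], [y[i] for i in g])
    return units
```

On XOR-like data such as `X = [(0, 0), (0, 1), (1, 0), (1, 1)]` with `y = [0, 1, 1, 0]`, a single unit cannot classify all four rows, so `produce_units` splits the set and returns one trained unit per separable subset.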

  • Article (No Access)

    Toward a Hybrid Approach Combining Deep Learning and Case-Based Reasoning for Phishing Email Detection

    Phishing attacks are increasing every year, both in number and in sophistication of technique. By exploiting human weaknesses alone, an attacker can easily obtain a victim’s credentials or gain access to their network. The problem persists despite the many approaches offered by researchers, owing to its dynamic nature: new phishing tactics are created all the time. We therefore need more robust and effective methods to detect phishing emails. In this paper, we aim to detect phishing emails from the body text of the email, using a hybrid approach that combines case-based reasoning (CBR) with a deep learning model. Our proposed model, called DL-CBR, consists of a Bidirectional Long Short-Term Memory (Bi-LSTM) + Temporal Convolutional Network (TCN) network with an attention mechanism, followed by a CBR classifier. The deep learning model is used for email representation and is trained using the N-pair loss function. To demonstrate the performance of DL-CBR, we used evaluation metrics such as precision, accuracy, recall, and F-measure, and obtained an accuracy of 98.28%. The results show that our model outperformed other CBRs that utilize classical text representations such as TF-IDF and Bag-of-Words. Additionally, while our model’s performance is slightly below that of state-of-the-art models, it offers several advantages inherent to CBR. For instance, it can learn from new cases and update its database accordingly.
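The CBR stage described above can be illustrated with a minimal Python sketch. It assumes the email embeddings are already produced by some encoder (the Bi-LSTM + TCN model is not reproduced here), and the cosine similarity, the majority vote over `k` neighbours, and the `CaseBase` class are assumptions made for illustration; the abstract does not state which retrieval scheme DL-CBR uses.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class CaseBase:
    """Hypothetical case base storing (embedding, label) pairs."""

    def __init__(self):
        self.cases = []

    def retain(self, embedding, label):
        """Store a solved case so later queries can reuse it."""
        self.cases.append((tuple(embedding), label))

    def classify(self, embedding, k=3):
        """Retrieve the k most similar cases and reuse their majority label."""
        ranked = sorted(self.cases, key=lambda c: cosine(embedding, c[0]), reverse=True)
        votes = Counter(label for _, label in ranked[:k])
        return votes.most_common(1)[0][0]
```

A typical loop would call `classify` on the embedding of a new email and, once the true label is confirmed, call `retain` so that the case base grows without retraining the encoder.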