The emergence of Grid infrastructures like EGEE has enabled the deployment of large-scale computational experiments that address challenging scientific problems in various fields. However, to realize their full potential, Grid infrastructures need to achieve a higher degree of dependability, i.e., they need to improve the ratio of Grid-job requests that complete successfully in the presence of Grid-component failures. To achieve this, we first need to determine, analyze and classify the causes of job failures on Grids. In this paper we study the reasons behind Grid job failures in the context of EGEE, the largest Grid infrastructure currently in operation. We identify the points of failure in a Grid that affect the execution of jobs, and describe error types and contributing factors. We discuss various information sources that provide users and administrators with indications about failures, and assess their usefulness based on the accuracy and completeness of the error information they provide. We present two real-life case studies of failures that occurred on a production site of EGEE, along with the troubleshooting process followed in each case. Finally, we propose the architecture of a system that could provide failure management support to administrators and end-users of large-scale Grid infrastructures like EGEE.
Software-Defined Networking disassociates the control plane from the data plane. The problem of deciding on the number and locations of controllers, and of assigning switches to them, has attracted considerable research attention. To anticipate the possibility of a controller failure, a backup controller must be maintained for each switch, so that the switches assigned to a failed controller can immediately be connected to their backups and do not experience disconnection. In this paper, two mathematical models are proposed. The first minimizes the average latency from all switches to their backup controllers under controller failure. The second minimizes both the average and the worst-case latency from all switches to their backup controllers. Both models are evaluated on three networks and compared, in terms of average and worst-case latency, with an existing model that minimizes only the worst-case latency. The first model achieves better average latency than the reference model. The second model also achieves better average latency, with nearly equal worst-case latency, compared to the reference model.
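The core quantities the abstract refers to can be illustrated with a minimal sketch. This is not the paper's actual optimization formulation (the mathematical models are not given in the abstract); it is a hypothetical per-switch heuristic, assuming a latency matrix between switches and controllers and a known primary assignment, that picks each switch's lowest-latency backup and reports the two metrics the models target: average and worst-case backup latency.

```python
# Illustrative sketch only: a greedy per-switch backup choice, not the
# paper's joint optimization models. All names below are hypothetical.

def assign_backups(latency, primary):
    """latency[s][c]: latency from switch s to controller c.
    primary[s]: index of switch s's primary controller.
    Returns, for each switch, the lowest-latency controller other than
    its primary (the candidate backup if the primary fails)."""
    backups = []
    for s, row in enumerate(latency):
        # A switch's backup must differ from its (possibly failed) primary.
        candidates = [c for c in range(len(row)) if c != primary[s]]
        backups.append(min(candidates, key=lambda c: row[c]))
    return backups

def latency_stats(latency, assignment):
    """Average and worst-case latency over a switch-to-controller assignment."""
    vals = [latency[s][c] for s, c in enumerate(assignment)]
    return sum(vals) / len(vals), max(vals)

# Example: 3 switches, 3 controllers, each switch's primary is its nearest.
latency = [[1, 5, 9],
           [4, 2, 8],
           [7, 6, 3]]
primary = [0, 1, 2]
backups = assign_backups(latency, primary)   # [1, 0, 1]
avg, worst = latency_stats(latency, backups) # (5.0, 6)
```

A joint model of the kind the paper proposes would instead optimize all backup assignments together (e.g., as an integer program), which can trade a slightly worse choice for one switch against a better average or worst-case over the whole network.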