  • Article (No Access)

    Space Aware BGRU Microservice Fault Detection Algorithm

    Microservice architecture is a new architectural pattern that aims to provide users with more reliable, maintainable, and extensible software design. However, as microservice application systems grow in scale, the proliferation of services and service interactions makes system fault detection difficult. Detecting faults accurately and effectively is key to ensuring system reliability and stability. From the perspective of microservice operation status and the dependencies between services, this paper proposes a space-aware bidirectional gated recurrent unit (BGRU) microservice fault detection algorithm. The algorithm uses deep learning to mine the hidden information that causes failures and combines it with space-aware attention to establish long-distance spatial dependencies, improving detection accuracy. The paper also reports experiments that demonstrate the effectiveness of the algorithm in microservice fault detection.
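
As a rough illustration of the "bidirectional gated recurrent unit" named in the abstract, the sketch below runs a standard GRU over a metric sequence in both directions and pairs the two hidden states at each time step. It is not the paper's model: the scalar input and hidden state, the hand-picked weights in `FWD` and `BWD`, and the latency series are all assumptions made for readability, and the space-aware attention stage is omitted because the abstract does not specify it.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, p):
    """One standard GRU update for a scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * cand


def run_gru(seq, p):
    """Return the hidden state after every element of seq, starting from h = 0."""
    h = 0.0
    states = []
    for x in seq:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bidirectional_gru(seq, fwd, bwd):
    """Pair the forward-pass and backward-pass hidden states per time step."""
    forward = run_gru(seq, fwd)
    backward = run_gru(seq[::-1], bwd)[::-1]  # re-align to the original order
    return list(zip(forward, backward))


# Hypothetical, untrained weights; a real model would learn these.
FWD = {"wz": 0.5, "uz": 0.1, "bz": 0.0, "wr": 0.4, "ur": 0.2, "br": 0.0,
       "wh": 1.0, "uh": 0.5, "bh": 0.0}
BWD = {"wz": 0.3, "uz": 0.2, "bz": 0.0, "wr": 0.6, "ur": 0.1, "br": 0.0,
       "wh": 0.8, "uh": 0.4, "bh": 0.0}

# Made-up normalized latency samples for one service.
latencies = [0.1, 0.2, 0.15, 0.9, 0.95]
features = bidirectional_gru(latencies, FWD, BWD)
```

Each entry of `features` carries context from both earlier and later samples, which is the property a bidirectional recurrent layer adds over a one-directional one. A downstream classifier or attention layer would consume these pairs.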

  • Article (No Access)

    An Approach of Automated Anomalous Microservice Ranking in Cloud-Native Environments

    In recent years, more and more developers have been building applications based on the cloud-native architecture. Containers and microservices are two essential components of this architecture. Container technologies like Docker and Kubernetes help developers achieve consistent and scalable delivery of complex software applications. Microservice technologies, in turn, facilitate the division of complex applications into multiple functionally independent and composable components, which further increases the flexibility of applications. With the support of cloud computing platforms, cloud-native applications become easier to manage and maintain, and more scalable. However, it is challenging to identify performance issues in microservices because of the complex runtime environments and the numerous monitoring metrics. To address this issue, this paper proposes a novel root cause analysis approach. The approach first constructs a service dependency graph based on metrics collected in real time. Next, the anomaly weight of each microservice is automatically updated by extending the mRank algorithm. Finally, a PageRank-based random walk is adopted to further rank root causes, i.e. to rank potentially problematic services. Experiments conducted on Kubernetes clusters show that the proposed approach achieves good analysis results and outperforms several baseline methods.
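
The three steps in this abstract (dependency graph, per-service anomaly weight, PageRank-based random walk) can be sketched roughly as follows. This is a generic personalized PageRank, not the paper's method: the mRank extension is not reproduced, and the function name `rank_root_causes`, the service names, the call graph, and the anomaly weights are all hypothetical. The sketch assumes every anomaly weight is positive.

```python
def rank_root_causes(calls, anomaly, damping=0.85, iterations=100):
    """Rank services by the stationary score of a random walk with restart.

    calls   maps each service to the list of services it calls.
    anomaly maps each service to a positive anomaly weight.
    """
    services = sorted(anomaly)
    total = sum(anomaly.values())
    # The walker restarts at a service with probability proportional
    # to that service's anomaly weight.
    restart = {s: anomaly[s] / total for s in services}
    score = dict(restart)
    for _ in range(iterations):
        nxt = {s: (1.0 - damping) * restart[s] for s in services}
        for s in services:
            callees = calls.get(s, [])
            if callees:
                # Follow call edges, preferring more anomalous callees.
                wsum = sum(anomaly[c] for c in callees)
                for c in callees:
                    nxt[c] += damping * score[s] * anomaly[c] / wsum
            else:
                # A service with no callees hands its mass back to the
                # restart distribution so the total score stays at 1.
                for t in services:
                    nxt[t] += damping * score[s] * restart[t]
        score = nxt
    return sorted(services, key=lambda s: score[s], reverse=True)


# Hypothetical call graph and anomaly weights.
calls = {
    "frontend": ["cart", "catalog"],
    "cart": ["db"],
    "catalog": ["db"],
    "db": [],
}
anomaly = {"frontend": 0.2, "cart": 0.3, "catalog": 0.1, "db": 0.9}

ranking = rank_root_causes(calls, anomaly)
print(ranking[0])  # db
```

Walking along call edges pushes score toward the services that anomalous callers depend on, so a shared dependency such as `db` rises to the top even though the symptoms are first observed at `frontend`.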

  • Article (No Access)

    On Representing Resilience Requirements of Microservice Architecture Systems

    With the spread of DevOps practices and container technologies, Microservice Architecture (MSA) has become a mainstream architectural style in recent years. Resilience is a key characteristic of MSA Systems: it is the ability to cope with the various kinds of system disturbances that degrade services. However, the software field lacks a consensus definition of resilience. As a result, although much work has been done on resilience for MSA Systems, developers still do not have a clear idea of how resilient an MSA System should be or which resilience mechanisms are needed.

    In this paper, drawing on existing systematic studies of resilience in other scientific areas, a definition of microservice resilience is provided, and a Microservice Resilience Measurement Model is proposed to measure service resilience. A requirement model is also given to represent the resilience requirements of MSA Systems. The requirement model uses elements of KAOS to represent the notions in the measurement model, and decomposes service resilience goals into system behaviors that can be executed by system components. As a proof of concept, a case study is conducted on an MSA System to illustrate how the proposed models are applied.

  • Article (No Access)

    Graph-Based Root Cause Localization in Microservice Systems with Protection Mechanisms

    Service anomalies are difficult to locate accurately because they propagate through service dependencies in microservice systems. In addition, protection mechanisms are introduced into microservice systems to ensure the stable operation of services. However, existing approaches ignore the impact of protection mechanisms on the root cause localization of abnormal services. Specifically, circuit breaking and rate limiting mechanisms can refuse service requests and thus change the way anomalies propagate. Moreover, differing service request frequencies and latencies make service dependencies change dynamically, so the probability of anomaly propagation differs between services. In this paper, we propose a novel framework named MicroGBPM to locate the root cause of abnormal services. When a failure occurs, we model the anomaly propagation among services as a dynamically constructed service attributed graph built from metrics and traces. To eliminate the impact of the protection mechanisms, we design a two-stage dynamic calibration strategy that adjusts the probability of anomaly propagation among services. Then, we propose a random walk approach that uses the PageRank algorithm to calculate the root cause results. The experimental results show that MicroGBPM improves the accuracy of root cause localization compared with other approaches in microservice systems with protection mechanisms.
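
To make concrete how a circuit breaker "can refuse service requests", here is a minimal sketch of the general pattern. It is not part of MicroGBPM: the class `CircuitBreaker`, its parameters, and the `"rejected"` and `"failed"` return values are hypothetical choices for illustration, and the clock is passed in explicitly so the behavior is deterministic.

```python
class CircuitBreaker:
    """Stop calling a dependency after repeated failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds the breaker stays open
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None if closed

    def call(self, func, now):
        if self.opened_at is not None:
            if now - self.opened_at < self.reset_after:
                # Open: refuse immediately, the callee never sees the request.
                return "rejected"
            # Half-open: let one trial request through; a single
            # failure will reopen the breaker.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now
            return "failed"
        self.failures = 0  # a success closes the breaker fully
        return result


def flaky():
    raise RuntimeError("downstream timeout")


def healthy():
    return "ok"


breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0)
outcomes = [breaker.call(flaky, now=t) for t in (0.0, 1.0, 2.0, 3.0)]
recovered = breaker.call(healthy, now=12.0)
print(outcomes)  # ['failed', 'failed', 'rejected', 'rejected']
```

Once the breaker opens, the caller keeps reporting errors while the callee receives no traffic at all, so traces and metrics no longer show the anomaly travelling along that edge. This is the kind of distortion that a localization method has to account for in systems with protection mechanisms.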