Did something just break? - A Stream-Learning View on System Monitoring and Explainability

Abstract

Monitoring systems rapidly analyze live data to detect and describe anomalies and failures, enabling operators to make informed decisions. Such systems must efficiently capture complex, multi-feature interactions in real time, and the need for them spans domains ranging from industrial control and critical infrastructure to cloud services.

In this tutorial, we approach monitoring tasks from the perspective of data streams and concept drift. We first cover the automatic detection of anomalous behaviour by drift detection mechanisms. Afterward, we turn to a more detailed analysis of anomalies by applying drift localization and explanation techniques, with the ultimate goal of machine-assisted identification of potential root causes. We cover both practical algorithms and theoretical aspects, which allows for an in-depth analysis of the proposed methods and serves as a conceptually sound bridge connecting drift detection, explainable AI, and causal analysis. Throughout the tutorial, we demonstrate an end-to-end workflow for critical infrastructure built from the discussed methodologies.

Organizers

Fabian Hinder, Valerie Vaquet, and Barbara Hammer

Bielefeld University