Diagnosing Cloud Performance Anomalies Using Large Time Series Dataset Analysis

Jehangiri, Ali Imran; Yahyapour, Ramin; Wieder, Philipp; Yaqub, Edwin; Lu, Kuan

doi:10.1109/CLOUD.2014.129

Journal

IEEE 7th International Conference on Cloud Computing

Date Issued

2014

Abstract

Virtualized Cloud platforms have become increasingly common, and the number of online services hosted on them is growing rapidly. A key problem providers face in managing these services is detecting performance anomalies and adjusting resources accordingly. Because online services generate very large volumes of monitoring data in the form of time series, this data is difficult to process with traditional approaches. In this work, we present a novel distributed, parallel approach to performance anomaly detection. We build upon Holt-Winters forecasting for automatic aberrant behavior detection in time series. First, we extend the technique to work with the MapReduce paradigm. Next, we correlate the anomalous metrics with the target Service Level Objective (SLO) to locate the suspicious metrics. We implemented and evaluated our approach on a production Cloud encompassing the IaaS and PaaS service models. Experimental results confirm that our approach is efficient and effective in capturing the metrics that cause performance anomalies in large time series datasets.
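The three steps named in the abstract (Holt-Winters aberrant behavior detection, a MapReduce formulation, and correlation against the SLO) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the commonly used additive Holt-Winters formulation with a smoothed seasonal deviation as the confidence band, simulates the map, shuffle, and reduce phases in a single process, and ranks flagged metrics by Pearson correlation. All function names, the record layout `(metric, timestamp, value)`, and the smoothing parameters are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def holt_winters_flags(series, period, alpha=0.5, beta=0.1, gamma=0.1, delta=3.0):
    """Return one 0/1 flag per sample; 1 marks a value outside the confidence band.

    Assumed additive model: forecast = level + trend + seasonal term.
    The band half-width is delta times a smoothed absolute deviation
    kept per seasonal slot.
    """
    level, trend = series[0], 0.0
    season = [0.0] * period
    deviation = [0.0] * period
    flags = []
    for t, y in enumerate(series):
        slot = t % period
        forecast = level + trend + season[slot]
        # Skip flagging during a two-period warm-up while the model settles.
        outside = t >= 2 * period and abs(y - forecast) > delta * deviation[slot]
        flags.append(1 if outside else 0)
        previous_level = level
        level = alpha * (y - season[slot]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        season[slot] = gamma * (y - level) + (1 - gamma) * season[slot]
        deviation[slot] = gamma * abs(y - forecast) + (1 - gamma) * deviation[slot]
    return flags


def map_record(record):
    """Map phase: key each sample by its metric name."""
    metric, timestamp, value = record
    return metric, (timestamp, value)


def reduce_metric(points, period):
    """Reduce phase: order one metric's samples by time and flag anomalies."""
    values = [value for _, value in sorted(points)]
    return values, holt_winters_flags(values, period)


def run_job(records, period):
    """In-process stand-in for a MapReduce job (map, shuffle by key, reduce)."""
    groups = defaultdict(list)
    for record in records:
        key, value = map_record(record)
        groups[key].append(value)
    return {metric: reduce_metric(points, period) for metric, points in groups.items()}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def suspicious_metrics(results, slo_values):
    """Rank metrics with at least one flag by |correlation| with the SLO series."""
    ranked = [
        (metric, pearson(values, slo_values))
        for metric, (values, flags) in results.items()
        if any(flags)
    ]
    return sorted(ranked, key=lambda pair: abs(pair[1]), reverse=True)
```

Because each metric's series is processed independently in the reduce step, the per-metric work parallelizes naturally across reducers; only the final ranking against the SLO series needs the flagged metrics brought together.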
