Master's Thesis Thomas Schreiber

 

A time series clustering approach for Building Automation and Control Systems

Two-dimensional visualization of a six-dimensional data set. Statistical features of 3822 time series, from the database of the E.ON Energy Research Center, are grouped by an algorithm (t-SNE). Copyright: EBC

In Germany, the number of installed Building Automation and Control Systems (BACS) is increasing steadily. Their suboptimal operation is one of the major reasons for limitations in comfort and efficiency. Structured data from all sensors and actuators is a good foundation for decisions about control strategies and efficiency optimization. In practice, however, the analysis of BACS data is a challenging and time-consuming task, as the recorded time series are usually vast, chaotic and confusing. To assign each time series to its respective component, the first step is to find groups in the data and label them manually, which is very error-prone. Machine learning algorithms achieve promising results in a variety of classification tasks, and previous work has demonstrated that they also reach high classification accuracies when applied to BACS data.

The authors of that previous work examined supervised learning algorithms, which require labelled training data. In this paper, we present a selection of the most promising unsupervised learning techniques and apply them to data extracted from the BACS of the E.ON Energy Research Center.

Unsupervised algorithms rely purely on similarities in the data; we use labels for validation only.
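As an illustration of using labels for validation only, the following minimal sketch scores a finished clustering against known labels by its purity. Purity is chosen here as one simple option and is an assumption; the abstract does not state which validation metric the thesis uses. The function name `cluster_purity` and the example labels are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of samples whose label matches the majority label of their cluster.

    The labels are only read here, after clustering has finished;
    they play no role in forming the clusters themselves.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")

    # Count how often each label occurs inside each cluster.
    label_counts = {}
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its largest label group.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(true_labels)


# Hypothetical example: two clusters, two made-up sensor classes.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["temperature", "temperature", "flow", "flow", "flow", "temperature"]
purity = cluster_purity(clusters, labels)
```

A purity of 1.0 means every cluster contains a single class. Purity grows as the number of clusters grows, so it is best compared across clusterings with the same number of clusters.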

We train nine clustering algorithms on statistical features of the time series and on features extracted without supervision by a deep convolutional auto-encoder. Further, we apply dynamic time warping, a raw-data-based time series clustering technique. Our investigations show that even the most accurate unsupervised methods we apply are unable to find the 22 pre-defined classes in the data, but also that unsupervised clustering is generally possible. Subsequently, we discuss the potential of acquiring knowledge from the generated clusters, considering that the groups reveal valuable information about the data even if they do not represent the groups that engineers defined in their work with the system.
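To make the raw-data-based approach more concrete, the sketch below computes the classic dynamic time warping distance between two series with a simple dynamic program. It is a textbook formulation under stated assumptions (absolute difference as local cost, no warping window), not the implementation used in the thesis. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Uses the absolute difference as local cost and allows the three
    standard steps (match, insertion, deletion). Runs in O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again (b stalls)
                cost[i][j - 1],      # b[j-1] is matched again (a stalls)
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


# Two series with the same shape but different timing align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted_level = dtw_distance([0, 0], [1, 1])
```

Pairwise distances of this kind can be passed as a dissimilarity matrix to a clustering algorithm that accepts precomputed distances. For long series the quadratic cost per pair becomes significant, which is one reason feature-based approaches are often considered alongside it.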