A Framework for Anomaly Detection under Dynamic and Distributed Scenarios
Anomaly detection is an important aspect of many application domains. It refers to the problem of finding patterns in data that do not conform to expected behaviour. Hence, a sound understanding of expected behaviour is fundamental to effective anomaly detection. However, data profiles constantly evolve in certain domains, such as computer networks. In other domains, such as traffic monitoring and healthcare, data are distributed and are either too large to transmit to a central location or subject to privacy concerns that prevent such transmission. These situations make it difficult to obtain an accurate understanding of non-anomalous profiles. Changing profiles undermine existing anomaly detection models and make them less effective. Training a robust model with data from multiple sources is also challenging. Moreover, in real-world scenarios, it is not apparent how an anomaly detection model can be built to address these challenges.
This thesis focuses on building a robust anomaly detection system for settings where data profiles evolve, are distributed, or both. It proposes a novel Online Offline Framework to separate existing expected behaviour, possible new expected behaviour, and anomalies in streaming data. It also addresses the distributed scenario using a theoretically sound, fully Bayesian approach. These methods improve the performance of anomaly detection systems and work well with biased and uneven data partitions.
The proposed methods are validated using real-world data from three different domains. This thesis identifies the implementation difficulties in these domains and produces three novel methodologies, one for each of the core anomaly detection problems.