Affinity Propagation-based Clustering For Data Streams |
|
PP: 2175-2183 |
|
Author(s) |
|
Walid Atwa,
Kan Li,
|
|
Abstract |
|
Data stream clustering is an active research area that has recently emerged to discover knowledge from large amounts of
continuously generated data. Several clustering algorithms have been proposed for static data. Data stream clustering, however,
poses additional challenges: handling dynamic data that arrive online, processing data objects quickly and incrementally,
respecting time and memory limitations, and tracking the evolving
patterns that characterize streaming data with dynamic distributions. In this paper, we propose an algorithm that
extends Affinity Propagation (AP) to handle evolving data streams with dynamic distributions. Affinity Propagation is a
clustering algorithm that uses message passing to extract a set of exemplars that best represent the dataset. We present a semi-supervised
clustering technique (SSAP) that incorporates labeled exemplars into the AP algorithm to deal with changes in the data
distribution, which require the stream model to be updated as soon as possible. Experimental comparisons with state-of-the-art data stream
clustering methods demonstrate the effectiveness and efficiency of the proposed method. |
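For readers unfamiliar with the message passing that AP relies on, the following is a minimal sketch of the standard batch Affinity Propagation updates (responsibilities and availabilities), not the SSAP stream extension proposed in the paper. The function name `affinity_propagation`, the negative squared Euclidean similarity, the median preference, and the `damping` and `iterations` defaults are all assumptions made for illustration.

```python
from statistics import median


def affinity_propagation(points, damping=0.7, iterations=200):
    """Plain batch AP sketch; returns (exemplar indices, exemplar index per point)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Similarity s(i, k): negative squared Euclidean distance (an assumed choice).
    s = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

    # Preference s(k, k): median of the off-diagonal similarities (an assumed choice).
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = pref

    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            first = scores[best]
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                new = s[i][k] - (second if k == best else first)
                r[i][k] = damping * r[i][k] + (1 - damping) * new

        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) for i' not in {i, k})
        # a(k, k) = sum of positive r(i', k) for i' != k
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:
        return [], [-1] * n

    # Each non-exemplar point joins the exemplar it is most similar to.
    labels = [
        i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
        for i in range(n)
    ]
    return exemplars, labels


# Two small, well-separated groups of 2-D points (made-up illustration data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
exemplars, labels = affinity_propagation(points)
print(exemplars, labels)
```

The sketch recomputes all pairwise messages on a fixed dataset, which is quadratic in the number of points; that cost is one reason a streaming setting calls for an incremental variant such as the one the abstract describes.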