Ant colony stream clustering: A fast density clustering algorithm for dynamic data streams

dc.cclicence: CC-BY-NC (en)
dc.contributor.author: Fahy, Conor (en)
dc.contributor.author: Yang, Shengxiang (en)
dc.contributor.author: Gongora, Mario Augusto (en)
dc.date.acceptance: 2018-03-27 (en)
dc.date.accessioned: 2018-04-05T07:40:34Z
dc.date.available: 2018-04-05T07:40:34Z
dc.date.issued: 2018-03-30
dc.description.abstract: A data stream is a continuously arriving sequence of data, and clustering data streams requires considerations beyond those of traditional clustering. A stream is potentially unbounded, data points arrive on-line, and each data point can be examined only once. This imposes limitations on available memory and processing time. Furthermore, streams can be noisy, and the number of clusters in the data and their statistical properties can change over time. This paper presents an on-line, bio-inspired approach to clustering dynamic data streams. The proposed Ant-Colony Stream Clustering (ACSC) algorithm is a density-based clustering algorithm, whereby clusters are identified as high-density areas of the feature space separated by low-density areas. ACSC identifies clusters as groups of micro-clusters. The tumbling window model is used to read a stream, and rough clusters are incrementally formed during a single pass of a window. A stochastic method is employed to find these rough clusters; this is shown to significantly speed up the algorithm with only a minor cost to performance, as compared to a deterministic approach. The rough clusters are then refined using a method inspired by the observed sorting behaviour of ants. Ants pick up and drop items based on their similarity to the surrounding items. Artificial ants sort clusters by probabilistically picking up and dropping micro-clusters based on local density and local similarity. Clusters are summarised using their constituent micro-clusters, and these summary statistics are stored offline. Experimental results show that ACSC is scalable and robust to noise, and that its clustering quality compares favourably with leading ant-clustering and stream-clustering algorithms. It also requires fewer parameters and less computational time. (en)
dc.funder: EPSRC (Engineering and Physical Sciences Research Council) (en)
dc.identifier.citation: C. Fahy, S. Yang, and M. Gongora. Ant colony stream clustering: A fast density clustering algorithm for dynamic data streams. IEEE Transactions on Cybernetics, 49 (6), pp. 2215-2228 (en)
dc.identifier.doi: https://doi.org/10.1109/TCYB.2018.2822552
dc.identifier.issn: 2168-2267
dc.identifier.uri: http://hdl.handle.net/2086/15733
dc.language.iso: en_US (en)
dc.peerreviewed: Yes (en)
dc.projectid: EP/K001310/1 (en)
dc.publisher: IEEE Press (en)
dc.researchgroup: Centre for Computational Intelligence (en)
dc.researchinstitute: Institute of Artificial Intelligence (IAI) (en)
dc.subject: Data stream clustering (en)
dc.subject: density clustering (en)
dc.subject: concept drift (en)
dc.subject: concept evolution (en)
dc.title: Ant colony stream clustering: A fast density clustering algorithm for dynamic data streams (en)
dc.type: Article (en)
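
The abstract states that ACSC reads the stream using the tumbling window model. As a rough illustration of that model only, and not of the ACSC algorithm itself, the sketch below splits a stream into consecutive, non-overlapping windows so that each point is seen exactly once. The function name `tumbling_windows` and the parameter `size` are hypothetical and do not come from the paper.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def tumbling_windows(stream: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive, non-overlapping windows of at most `size` items.

    Hypothetical helper for illustration; the record does not say how
    ACSC implements its window.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(stream)
    while True:
        # Consume the next `size` items; each item is read exactly once.
        window = list(islice(it, size))
        if not window:
            return
        yield window


# A 7-point stream read with a window of 3: the last window is shorter.
windows = list(tumbling_windows(range(7), 3))
print(windows)  # [[0, 1, 2], [3, 4, 5], [6]]
```

A clustering step would then process one window at a time and keep only summary statistics between windows, which is what bounds memory on an unbounded stream.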

Files

Original bundle
Name: IEEETCYB18.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format
Description: Main article
License bundle
Name: license.txt
Size: 4.2 KB
Format: Item-specific license agreed upon to submission