Scalable clustering by iterative partitioning and point attractor representation
- Publication type:
- Journal article
- Metadata:
- Authors
- Junming Shao
- Qinli Yang
- Hoang-Vu Dang
- Bertil Schmidt
- Stefan Kramer
- Collections
- metadata
- ISSN
- 1556-4681
- Issue
- 1
- Journal
- ACM transactions on knowledge discovery from data
- Keywords
- 004 Computer science
- 004 Data processing
- Language
- eng
- Pagination
- Art. 5
- Publication date
- 2016
- Publisher
- ACM
- Publisher URL
- http://dx.doi.org/10.1145/2934688
- Date of data capture
- 2020
- Date the record was made public
- 2020
- Access
- Public
- Title
- Scalable clustering by iterative partitioning and point attractor representation
- Volume
- 11
Data source: METADATA.UB
- Other metadata sources:
- Authors
- Junming Shao
- Qinli Yang
- Hoang-Vu Dang
- Bertil Schmidt
- Stefan Kramer
- Author URL
- https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=fis-test-1&SrcAuth=WosAPI&KeyUT=WOS:000382878300005&DestLinkType=FullRecord&DestApp=WOS_CPL
- DOI
- 10.1145/2934688
- eISSN
- 1556-472X
- External identifiers
- Clarivate Analytics Document Solution ID: DV4EO
- ISSN
- 1556-4681
- Issue
- 1
- Journal
- ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA
- Keywords
- High-performance algorithm
- scalable clustering
- synchronization
- Article number
- ARTN 5
- Publication date
- 2016
- Status
- Published
- Title
- Scalable Clustering by Iterative Partitioning and Point Attractor Representation
- Sub types
- Article
- Volume
- 11
Data source: Web of Science (Lite)
- Abstract
- Clustering very large datasets while preserving cluster quality remains a challenging data-mining task to date. In this paper, we propose an effective scalable clustering algorithm for large datasets that builds upon the concept of synchronization. Inherited from the powerful concept of synchronization, the proposed algorithm, CIPA (Clustering by Iterative Partitioning and Point Attractor Representations), is capable of handling very large datasets by iteratively partitioning them into thousands of subsets and clustering each subset separately. Using dynamic clustering by synchronization, each subset is then represented by a set of point attractors and outliers. Finally, CIPA identifies the cluster structure of the original dataset by clustering the newly generated dataset consisting of point attractors and outliers from all subsets. We demonstrate that our new scalable clustering approach has several attractive benefits: (a) CIPA faithfully captures the cluster structure of the original data by performing clustering on each separate subset iteratively instead of using any sampling or statistical summarization technique. (b) It allows clustering very large datasets efficiently with high cluster quality. (c) CIPA is parallelizable and also suitable for distributed data. Extensive experiments demonstrate the effectiveness and efficiency of our approach.
- Authors
- Junming Shao
- Qinli Yang
- Hoang-Vu Dang
- Bertil Schmidt
- Stefan Kramer
- DOI
- 10.1145/2934688
- eISSN
- 1556-472X
- ISSN
- 1556-4681
- Issue
- 1
- Journal
- ACM Transactions on Knowledge Discovery from Data
- Language
- en
- Online publication date
- 2016
- Pagination
- 1 - 23
- Publication date
- 2017
- Status
- Published
- Publisher
- Association for Computing Machinery (ACM)
- Publisher URL
- http://dx.doi.org/10.1145/2934688
- Date of data capture
- 2022
- Title
- Scalable Clustering by Iterative Partitioning and Point Attractor Representation
- Volume
- 11
Data source: Crossref
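
The pipeline outlined in the abstract (partition the data, cluster each subset by synchronization, keep point attractors and outliers as representatives, then cluster the representatives) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `synchronize`, `summarize` and `cipa_sketch`, the interaction radius `eps`, the Kuramoto-style sine update, the grouping tolerance `eps / 2` and the single partitioning pass are all hypothetical choices made here for illustration, and the data are assumed to be scaled to a small range so that the sine coupling behaves roughly linearly.

```python
import math
import random


def synchronize(points, eps, max_iter=50, tol=1e-4):
    """Assumed synchronization dynamics: every point repeatedly moves
    toward the points within distance eps (Kuramoto-style sine coupling)
    until no point moves more than tol."""
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        new_pts = []
        moved = 0.0
        for p in pts:
            # The neighbourhood always contains p itself, so it is never empty.
            nbrs = [q for q in pts if math.dist(p, q) <= eps]
            new_p = [
                x + sum(math.sin(q[d] - x) for q in nbrs) / len(nbrs)
                for d, x in enumerate(p)
            ]
            moved = max(moved, math.dist(p, new_p))
            new_pts.append(new_p)
        pts = new_pts
        if moved < tol:
            break
    return pts


def summarize(points, eps):
    """Represent a subset by the locations its points synchronize to.
    Groups with several members play the role of point attractors;
    singleton groups play the role of outliers. Both are kept."""
    groups = []
    for s in synchronize(points, eps):
        for g in groups:
            if math.dist(s, g[0]) <= eps / 2:  # assumed grouping tolerance
                g.append(s)
                break
        else:
            groups.append([s])
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]


def cipa_sketch(data, n_subsets, eps, seed=0):
    """One partitioning pass: split the data, summarize each subset,
    then cluster the pooled representatives the same way."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    representatives = []
    for subset in subsets:  # each subset could be processed in parallel
        representatives.extend(summarize(subset, eps))
    return summarize(representatives, eps)
```

Calling `cipa_sketch(data, n_subsets=4, eps=0.3)` on a list of coordinate tuples returns one representative location per detected group. Because each subset is summarized independently, the loop over subsets is the natural place for the parallel or distributed execution the abstract mentions; the sketch runs it sequentially for simplicity.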
- Authors
- Junming Shao
- Qinli Yang
- Hoang-Vu Dang
- Bertil Schmidt
- Stefan Kramer
- Journal
- ACM Trans. Knowl. Discov. Data
- Article number
- 1
- Pagination
- 5:1 - 5:1
- Publication date
- 2016
- Title
- Scalable Clustering by Iterative Partitioning and Point Attractor Representation.
- Volume
- 11
Data source: DBLP
- Relationships:
- Owned by