How many clusters to report: a recursive heuristic.

Abstract

Clustering can be a valuable tool for analyzing large amounts of data, but anyone who clusters must choose how many item clusters, K, to report. Unfortunately, one must guess at K or some related parameter when working within each of the three available frameworks where one thinks of clustering: as a Euclidean distance problem; as a statistical model problem; or as a complexity theory problem. We report here a novel recursive square root heuristic, RSQRT, which accurately predicts K(reported) as a function of the attribute or item count, depending on attribute scales. We tested the heuristic on 226 widely-varying, but mostly scientific, studies, and found that the heuristic's K(best-predicted) rounded to exactly K(reported) in over half of the studies and was close in almost all of them. We claim that this strongly-supported heuristic makes sense and that, although it is not prescriptive, using it prospectively is much better than guessing.

Year: 2010 PMID: 21096553 DOI: 10.1109/IEMBS.2010.5627287

Source DB: PubMed Journal: Annu Int Conf IEEE Eng Med Biol Soc ISSN: 2375-7477

Cited

1 in total

1. RSQRT: AN HEURISTIC FOR ESTIMATING THE NUMBER OF CLUSTERS TO REPORT.

Authors: John Carlis; Kelsey Bruso
Journal: Electron Commer Res Appl Date: 2012-03 Impact factor: 6.014