# How to Determine the Number of Clusters

The NUMCLUSTERS subcommand specifies the number of clusters into which the data will be partitioned.

AUTO: Automatic selection of the number of clusters. Under AUTO, you may specify a maximum number of possible clusters. TWOSTEP CLUSTER will search for the best number of clusters between 1 and the maximum using the criterion that you specify. The criterion for deciding the number of clusters can be either the Bayesian Information Criterion (BIC) or the Akaike Information Criterion (AIC). TWOSTEP CLUSTER will find at least one cluster if the AUTO keyword is given.

FIXED: User-specified number of clusters. Specify a positive integer.

Examples:

    TWOSTEP CLUSTER
      /CONTINUOUS VARIABLES = INCOME
      /CATEGORICAL VARIABLES = GENDER RACE
      /NUMCLUSTERS AUTO 10 AIC
      /PRINT SUMMARY COUNT.

TWOSTEP CLUSTER uses the variables RACE, GENDER and INCOME for clustering. The specifications on the NUMCLUSTERS subcommand instruct the procedure to search automatically for the number of clusters using the Akaike Information Criterion and require the answer to lie between 1 and 10.

    TWOSTEP CLUSTER
      /CONTINUOUS VARIABLES = INCOME
      /CATEGORICAL VARIABLES = RACE GENDER
      /NUMCLUSTERS FIXED 7
      /PRINT SUMMARY COUNT.

Here the procedure will find exactly seven clusters.
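To make the AUTO search more concrete, here is a minimal Python sketch of picking the number of clusters between 1 and a maximum by BIC or AIC. It is not the internal TWOSTEP CLUSTER algorithm: the function names, the `params_per_cluster` value, and the log-likelihood numbers are all hypothetical and serve only to show how the two criteria trade fit against model size.

```python
import math

def information_criterion(log_lik, n_params, n_obs, criterion="BIC"):
    """Return BIC or AIC for one fitted model (smaller is better)."""
    if criterion == "BIC":
        return -2.0 * log_lik + n_params * math.log(n_obs)
    if criterion == "AIC":
        return -2.0 * log_lik + 2.0 * n_params
    raise ValueError("criterion must be 'BIC' or 'AIC'")

def select_num_clusters(log_liks, params_per_cluster, n_obs, max_clusters, criterion="BIC"):
    """Pick the k in 1..max_clusters with the smallest criterion value.

    log_liks maps k to the log-likelihood of the k-cluster solution.
    Assumption of this sketch: every cluster adds the same number of parameters.
    """
    scores = {
        k: information_criterion(log_liks[k], k * params_per_cluster, n_obs, criterion)
        for k in range(1, max_clusters + 1)
    }
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Hypothetical log-likelihoods for 1..5 clusters fitted to 200 cases (illustrative only).
example_log_liks = {1: -520.0, 2: -455.0, 3: -410.0, 4: -404.0, 5: -401.0}

best_bic, bic_scores = select_num_clusters(
    example_log_liks, params_per_cluster=4, n_obs=200, max_clusters=5, criterion="BIC"
)
best_aic, aic_scores = select_num_clusters(
    example_log_liks, params_per_cluster=4, n_obs=200, max_clusters=5, criterion="AIC"
)
print("BIC picks", best_bic, "clusters; AIC picks", best_aic)
```

With these made-up numbers BIC settles on 3 clusters and AIC on 4, because the BIC penalty per parameter, ln(200) ≈ 5.3, is larger than the AIC penalty of 2. That difference is worth keeping in mind when choosing the criterion on the NUMCLUSTERS subcommand.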

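A sensible set of clusters should satisfy the following criteria:

1. Every cluster must stand out from its neighboring clusters, that is, the distances between cluster centroids must be large;
2. No cluster should contain an excessively large number of elements;
3. The number of clusters should suit the purpose of the analysis;
4. If several different clustering methods are applied, the same clusters should show up in each of their results.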
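Some of these criteria can be checked numerically. The sketch below is one possible way to do so, under stated assumptions: it measures the smallest distance between cluster centroids (criterion 1), the share of cases in the largest cluster (criterion 2), and the agreement between two labelings using the Rand index (criterion 4). The Rand index is a measure chosen here for illustration and is not named in the criteria themselves; the function names and toy data are likewise hypothetical.

```python
import math
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def separation_and_balance(clusters):
    """clusters maps a label to the list of points assigned to that cluster."""
    centroids = {label: centroid(pts) for label, pts in clusters.items()}
    total = sum(len(pts) for pts in clusters.values())
    return {
        # Criterion 1: the closest pair of centroids should still be far apart.
        "min_centroid_distance": min(
            math.dist(centroids[a], centroids[b]) for a, b in combinations(centroids, 2)
        ),
        # Criterion 2: no single cluster should hold too large a share of the cases.
        "largest_share": max(len(pts) for pts in clusters.values()) / total,
    }

def rand_index(labels_a, labels_b):
    """Share of point pairs that two labelings treat the same way (together or apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j]) for i, j in pairs
    )
    return agree / len(pairs)

# Toy data, for illustration only.
toy_clusters = {
    "A": [(0, 0), (0, 2), (2, 0), (2, 2)],
    "B": [(10, 0), (10, 2), (12, 0), (12, 2)],
    "C": [(0, 10), (2, 10)],
}
report = separation_and_balance(toy_clusters)

# Two methods that find the same groups under different label names agree fully.
agreement = rand_index([0, 0, 0, 1, 1, 2], [1, 1, 1, 0, 0, 2])
print(report, agreement)
```

What counts as "large enough" separation or "too large" a cluster depends on the data and on the purpose of the analysis (criterion 3), so the sketch reports the numbers rather than applying a fixed threshold.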

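In practice, a simple procedure is:

1. First produce several candidate solutions, for example with 5 to 8 clusters;
2. Compare whether the clusters in each solution differ significantly from one another;
3. Choose the solution that leads to conclusions you can accept.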
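The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain one-dimensional k-means with quantile-based starting centers (not TWOSTEP CLUSTER), and it summarizes how strongly the clusters differ with a one-way ANOVA F statistic on the clustering variable. All function names and the toy data are hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain 1-D k-means; starting centers are evenly spaced quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:  # assignments can no longer change
            break
        centers = updated
    return [g for g in groups if g]  # drop clusters that ended up empty

def anova_f(groups):
    """One-way ANOVA F statistic: between-cluster over within-cluster variance."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two clusters and more cases than clusters")
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def compare_solutions(values, k_range=range(5, 9)):
    """Step 1 and 2: cluster with each candidate k and record sizes and F."""
    results = {}
    for k in k_range:
        groups = kmeans_1d(values, k)
        results[k] = {"sizes": sorted(len(g) for g in groups), "f_stat": anova_f(groups)}
    return results

# Toy data: six tight groups of five values each (illustrative only).
toy = [c + o for c in (10, 20, 30, 40, 50, 60) for o in (-2, -1, 0, 1, 2)]
results = compare_solutions(toy)
for k, summary in results.items():
    print(k, summary)
```

On this toy data the six-cluster solution recovers the six groups exactly. Turning the F statistic into a significance level requires the F distribution (from tables or a statistics package), and step 3, deciding which solution is acceptable, remains a judgment call that the numbers only inform.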

