Last edited by jody970214 on 2020-9-22 22:32
I wanted to implement the K-means algorithm myself in Python. The code is as follows:
def K_means(data, k = 10, iters = 100):
    center = list(init_center(data, k))
    while iters:
        clusters = dict.fromkeys(list(range(k)),[])
        for node in data:
            print(node)
            jud = []
            center_new = []
            dist = []
            for cen in center:
                dist.append(np.linalg.norm(node - cen))
            print(dist)
            ##cluster = dist.index(min(dist))
            ##print(cluster)
            ##clusters[cluster].append(node)
            ##print(clusters)
        for cluster in clusters.keys():
            center_new.append(np.mean(clusters[cluster],axis = 0))
        i = k
        while i:
            jud.append(all(center_new[i-1] == center[i-1]))
            i -= 1
        if all(jud):
            break
        center = center_new
        print(center)
        iters -= 1
    return center
I then tested it with test = np.array([[0,0],[1,1],[5,3],[10,1],[0,1]])
and found that the lines marked with # do not behave the way I expected. The output is:
[0 0]
[1.0, 0.0, 10.04987562112089]
1
{0: [array([0, 0])], 1: [array([0, 0])], 2: [array([0, 0])]}
Why does clusters[cluster].append(node) put the value under every key? Shouldn't it only go under clusters[1]?
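The behaviour can be reproduced without any K-means code. Below is a minimal sketch, using made-up names (keys, shared) that are not from the post: dict.fromkeys evaluates its second argument once and binds every key to that same object, so an append through one key is visible through all of them.

```python
keys = list(range(3))

# The [] is created once; every key is bound to that single list object.
shared = dict.fromkeys(keys, [])
shared[1].append("node")

print(shared)                  # {0: ['node'], 1: ['node'], 2: ['node']}
print(shared[0] is shared[2])  # True: the same list sits under every key
```

With an immutable default such as None or 0 this sharing is harmless, which is probably why it only shows up when the default is a list.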
Last edited by 抉择啊 on 2020-9-23 12:13
Addendum: overwrite the original values so that every key gets its own list:
dict1 = dict.fromkeys(list(range(4)))
for k in dict1:
    dict1[k] = []
dict1[1].append('123')
for k in range(4):
    print(k,dict1[k],id(dict1[k]))
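The same idea can be written as a dict comprehension, which creates a fresh list for each key. The sketch below applies it to the assignment step from the question. It is only an illustration under stated assumptions: assign_clusters is a hypothetical helper, the starting centers are picked by hand rather than by init_center (which the post does not show), and math.dist (Python 3.8+) stands in for np.linalg.norm so that the example runs on the standard library alone.

```python
import math

def assign_clusters(data, centers):
    """Group each point under the index of its nearest center."""
    # The comprehension evaluates [] once per key, so the lists are independent.
    clusters = {i: [] for i in range(len(centers))}
    for node in data:
        dist = [math.dist(node, cen) for cen in centers]
        clusters[dist.index(min(dist))].append(node)
    return clusters

test = [[0, 0], [1, 1], [5, 3], [10, 1], [0, 1]]
# Hand-picked starting centers, an assumption for illustration only.
centers = [[0, 1], [0, 0], [10, 1]]

clusters = assign_clusters(test, centers)
for idx, members in clusters.items():
    print(idx, members, id(members))
```

Each id printed here should differ, and [0, 0] ends up only under key 1, which is what the question expected.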