Last edited by jody970214 on 2020-9-22 22:32
I'm trying to implement the K-means algorithm myself in Python. Here is the code:
    import numpy as np

    # init_center is defined elsewhere (not shown in this post)
    def K_means(data, k = 10, iters = 100):
        center = list(init_center(data, k))
        while iters:
            clusters = dict.fromkeys(list(range(k)),[])
            for node in data:
                print(node)
                jud = []
                center_new = []
                dist = []
                for cen in center:
                    dist.append(np.linalg.norm(node - cen))
                print(dist)
                ##cluster = dist.index(min(dist))
                ##print(cluster)
                ##clusters[cluster].append(node)
                ##print(clusters)
            for cluster in clusters.keys():
                center_new.append(np.mean(clusters[cluster],axis = 0))
            i = k
            while i:
                jud.append(all(center_new[i-1] == center[i-1]))
                i -= 1
            if all(jud):
                break
            center = center_new
            print(center)
            iters -= 1
        return center
Then I tested it with

    test = np.array([[0,0],[1,1],[5,3],[10,1],[0,1]])

and found that the part of the code marked with ## does not behave the way I expected. The output is:
    [0 0]
    [1.0, 0.0, 10.04987562112089]
    1
    {0: [array([0, 0])], 1: [array([0, 0])], 2: [array([0, 0])]}
Why does clusters[cluster].append(node) put the value under every key? Shouldn't it go only under clusters[1]?
Last edited by 抉择啊 on 2020-9-23 12:13
Addendum: overwrite the original values, so that each key gets its own new list:
    dict1 = dict.fromkeys(list(range(4)))
    for k in dict1:
        dict1[k] = []
    dict1[1].append('123')
    for k in range(4):
        print(k, dict1[k], id(dict1[k]))
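To see why the original call puts the node under every key, here is a minimal sketch. dict.fromkeys(keys, []) evaluates [] only once and stores that one list object under every key, so an append through any key shows up under all of them. A dict comprehension creates a new list for each key instead. The names shared and separate are only for illustration and do not appear in the original code.

```python
# dict.fromkeys evaluates the default value once, so every key
# refers to the same list object.
shared = dict.fromkeys(range(3), [])
shared[1].append('node')
print(shared)                      # the item appears under all three keys
print(shared[0] is shared[2])      # True: one list, three references

# A dict comprehension evaluates [] once per key, so the lists are independent.
separate = {i: [] for i in range(3)}
separate[1].append('node')
print(separate)                    # only key 1 holds the item
print(separate[0] is separate[2])  # False
```

Replacing the fromkeys line with clusters = {i: [] for i in range(k)} has the same effect as the overwrite loop shown above.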
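Putting the fix back into the clustering loop might look like the sketch below. It is not the original poster's final code: init_center was never shown in the thread, so the version here (random distinct rows as starting centers) is a hypothetical stand-in, and the seed parameter, the name k_means_sketch, the empty-cluster handling, and the np.allclose convergence check are all assumptions added for illustration.

```python
import numpy as np

def init_center(data, k, seed=0):
    # Hypothetical stand-in for the init_center that the post does not show:
    # pick k distinct rows of data as the starting centers.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(data), size=k, replace=False)
    return data[idx].astype(float)

def k_means_sketch(data, k=3, iters=100, seed=0):
    center = init_center(data, k, seed)
    for _ in range(iters):
        # One independent list per cluster (not dict.fromkeys(..., [])).
        clusters = {i: [] for i in range(k)}
        for node in data:
            # Distance from this node to every center at once.
            dist = np.linalg.norm(center - node, axis=1)
            clusters[int(np.argmin(dist))].append(node)
        # Recompute each center; an empty cluster keeps its old center
        # so that np.mean is never called on an empty list.
        center_new = np.array([
            np.mean(clusters[i], axis=0) if clusters[i] else center[i]
            for i in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(center_new, center):
            break
        center = center_new
    return center

test = np.array([[0, 0], [1, 1], [5, 3], [10, 1], [0, 1]])
centers = k_means_sketch(test, k=3)
print(centers)
```

The resulting centers depend on the starting rows chosen, so a different seed can give a different clustering of the same five points.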