import jieba
import jieba.analyse
from gensim.test.utils import common_texts, get_tmpfile
from gensim.models import Word2Vec
# Change the file location to your own storage path
# Segment the text into words
with open(r'C:\Users\amgalang\Desktop\in_the_name_of_people.txt',encoding='utf-8') as f:
    document = f.read()
    document_cut = jieba.cut(document)
    result = ' '.join(document_cut)
    with open('./in_the_name_of_people_segment.txt', 'w',encoding="utf-8") as f2:
        f2.write(result)
# Load the corpus
sentences = word2Vec.LineSentence('./in_the_name_of_people_segment.txt')
# Train on the corpus
path = get_tmpfile("word2vec.model") # create a temporary file
model = word2Vec.Word2Vec(sentences, hs=1,min_count=1,window=10,size=100)
# model.save("word2vec.model")
# model = Word2Vec.load("word2vec.model")
# Print the 100 words most similar to '贪污' (corruption)
for key in model.wv.similar_by_word('贪污', topn =100):
    print(key)
Output after running the code:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\amgalang\AppData\Local\Temp\jieba.cache
Loading model cost 0.872 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
  File "C:/Users/amgalang/Desktop/3.py", line 30, in <module>
    sentences = word2Vec.LineSentence('./in_the_name_of_people_segment.txt')
NameError: name 'word2Vec' is not defined
I'd like to get this error resolved. I hope the experts here can help me fix the code.
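
For reference, here is one possible fix, as a sketch rather than a definitive answer. The NameError seems to come from the script importing the class `Word2Vec` (capital W) and then calling `word2Vec.LineSentence` and `word2Vec.Word2Vec`, a name that was never imported. `LineSentence` lives in the `gensim.models.word2vec` module, so importing both names from there should avoid the problem. Separately, gensim 4.0 renamed the `size` argument to `vector_size`, so the sketch picks the keyword from the installed version. The helper names `dimension_kwarg` and `train_and_query` are made up for illustration and are not part of gensim.

```python
def dimension_kwarg(gensim_version):
    """Return the keyword Word2Vec uses for the vector dimension.

    gensim 4.0 renamed `size` to `vector_size`; older releases use `size`.
    (Hypothetical helper, not part of gensim.)
    """
    major = int(gensim_version.split(".")[0])
    return "vector_size" if major >= 4 else "size"


def train_and_query(segmented_path, word, topn=100):
    """Train on a pre-segmented text file and return the words closest to `word`.

    (Hypothetical helper wrapping the steps from the original script.)
    """
    # Imported here so dimension_kwarg can be used without gensim installed.
    import gensim
    from gensim.models.word2vec import LineSentence, Word2Vec

    # Each line of the file is read as one space-separated sentence.
    sentences = LineSentence(segmented_path)

    # Same settings as the original script; only the dimension keyword varies.
    params = {"hs": 1, "min_count": 1, "window": 10}
    params[dimension_kwarg(gensim.__version__)] = 100

    model = Word2Vec(sentences, **params)
    return model.wv.similar_by_word(word, topn=topn)
```

Assuming the segmented file from the first half of the script already exists, the last part of the script would then become `for key in train_and_query('./in_the_name_of_people_segment.txt', '贪污'): print(key)`.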