Start with the code taken directly from the official documentation (https://huggingface.co/learn/nlp-course/chapter1/3):
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")
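This code did not run as-is.
After some digging, it turned out that it needs PyTorch or TensorFlow installed, and that it has to download distilbert-base-uncased-finetuned-sst-2-english, the default model for sentiment-analysis. (Possibly because of network issues, the model could not be downloaded and loaded directly, so it has to be downloaded to a local folder first.)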
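Before installing anything, it can help to check which backend is already present. The following is a minimal sketch that uses only the standard library; the helper name available_backends is made up for this post and is not part of transformers.

```python
import importlib.util

def available_backends():
    """Return the names of the deep learning backends that can be imported."""
    # Maps a display name to the module name that pip installs.
    candidates = {"PyTorch": "torch", "TensorFlow": "tensorflow"}
    return [name for name, module in candidates.items()
            if importlib.util.find_spec(module) is not None]

backends = available_backends()
if backends:
    print("Found backend(s):", ", ".join(backends))
else:
    print("No backend found; install PyTorch or TensorFlow first.")
```

If the list is empty, step 1 below is required before the pipeline can run.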
1. Install the CPU version of PyTorch
pip install torch torchvision torchaudio -i https://pypi.tuna.tsinghua.edu.cn/simple
2. Download the distilbert-base-uncased-finetuned-sst-2-english model from Hugging Face
(1) Model link: https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english
(2) On the model page, under Files and versions, download the required files:
config.json
pytorch_model.bin
tokenizer_config.json
vocab.txt
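A quick way to confirm that the download is complete is to check the folder for the four files listed above. This is only a sketch: the helper name missing_model_files is hypothetical, and the folder path is the one used in this post, so adjust it to wherever the files were saved.

```python
from pathlib import Path

# The four files listed above.
REQUIRED_FILES = ["config.json", "pytorch_model.bin", "tokenizer_config.json", "vocab.txt"]

def missing_model_files(model_dir):
    """Return the required file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]

# Folder used in this post; change it to match your own download location.
model_dir = "D:/transformer_models/distilbert-base-uncased-finetuned-sst-2-english"
missing = missing_model_files(model_dir)
print("All files present." if not missing else f"Missing files: {missing}")
```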
3. Load the downloaded model from the local folder with the following code:
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
# Path to the local model folder
local_model_path = "D:/transformer_models/distilbert-base-uncased-finetuned-sst-2-english"
# Load the tokenizer and the model from the local folder
tokenizer = DistilBertTokenizer.from_pretrained(local_model_path)
model = DistilBertForSequenceClassification.from_pretrained(local_model_path)
4. Following the official documentation, build the pipeline from the locally loaded model and tokenizer:
from transformers import pipeline
# Create the pipeline
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
# Run sentiment analysis with the classifier
result1 = classifier("I've been waiting for a HuggingFace course my whole life.")
result2 = classifier("I think it is not a good idea.")
print(result1)
print(result2)
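Each call to the classifier returns a list of dictionaries with a label and a score. A small helper can make that easier to read; the function name summarize is made up for illustration, and the sample value is copied from the output shown at the end of this post rather than produced by running the model here.

```python
def summarize(results):
    """Turn pipeline-style output into short, readable strings."""
    return [f"{item['label']} ({item['score']:.1%})" for item in results]

# Sample value copied from this post's final output, not computed here.
sample = [{'label': 'POSITIVE', 'score': 0.9598049521446228}]
print(summarize(sample))  # ['POSITIVE (96.0%)']
```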
5. Finally, put everything together:
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from transformers import pipeline
# Path to the local model folder
local_model_path = "D:/transformer_models/distilbert-base-uncased-finetuned-sst-2-english"
# Load the tokenizer and the model from the local folder
tokenizer = DistilBertTokenizer.from_pretrained(local_model_path)
model = DistilBertForSequenceClassification.from_pretrained(local_model_path)
# Create the pipeline
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
# Run sentiment analysis with the classifier
result1 = classifier("I've been waiting for a HuggingFace course my whole life.")
result2 = classifier("I think it is not a good idea.")
print(result1)
print(result2)
6. The final output:
[{'label': 'POSITIVE', 'score': 0.9598049521446228}]
[{'label': 'NEGATIVE', 'score': 0.9998024106025696}]
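The scores above are probabilities. As far as I understand, for a two-class model like this one the pipeline applies a softmax to the model's two raw outputs (logits) and reports the larger probability with its label. The sketch below reproduces that idea in plain Python. The logits are hypothetical numbers chosen for illustration, not values from this model, and the label order is an assumption about the model's config.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

labels = ["NEGATIVE", "POSITIVE"]  # assumed label order (index 0, index 1)
logits = [-1.5, 1.6]               # hypothetical logits, for illustration only

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the highest probability
print([{"label": labels[best], "score": probs[best]}])
```

This prints a result in the same shape as the pipeline output: a one-element list holding the winning label and its probability.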