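Last edited by 私はり on 2020-7-6 20:17
Abandoned post; moderators, please delete.
Last edited by Twilight6 on 2020-7-4 08:17
Something like this will do. You also need to install the following libraries: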
PIL image processing library (installed as Pillow):
python -m pip install Pillow -i https://pypi.tuna.tsinghua.edu.cn/simple
requests module:
python -m pip install requests -i https://pypi.tuna.tsinghua.edu.cn/simple
lxml parser:
python -m pip install lxml -i https://pypi.tuna.tsinghua.edu.cn/simple
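If you are not sure whether the three installs went through, a quick check like the one below may help. It is only a sketch: `missing_modules` and `REQUIRED` are names made up for this example, and the list assumes the import names `PIL`, `requests` and `lxml` used by the script in this post.

```python
from importlib.util import find_spec

# Import names (not pip package names) that the script relies on.
# Note that Pillow is imported as "PIL".
REQUIRED = ["PIL", "requests", "lxml"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Run it with the same Python interpreter you used for `python -m pip install`, otherwise the result may not match what the script will see.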
import requests
from tkinter import *
from lxml.etree import HTML
from PIL import Image, ImageTk

# Fetch the page and extract the data we need
url = 'https://www.tianqijun.com/techan/doc/3711.html'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
}
response = requests.get(url, headers=headers)
html = HTML(response.text)
content = html.xpath('//div[@class="detailText"]/p/text()')
title = html.xpath('//p/strong/text()')
img_url = html.xpath('//strong/img/@src')

# Download each image and save it under its title
for i in range(len(img_url)):
    with open(title[i] + '.jpg', 'wb') as file:
        file.write(requests.get(img_url[i]).content)

root = Tk()
root.title('----文章爬取&图文并茂')

# Set up the scrollbar
sb = Scrollbar(root)
sb.pack(side=RIGHT, fill=Y)

# Set up the text widget
text = Text(root, width=100, height=50, font=('SIMHEI', 13), yscrollcommand=sb.set)

# temp keeps references to the images so they are not garbage-collected:
# even indices hold the PIL images, odd indices the matching PhotoImage objects
temp = []
for i in range(0, len(title) * 2, 2):
    temp.append(Image.open(title[i // 2] + '.jpg'))
    temp.append(ImageTk.PhotoImage(temp[i]))
    text.insert(END, content[i // 2] + '\n')
    # Center everything inserted so far
    text.tag_add(f'tag{i}', '1.0', END)
    text.tag_config(f'tag{i}', justify=CENTER)
    text.image_create(END, image=temp[i + 1], align=CENTER)
    text.insert(END, '\n')
text.pack()
sb.config(command=text.yview)
mainloop()
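One thing to watch out for: the script uses the scraped titles directly as file names and the scraped `src` values directly as download URLs. If a title contains a character such as `/` or `?`, or if the page uses relative image paths, the download step may fail. Below is a minimal sketch of two helpers that could guard against this. `safe_filename` and `absolute_url` are hypothetical names invented for this example, not part of the original script, and whether this particular page actually needs them is an assumption.

```python
import re
from urllib.parse import urljoin

# Same page URL as in the script above.
PAGE_URL = 'https://www.tianqijun.com/techan/doc/3711.html'


def safe_filename(title, default='image'):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', '_', title).strip(' .')
    # Fall back to a default name if nothing usable is left.
    return cleaned or default


def absolute_url(src, page_url=PAGE_URL):
    """Resolve a possibly relative image src against the page URL."""
    return urljoin(page_url, src)
```

In the download loop you could then open `safe_filename(title[i]) + '.jpg'` and request `absolute_url(img_url[i])`. Remember to use the same `safe_filename` call when reopening the files with `Image.open`.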
Result screenshot: (image attachment)