#coding:utf8
import requests
import re
## A backslash is the line-continuation character
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/5\
37.36 (KHTML, like Gecko) Chrome/42.0.2311.152 Safari/537.36'}
html = requests.get('http://tieba.baidu.com/p/3879173501',headers = headers)
## Needed to pose as a browser; finally got this working!!
html.encoding = 'utf-8'
# The magic line for encoding conversion!!!! Fixes the garbled text completely
##print(html.text)
# Scrape the title
title = re.search('<title>(.*?)</title>',html.text,re.S).group(1)
print (title)
vip = re.findall('src="(.*?)" pic_ext="jpeg"',html.text)
x=0
for each in vip:
    print(each)
    with open("%s.jpg" % x, "wb") as code:
        code.write(html.content)
    x+=1
print("Downloaded %s images"% x)
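As the screenshot shows, the script runs, but for some reason the saved images are corrupted.
It is meant to scrape the images from a Tieba thread. How should I change it?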
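
A likely cause: inside the loop, `code.write(html.content)` writes the bytes of the thread page (`html` is the response for the page itself) into every `.jpg`, so each file holds HTML rather than image data. Below is a minimal sketch of one possible fix, assuming each matched `src` value is a directly downloadable image URL. The names `extract_image_urls`, `save_images`, `fetch` and `main` are made up for illustration, and the short User-Agent is a placeholder for the full string in the script above.

```python
import os
import re

# Same pattern as in the post; assumes the page still marks images with pic_ext="jpeg".
IMG_PATTERN = re.compile(r'src="(.*?)" pic_ext="jpeg"')


def extract_image_urls(page_html):
    """Return the image URLs found in the thread page's HTML."""
    return IMG_PATTERN.findall(page_html)


def save_images(image_urls, fetch, out_dir="."):
    """Download each URL via fetch(url) -> bytes and save it as 0.jpg, 1.jpg, ..."""
    count = 0
    for count, url in enumerate(image_urls, start=1):
        data = fetch(url)  # bytes of the image itself, not of the thread page
        with open(os.path.join(out_dir, "%s.jpg" % (count - 1)), "wb") as f:
            f.write(data)
    return count  # number of files written (0 if the list was empty)


def main():
    import requests  # third-party, as in the original script

    headers = {"User-Agent": "Mozilla/5.0"}  # placeholder; reuse the full UA string
    page = requests.get("http://tieba.baidu.com/p/3879173501", headers=headers)
    page.encoding = "utf-8"
    urls = extract_image_urls(page.text)
    # One extra request per image: this is the step the original loop was missing.
    saved = save_images(urls, lambda u: requests.get(u, headers=headers).content)
    print("Downloaded %s images" % saved)
```

Calling `main()` runs this against the thread from the post. Passing `fetch` in as a parameter is only a convenience so the saving logic can be tried without network access; inlining `requests.get(each, headers=headers).content` in the original loop should have the same effect.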