Posted on 2020-2-15 16:06:21
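The error says the request timed out, meaning the site could not be reached. There are a couple of possible causes: the site is blocked by the firewall so you cannot access it, or there is a problem with your network.
I switched to a different site and the request worked fine. Also, your regular expression was not strict enough, so I revised it. Before running the code, please make sure the requests module is installed.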
    import re

    import requests

    # Browser-like User-Agent header sent with the request.
    headers = {
        'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"}
    response = requests.get(
        "https://www.meitulu.com/item/3289.html", headers=headers)
    # print(response.request.headers)
    html = response.text
    # print(html)
    print("------------------------------------------------------------------------------------------------")
    # Capture the src value of every <img> tag whose URL ends in "jpg" (case-insensitive).
    s = re.findall('<img[^>]*?src=[\'"]([^\'"]*?jpg)[\'"][^>]*?>',
                   html, flags=re.I | re.M | re.S)
    print(s)
Result:
    [
        'https://mtl.gzhuibei.com/css/logo.jpg',
        'https://mtl.gzhuibei.com/images/img/3289/1.jpg',
        'https://mtl.gzhuibei.com/images/img/3289/2.jpg',
        'https://mtl.gzhuibei.com/images/img/3289/3.jpg',
        'https://mtl.gzhuibei.com/images/img/3289/4.jpg',
        'https://mtl.gzhuibei.com/images/img/10695/0.jpg',
        'https://mtl.gzhuibei.com/images/img/16676/0.jpg',
        'https://mtl.gzhuibei.com/images/img/16029/0.jpg',
        'https://mtl.gzhuibei.com/images/img/16512/0.jpg',
        'https://mtl.gzhuibei.com/images/img/6809/0.jpg',
        'https://mtl.gzhuibei.com/images/img/17765/0.jpg',
        'https://mtl.gzhuibei.com/images/img/17831/0.jpg',
        'https://mtl.gzhuibei.com/images/img/17733/0.jpg'
    ]
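If you want to check the regular expression without hitting the site, you can run it against a small made-up HTML snippet. This is only a sketch: the sample HTML and the helper name `extract_jpg_urls` are hypothetical and not taken from the real page; the pattern itself is the same one used in the script.

```python
import re

# Same pattern as in the script: capture the src value of <img> tags
# whose URL ends in "jpg" (case-insensitive because of re.I).
IMG_JPG_PATTERN = re.compile(
    '<img[^>]*?src=[\'"]([^\'"]*?jpg)[\'"][^>]*?>',
    flags=re.I | re.M | re.S,
)

# Hypothetical sample HTML, written by hand for this check.
SAMPLE_HTML = """
<img src="https://example.com/a/1.jpg" alt="one">
<IMG class='pic' SRC='https://example.com/a/2.JPG'>
<img src="https://example.com/a/3.png">
<a href="https://example.com/a/4.jpg">link</a>
"""


def extract_jpg_urls(html):
    """Return the src of every <img> tag whose URL ends in jpg."""
    return IMG_JPG_PATTERN.findall(html)


if __name__ == "__main__":
    print(extract_jpg_urls(SAMPLE_HTML))
```

On this sample it picks up the first two URLs only: the .png image and the plain link are skipped. Note that a URL ending in .jpeg, or a src value without quotes, would also be skipped by this pattern.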
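Since the original problem was a timeout, it may help to make the script say which kind of failure happened instead of just crashing. Below is a rough sketch, assuming the requests module is installed; the helper name `fetch_html`, the 10-second default timeout, and the error labels are my own choices for illustration, not part of the original script.

```python
import requests


def fetch_html(url, headers=None, timeout=10):
    """Return (html, error).

    On success html is the page text and error is None;
    on failure html is None and error is a short label.
    """
    try:
        # timeout is in seconds; without it requests may wait indefinitely.
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # treat HTTP 4xx/5xx as failures
        return response.text, None
    except requests.exceptions.Timeout:
        # The server did not answer in time (connect or read timeout).
        return None, "timeout"
    except requests.exceptions.ConnectionError:
        # DNS failure, refused connection, network down, and similar.
        return None, "connection error"
    except requests.exceptions.RequestException as exc:
        # Anything else raised by requests, e.g. an HTTP error status.
        return None, f"request failed: {exc}"
```

Calling `fetch_html("https://www.meitulu.com/item/3289.html", headers=headers)` would then give back either the page text or a label such as "timeout". A label alone cannot tell you whether the site is blocked or your own network is down, so trying a second, known-good site is still a useful check.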