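The code is below. It runs without any error, but nothing is printed either; I added a few print calls in the middle and they show nothing as well! The only output is: Process finished with exit code 0. I am scraping Douban Movies: https://movie.douban.com/chart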
    import requests
    import re
    import json

    heads = 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36'

    # Fetch the page
    def get_page(url):
        response = requests.get(url,heads)
        return response.text

    # Parse the page
    def re_html(html):
        r = re.compile('<table.*?class="p1">(.*?)</span>.*?</table>',re.S)  # for now the regex only grabs the total number of reviewers
        item = re.findall(r,html)
        print(item)
        for i in item:
            yield {
                'name':i[0],
                # 'start':i[1],
                # 'num':i[2]
            }

    # Write to a txt file
    def writedata(context):
        with open('0.txt','a',encoding='utf-8') as f:
            f.write(json.dumps(context,ensure_ascii=False) + '\n')
            f.close()

    # Main function
    def main():
        url = 'https://movie.douban.com/chart/'
        html = get_page(url)
        for x in re_html(html):
            print (x)
            writedata(x)

    if __name__ == '__mian__':
        main()
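
A likely reason nothing is printed: the last check compares `__name__` against `'__mian__'` instead of `'__main__'`, so `main()` is never called and the script simply ends with exit code 0. A few other things would probably bite once it does run. `requests.get(url, heads)` passes the User-Agent string as the second positional argument, which is `params`, not headers; `requests.get(url, headers={'User-Agent': heads})` is the usual form. The pattern looks for `class="p1"` (digit one); if the page uses `class="pl"` (letter l), which is an assumption not verified here, `findall` returns an empty list. And with a single capture group `findall` returns plain strings, so `i[0]` is only the first character. The sketch below shows the guard and parsing fixes offline; `SAMPLE_HTML` is made-up markup for illustration, not the real Douban page.

```python
import re

# Hypothetical sample markup, invented for this sketch; the real page may differ.
SAMPLE_HTML = """
<table><tr><td><a>Movie A</a><span class="pl">(1000 ratings)</span></td></tr></table>
<table><tr><td><a>Movie B</a><span class="pl">(2500 ratings)</span></td></tr></table>
"""


def re_html(html):
    # Assumption: the class name is "pl" (letter l), not "p1" (digit one).
    pattern = re.compile(r'<table.*?class="pl">(.*?)</span>.*?</table>', re.S)
    for text in pattern.findall(html):
        # One capture group means findall yields strings, so use the whole
        # string; text[0] would be just its first character.
        yield {'name': text}


def main():
    for x in re_html(SAMPLE_HTML):
        print(x)


if __name__ == '__main__':  # the original had '__mian__', so main() never ran
    main()
```

Run as a script, this prints one dict per table. Swapping `SAMPLE_HTML` for the text returned by `get_page(url)` would bring it back to the original flow, provided the real markup matches the assumed pattern.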