This post was last edited by Pythonnewers on 2020-5-6 17:13
import re

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.14 Safari/537.36 Edg/83.0.478.13"
}
# Cookies are omitted for privacy; fill in your own logged-in fishc.com.cn cookies.
cookie = {}


def req(url):
    # The forum pages are decoded as GBK.
    html = requests.get(url, headers=headers, cookies=cookie).content.decode("gbk")
    # Username link in the page header; \d+ matches any user ID, not just one account.
    name = re.findall(
        r'<strong class="vwmy"><a href="https://fishc.com.cn/space-uid-\d+\.html" target="_blank" title="访问我的空间">(.*?)</a></strong>', html, re.S)
    if not name:
        print("Username not found; check that your cookies are valid")
        return
    print("Your name: " + name[0])
    # Text of the message link; it is compared against the bare label "消息" below.
    message = re.findall('<em class="prompt_news_0"></em>(.*?)</a>', html, re.S)
    if message and message[0] != "消息":
        # Extract whatever sits in the parentheses after "消息" (ASCII or full-width).
        print(re.findall(r'消息[((](.*?)[))]', message[0]))
    else:
        print("You have no recent messages")


if __name__ == "__main__":
    req("https://fishc.com.cn/forum.php")
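I'm leaving the cookie out for privacy.
It's not much use, but who knows, maybe some newcomer will read this and come to understand web scraping.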
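Since the cookie is left out, here is a rough sketch of one way to build the `cookie` dict that `requests.get(..., cookies=cookie)` expects, assuming you copy the raw `Cookie` request header from your browser's developer tools while logged in. The helper name `parse_cookie_header` and the cookie names and values below are made-up placeholders, not real fishc.com.cn cookies.

```python
def parse_cookie_header(raw):
    """Turn a raw header string like 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        # Skip empty fragments and anything that is not a key=value pair.
        if not pair or "=" not in pair:
            continue
        # Split on the first '=' only, because values may contain '=' themselves.
        key, value = pair.split("=", 1)
        cookies[key.strip()] = value.strip()
    return cookies


# Hypothetical placeholder string; paste your own Cookie header here.
raw_cookie = "example_auth=abc123; example_sid=xyz=="
cookie = parse_cookie_header(raw_cookie)
print(cookie)
```

The resulting dict can be assigned to `cookie` in the script above. Keep the real string out of anything you post publicly, since it can let someone else act as your logged-in account.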