So far I have got scraping working for a single page. The code is as follows:
- import requests
- from bs4 import BeautifulSoup
- # Send a browser-like User-Agent with the request
- headers = {
-     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
- }
- r = requests.get('https://site.ip138.com/192.168.5.6/', headers=headers)
- html = r.text
- soup = BeautifulSoup(html, 'lxml')
- # Print the text of every link inside the element with id="list"
- for ul in soup.find_all(attrs={'id': 'list'}):
-     for a in ul.find_all(name='a'):
-         print(a.string)
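Now I have an ip.txt containing 10 IP addresses. I need to fetch the query result for each of these 10 addresses and write the results to a txt file (each IP takes the place of 192.168.5.6 in the URL).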
Is this what you are after?
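- import requests
- from bs4 import BeautifulSoup
- headers = {
-     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
- }
- iplist = ['192.168.5.6', '192.168.5.5', '119.75.217.109']
- # Open the output file once, before the loop; opening it with mode='w'
- # for each IP would overwrite the results of the previous ones.
- # 'result.txt' is a placeholder name, chosen so the input ip.txt is not clobbered.
- with open('result.txt', mode='w', encoding='utf-8') as f:
-     for i in iplist:
-         url = f'https://site.ip138.com/{i}/'
-         print(url)
-         r = requests.get(url=url, headers=headers)
-         html = r.text
-         soup = BeautifulSoup(html, 'lxml')
-         for ul in soup.find_all(attrs={'id': 'list'}):
-             for a in ul.find_all(name='a'):
-                 # get_text() always returns a str; a.string can be None,
-                 # which would make f.write() raise a TypeError
-                 text = a.get_text(strip=True)
-                 print(text)
-                 f.write(text + '\n')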
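The reply hardcodes iplist, while the question asks for the addresses to come from ip.txt. Here is a rough sketch of that missing piece, assuming ip.txt holds one address per line. The helper names `load_ips` and `build_url` and the constant `BASE_URL` are made up for illustration and are not part of the original code. It uses only the standard library, so the fetching and parsing code stays as it is.

```python
import ipaddress
from pathlib import Path

# Same URL pattern as in the code above; {} is replaced by one IP.
BASE_URL = 'https://site.ip138.com/{}/'


def load_ips(path):
    """Return the valid IP addresses in a text file (assumed one per line)."""
    ips = []
    # 'utf-8-sig' also accepts files that start with a byte order mark
    for line in Path(path).read_text(encoding='utf-8-sig').splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # skip blank lines
        try:
            ipaddress.ip_address(candidate)  # raises ValueError if malformed
        except ValueError:
            print(f'skipping invalid entry: {candidate!r}')
            continue
        ips.append(candidate)
    return ips


def build_url(ip):
    """Substitute one IP into the query URL."""
    return BASE_URL.format(ip)
```

With this in place, `iplist = load_ips('ip.txt')` could replace the hardcoded list, and `build_url(i)` could replace the f-string inside the loop. Validating each line first means a stray blank line or typo in ip.txt is reported instead of being sent to the site as a request.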