I want to extract the href values from the output below. The list shown below is stored in targets. I tried for each in targets: each.get('hrref'), but it doesn't work.
[<td class="classicLook0"><a href="detail.jsp?id=339884">
A0096
</a></td>, <td class="classicLook0">问答题</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0"><a href="detail.jsp?id=339876">
A0088
</a></td>, <td class="classicLook0">问答题</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0"><a href="detail.jsp?id=339868">
A0080
</a></td>, <td class="classicLook0">问答题</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0"><a href="detail.jsp?id=339860">
A0072
</a></td>, <td class="classicLook0">问答题</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>, <td class="classicLook0">-</td>]
[<td class="classicLook1"><a href="detail.jsp?id=339883">
A0095
</a></td>, <td class="classicLook1">问答题</td>,, ]
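A likely reason the attempt above returns nothing useful: every item in targets is a <td> tag, and the href attribute belongs to the <a> nested inside it, so calling .get() on the <td> itself yields None. Here is a minimal offline sketch of that difference. The html string is a hand-copied fragment of the output above, used only for illustration; it is not the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: a small fragment copied from the printed list above
html = """
<table><tr>
<td class="classicLook0"><a href="detail.jsp?id=339884">A0096</a></td>
<td class="classicLook0">问答题</td>
<td class="classicLook0">-</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td", class_="classicLook0")

# Asking the <td> for href finds nothing: the attribute is not on the <td>
td_hrefs = [td.get("href") for td in cells]

# Asking the nested <a> works; cells without a link are skipped
links = [td.a.get("href") for td in cells if td.a is not None]

print(td_hrefs)
print(links)
```

Note that td.a is shorthand for td.find("a") and is None when the cell contains no link, which is why the check is needed before calling .get().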
import requests
import bs4

# Local proxy settings (defined here but not passed to the requests below)
proxies = {"http": "127.0.0.1:1080", "https": "127.0.0.1:1080"}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3756.400 QQBrowser/10.5.4039.400',
    'Cookie': 'JSESSIONID=B43CB1A6DA8EB51A0192BEED03A959D8.TM4; DWRSESSIONID=GdCgKTLhs$Pz*W*VaM0i2Fsyman; uudid=cms5e5b099a-6147-27a0-a5dd-4be724a0b523; SF_cookie_2=67313298; radius=180.91.162.174',
}
session = requests.Session()
res = session.get('http://wlkc.jluzh.com/meol/personal.do?menuId=0', headers=headers)
print(res)
target_url = 'http://wlkc.jluzh.com/meol/common/question/questionbank/student/list.jsp?tagbug=client&cateId=27948&perm=3840&status=0&strStyle=new03'
tar_res = session.get(target_url, headers=headers)
soup = bs4.BeautifulSoup(tar_res.text, 'html.parser')
# Collect every <td> whose class is classicLook0 through classicLook7
targets = []
for i in range(8):
    targets += soup.find_all("td", class_="classicLook" + str(i))
# The href is on the nested <a>, not on the <td>; skip cells without a link
for each in targets:
    if each.a is None:
        continue
    print(each.a.get('href'))
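As an alternative to looping over class names, the links could probably be collected in one pass with a CSS selector and turned into absolute URLs with urljoin. This is only a sketch under assumptions: extract_links and the sample string are hypothetical names made up for illustration, and BASE_URL is the list.jsp address from the code above with its query string dropped.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Assumption: the page that the relative hrefs are resolved against
BASE_URL = "http://wlkc.jluzh.com/meol/common/question/questionbank/student/list.jsp"


def extract_links(html, base_url=BASE_URL):
    """Return absolute URLs for every link inside a classicLook* cell."""
    soup = BeautifulSoup(html, "html.parser")
    # Match <a href=...> inside any <td> whose class starts with "classicLook"
    anchors = soup.select('td[class^="classicLook"] a[href]')
    return [urljoin(base_url, a["href"]) for a in anchors]


# Hypothetical sample built from the printed output, not the live page
sample = """
<table>
<tr><td class="classicLook0"><a href="detail.jsp?id=339884">A0096</a></td>
<td class="classicLook0">-</td></tr>
<tr><td class="classicLook1"><a href="detail.jsp?id=339883">A0095</a></td>
<td class="classicLook1">-</td></tr>
</table>
"""

for url in extract_links(sample):
    print(url)
```

With the real page, extract_links(tar_res.text) would be the equivalent call. The a[href] part of the selector already skips cells without a link, so no None check is needed.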