When writing a web scraper in Python, can BeautifulSoup replace regular expressions?
For example, in the code below, tags seem easier to use than regular expressions:
# Volleyball
# Data from https://en.volleyballworld.com/en/vnl/2019/women/resultsandranking/round1
from bs4 import BeautifulSoup
import requests
# import re

req = requests.get('https://en.volleyballworld.com/en/vnl/2019/women/resultsandranking/round1')
soup = BeautifulSoup(req.text, 'html.parser')
prePattern = soup.find_all('tr', 'group')
patternThrhd = 0  # pattern will be prePattern[patternThrhd:].
# Setting patternThrhd: count the leading rows whose first cell is highlighted.
# (The cell indices were lost from the posted code; index 0 is assumed throughout.)
for item in prePattern:
    tds = item.find_all('td')
    if tds and tds[0].get('class') == ['result--highlight']:
        patternThrhd += 1
    else:
        break
pattern = prePattern[patternThrhd:]  # pattern is the list of matches.
matches = {}  # The dictionary of matches. The keys are the match numbers.
# td = pattern[0].find_all('td')
# print(td[0].string)
example = pattern[0].find_all('td')
# Assumption: the original list of column indices was lost; keep every column after the first.
indices = range(1, len(example))
for item in pattern:
    tds = item.find_all('td')
    # get_text(strip=True) trims the surrounding whitespace and line breaks.
    matches[int(tds[0].get_text(strip=True))] = \
        tuple(tds[i].get_text(strip=True) for i in indices)
print(pattern)
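To compare the two approaches without hitting the network, here is a minimal sketch on a made-up HTML fragment. The table contents, `SAMPLE_HTML`, and the two helper functions are hypothetical and not taken from the volleyballworld page; the real markup may well differ.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical fragment, loosely shaped like a results table.
SAMPLE_HTML = """
<table>
  <tr class="group"><td class="result--highlight">Pool 1</td></tr>
  <tr class="group"><td>1</td><td>Team A</td><td>3-0</td></tr>
  <tr class="group"><td>2</td><td>Team B</td><td>2-3</td></tr>
</table>
"""

def rows_with_tags(html):
    """Return the cell texts of each non-header row using BeautifulSoup."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tr in soup.find_all('tr', 'group'):
        tds = tr.find_all('td')
        # Skip header rows, marked here by a highlighted first cell.
        if tds and tds[0].get('class') == ['result--highlight']:
            continue
        rows.append([td.get_text(strip=True) for td in tds])
    return rows

def rows_with_regex(html):
    """Same extraction with regular expressions only."""
    rows = []
    for tr in re.findall(r'<tr class="group">(.*?)</tr>', html, re.S):
        if 'result--highlight' in tr:
            continue
        # This pattern only matches <td> tags without attributes.
        rows.append(re.findall(r'<td>(.*?)</td>', tr, re.S))
    return rows

print(rows_with_tags(SAMPLE_HTML))
print(rows_with_regex(SAMPLE_HTML))
```

Both functions return the same rows on this fragment, but the regex version depends on the exact attribute order and spelling of each tag, so it is more likely to break if the page markup changes.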
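A cat that catches mice is a good cat. Whatever suits you best is the best choice. It depends on the situation; each tool exists for a reason. Use whichever one works well for the job.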
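The two tools can also be used together rather than one replacing the other. The sketch below assumes a made-up table (team names and scores are hypothetical): BeautifulSoup locates the cells, and a regular expression, passed through the `string` argument of `find_all`, selects and parses the score text.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical fragment; the scores and team names are made up.
html = """
<table>
  <tr><td>Team A</td><td>3-0</td></tr>
  <tr><td>Team B</td><td>2-3</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
score_re = re.compile(r'^(\d+)-(\d+)$')

# BeautifulSoup walks the tag structure; the regex filters cells by their text
# and then splits each matching score into its two numbers.
scores = []
for td in soup.find_all('td', string=score_re):
    home, away = score_re.match(td.string).groups()
    scores.append((int(home), int(away)))

print(scores)
```

Roughly speaking, tags handle the structure of the page and regular expressions handle patterns inside the text, so which one to reach for depends on which part of the job you are doing.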