Web scraping question: a single link scrapes fine, but inside a for loop the result is an empty list

wk934530 · posted on 2022-4-9 22:40:32


import requests
from lxml import etree

# NOTE: `headers` is used below but is not defined anywhere in the code as posted.

ls = ['http://lwj.sanya.gov.cn/wljsite/ydtj/list2.shtml/wljsite/ydtj/202203/b43442ca2f8c4ca5a9f8d10186b8a8f1.shtml',
      'http://lwj.sanya.gov.cn/wljsite/ydtj/list2.shtml/wljsite/ydtj/202202/d7498541d147437ea352be69f0080ded.shtml',
      'http://lwj.sanya.gov.cn/wljsite/ydtj/list2.shtml/wljsite/ydtj/202201/446aa96ff45c49bcb52a2facc02b6cd2.shtml']

def sylv(link):
    r = requests.get(link, headers=headers)
    r.encoding = 'utf-8'
    html1 = etree.HTML(r.text)
    link_xpath = html1.xpath('//*[@id="news_content"]/ucapcontent//text()')
    # Strip unwanted characters
    link_xpath = [el.replace('\r\n', '') for el in link_xpath]
    print(link_xpath)
    data_n = link_xpath
    f = r"C:\Users\13783\Desktop\test.txt"
    a = data_n
    # Append the extracted text to the output file, one item per line
    with open(f, "a") as file:
        for i in range(len(a)):
            file.write(str(a[i]) + "d" + " " + "\n")
        file.write('*' * 50)

for t in ls:
    sylv(t)
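
One thing that may be worth checking: each URL in `ls` looks like the list page address (`.../list2.shtml`) with a site-relative path appended directly to it. If the links were built by string concatenation, resolving them with `urljoin` from the standard library would give a different address. The sketch below assumes that the list page URL and the relative hrefs are as inferred from the shape of the URLs in the post; neither is confirmed by the author.

```python
from urllib.parse import urljoin

# Assumption: inferred from the URLs in the post, not confirmed by the author.
LIST_PAGE = "http://lwj.sanya.gov.cn/wljsite/ydtj/list2.shtml"
hrefs = [
    "/wljsite/ydtj/202203/b43442ca2f8c4ca5a9f8d10186b8a8f1.shtml",
    "/wljsite/ydtj/202202/d7498541d147437ea352be69f0080ded.shtml",
    "/wljsite/ydtj/202201/446aa96ff45c49bcb52a2facc02b6cd2.shtml",
]


def build_links(base, relative_hrefs):
    """Resolve each href against the page it was found on."""
    return [urljoin(base, href) for href in relative_hrefs]


links = build_links(LIST_PAGE, hrefs)
for link in links:
    print(link)
```

Because the hrefs start with `/`, `urljoin` replaces the whole path of the list page instead of appending to it, so the resulting links no longer contain `list2.shtml`.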


suchocolate · posted on 2022-4-10 16:13:04

Did you assign a value to headers? Please post the complete code.
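
For reference, a minimal sketch of what a complete, checkable version of the request step might look like. The `headers` value and the helper name `check_link` are assumptions for illustration; the post does not show the real `headers`.

```python
import requests

# Assumption: a generic browser-like User-Agent; the real value is not shown in the post.
headers = {"User-Agent": "Mozilla/5.0"}


def check_link(link, timeout=10):
    """Fetch one link and return (status code, length of the decoded body)."""
    r = requests.get(link, headers=headers, timeout=timeout)
    r.encoding = "utf-8"
    return r.status_code, len(r.text)
```

Calling `check_link(t)` for each `t` in `ls` and printing the result would show whether the request itself succeeds. A non-200 status or a very short body would suggest the problem lies in the request or the URL, not in the XPath.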

wk934530 · posted on 2022-4-10 22:47:06

suchocolate posted on 2022-4-10 16:13
Did you assign a value to headers? Please post the complete code.

Hi, yes, headers has been assigned.

suchocolate · posted on 2022-4-12 13:06:06

wk934530 posted on 2022-4-10 22:47
Hi, yes, headers has been assigned.

I don't see it.
