About a Scrapy spider

Posted on 2019-3-29 00:12:23

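I have recently been following 小甲鱼's videos on Scrapy. At the last step, the exported file contains only two rows instead of all the items. However, if I change the code to print title, link and desc instead, the output is complete.
Could anyone tell me how to fix this, or what is causing it?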

QQ20190329-001129@2x.png

import scrapy
from tutorial.items import DmozItem


class DmozSpider(scrapy.Spider):
    name = 'dmoz'
    # Scrapy's attribute name is allowed_domains (plural)
    allowed_domains = ['dmoztools.net']
    start_urls = [
        'http://www.dmoztools.net/Computers/Programming/Languages/Python/Resources/',
        'http://www.dmoztools.net/Computers/Programming/Languages/Python/Books/',
    ]

    def parse(self, response):
        # filename = response.url.split('/')[-2]
        # with open(filename, 'wb') as f:
        #     f.write(response.body)
        sel = scrapy.selector.Selector(response)  # selector
        sites = sel.xpath('//section/div/div/div/div[@class="title-and-desc"]')
        items = []
        for site in sites:
            item = DmozItem()  # DmozItem is the class imported above
            item['title'] = site.xpath('a/div/text()').extract()
            item['link'] = site.xpath('a/@href').extract()
            item['desc'] = site.xpath('div/text()').extract()
            items.append(item)
        # Return once the loop has finished, so every site on the page is included
        return items
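
One thing worth checking, offered as a guess rather than a confirmed diagnosis: where `return items` sits relative to the `for` loop. If it is indented inside the loop, `parse` returns after the first item on each page, and two start URLs would then give exactly two exported rows. A version that only prints inside the loop has no early `return`, which would explain why printing shows everything. The sketch below reproduces the effect in plain Python without Scrapy; `PAGES`, `crawl` and the three `parse_*` functions are hypothetical stand-ins, not part of the tutorial project.

```python
# Hypothetical stand-in data: two "pages", each a list of (title, link, desc) rows.
PAGES = {
    "Resources": [
        ("r1", "http://example.com/r1", "first resource"),
        ("r2", "http://example.com/r2", "second resource"),
        ("r3", "http://example.com/r3", "third resource"),
    ],
    "Books": [
        ("b1", "http://example.com/b1", "first book"),
        ("b2", "http://example.com/b2", "second book"),
    ],
}


def make_item(title, link, desc):
    """Build a plain dict in place of a Scrapy Item."""
    return {"title": title, "link": link, "desc": desc}


def parse_return_inside_loop(rows):
    items = []
    for title, link, desc in rows:
        items.append(make_item(title, link, desc))
        return items  # executes on the first iteration, ending the loop early
    return items  # only reached when the page has no rows


def parse_return_after_loop(rows):
    items = []
    for title, link, desc in rows:
        items.append(make_item(title, link, desc))
    return items  # executes once, after every row has been collected


def parse_with_yield(rows):
    # Generator form: hands over one item at a time, no list to manage.
    for title, link, desc in rows:
        yield make_item(title, link, desc)


def crawl(parse):
    """Call the given parse function on every page and gather what it produces."""
    exported = []
    for rows in PAGES.values():
        exported.extend(parse(rows))
    return exported


if __name__ == "__main__":
    print(len(crawl(parse_return_inside_loop)))  # one item per page
    print(len(crawl(parse_return_after_loop)))   # every item on every page
    print(len(crawl(parse_with_yield)))          # same items as the line above
```

In a real spider the same choice applies: either return the list after the loop, or `yield item` inside the loop, which Scrapy callbacks also accept.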

Original poster | Posted on 2019-3-29 00:14:09

If this runs without problems on anyone's computer, feel free to post your results here so we can compare.
