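临时用户3.14159 posted on 2022-6-19 13:44:46

Problem scraping a dynamic web page

I've recently been learning web scraping with Python, but when I used selenium to scrape a dynamic web page I ran into the error below (the webdriver I'm using is Edge):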

Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    dirver.get("网址")
  File "D:\python\Python\Python310-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 442, in get
    self.execute(Command.GET, {'url': url})
  File "D:\python\Python\Python310-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 430, in execute
    self.error_handler.check_response(response)
  File "D:\python\Python\Python310-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot determine loading status
from unknown error: missing or invalid columnNumber
(Session info: MicrosoftEdge=102.0.1245.44)
Stacktrace:
Backtrace:
        Microsoft::Applications::Events::EventProperties::unpack
        Microsoft::Applications::Events::ISemanticContext::SetTicket
        Microsoft::Applications::Events::ILogConfiguration::operator*
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::ILogConfiguration::operator*
        Microsoft::Applications::Events::ILogConfiguration::operator*
        Microsoft::Applications::Events::IModule::Teardown
        Microsoft::Applications::Events::ILogConfiguration::operator*
        Microsoft::Applications::Events::GUID_t::GUID_t
        Microsoft::Applications::Events::GUID_t::GUID_t
        Microsoft::Applications::Events::GUID_t::GUID_t
        Microsoft::Applications::Events::GUID_t::GUID_t
        Microsoft::Applications::Events::ILogManager::DispatchEventBroadcast
        Microsoft::Applications::Events::ILogManager::DispatchEventBroadcast
        Microsoft::Applications::Events::ILogManager::DispatchEventBroadcast
        Microsoft::Applications::Events::ILogManager::DispatchEventBroadcast
        Microsoft::Applications::Events::ILogManager::DispatchEventBroadcast
        Microsoft::Applications::Events::ISemanticContext::SetTicket
        Microsoft::Applications::Events::ISemanticContext::SetTicket
        Microsoft::Applications::Events::ISemanticContext::SetTicket
        Microsoft::Applications::Events::ISemanticContext::SetTicket
        BaseThreadInitThunk
        RtlGetAppContainerNamedObjectPath
        RtlGetAppContainerNamedObjectPath

I think this part of the error is most likely the key:
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot determine loading status
from unknown error: missing or invalid columnNumber

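However, I couldn't find anything relevant online. The similar errors I did find mostly differ in this part: missing or invalid columnNumber

I hope someone experienced can help me solve this.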

wp231957 posted on 2022-6-20 07:19:35

Not posting your code is rather odd.
Using Edge is also fairly uncommon.

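Lai013 posted on 2022-6-20 07:52:49

It may be that you made the request before the page had finished loading its elements.
Try adding a wait, or lengthening the wait time.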
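
A minimal sketch of what "lengthening the wait time" could look like. `implicitly_wait` and `set_page_load_timeout` are existing methods on a Selenium WebDriver, but the helper name `configure_waits` and the 10 and 30 second defaults are assumptions made for illustration; nothing here comes from the original poster's code, and it is not confirmed to fix the error above.

```python
def configure_waits(driver, implicit_seconds=10, page_load_seconds=30):
    """Lengthen the waits on an existing Selenium WebDriver (hypothetical helper)."""
    # Element lookups keep retrying for up to `implicit_seconds` before failing.
    driver.implicitly_wait(implicit_seconds)
    # driver.get() raises a timeout error if the page takes longer than this to load.
    driver.set_page_load_timeout(page_load_seconds)
    return driver
```

With a real session, this would be called once on the driver object, before the first `driver.get(...)`.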

suchocolate posted on 2022-6-20 15:10:16

driver.get('网址'): shouldn't '网址' ("URL") be replaced with the address you actually want to visit?

临时用户3.14159 posted on 2022-6-20 19:23:18

suchocolate posted on 2022-6-20 15:10
driver.get('网址'): shouldn't '网址' ("URL") be replaced with the address you actually want to visit?

I deliberately hid the URL here.

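临时用户3.14159 posted on 2022-6-20 19:24:28

Lai013 posted on 2022-6-20 07:52
It may be that you made the request before the page had finished loading its elements.
Try adding a wait, or lengthening the wait time.

Doesn't the get function of selenium's driver wait until the page has fully loaded before execution continues?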

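suchocolate posted on 2022-6-20 22:13:29

临时用户3.14159 posted on 2022-6-20 19:24
Doesn't the get function of selenium's driver wait until the page has fully loaded before execution continues?

Not quite. To be safe, you still need to use selenium's EC (expected_conditions).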