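柿子饼同学 posted on 2021-2-28 15:36:57

Problem scraping videos from Bilibili

I want to scrape videos from Bilibili now.
Target URL: (link)
Then I scouted the page.
But I don't know how to scrape the video itself.
The circled URL won't open either, so I can't tell where the video is.
Please walk me through the steps. Thanks!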

qiuyouzhi posted on 2021-2-28 15:57:40

https://fishc.com.cn/thread-190000-1-1.html
Read the source code there carefully. Bilibili has an API.

柿子饼同学 posted on 2021-2-28 16:15:17

qiuyouzhi posted on 2021-2-28 15:57
https://fishc.com.cn/thread-190000-1-1.html
Read the source code there carefully. Bilibili has an API.

What is an API?

柿子饼同学 posted on 2021-2-28 16:18:10

柿子饼同学 posted on 2021-2-28 16:15
What is an API?

Also, where do I find the BV number?

柿子饼同学 posted on 2021-2-28 16:19:06

柿子饼同学 posted on 2021-2-28 16:18
Also, where do I find the BV number?

Never mind, I've figured out the BV number.
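
For anyone with the same question: the BV number is the `BV...` segment in the URL of a video page. A minimal sketch of pulling it out with a regular expression, assuming a BV ID is `BV` followed by ten alphanumeric characters (true of the IDs that appear in this thread). `extract_bvid` is a hypothetical helper name and the sample URL is only illustrative:

```python
import re
from typing import Optional

# Assumption: a BV ID is "BV" followed by 10 alphanumeric characters.
BVID_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(url: str) -> Optional[str]:
    """Return the first BV ID found in a URL, or None if there is none."""
    match = BVID_PATTERN.search(url)
    return match.group(0) if match else None


print(extract_bvid("https://www.bilibili.com/video/BV1QW411N762?p=1"))
```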

Daniel_Zhang posted on 2021-2-28 16:23:07

柿子饼同学 posted on 2021-2-28 16:15
What is an API?

Just search Baidu for it. There's plenty on there.

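柿子饼同学 posted on 2021-2-28 16:27:01

Daniel_Zhang posted on 2021-2-28 16:23
Just search Baidu for it. There's plenty on there.

Got it. But why convert the BV number to an AV number? Weren't AV numbers retired long ago?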

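Daniel_Zhang posted on 2021-2-28 16:28:01

柿子饼同学 posted on 2021-2-28 16:27
Got it. But why convert the BV number to an AV number? Weren't AV numbers retired long ago?

My guess is that it's an anti-scraping measure?

There also seems to be a cid.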

柿子饼同学 posted on 2021-2-28 16:36:43

Yes, I've found the cipher table too.
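
The "cipher table" presumably means the lookup table in the community reverse-engineered BV/AV conversion. A minimal sketch of that conversion follows. The table, the position order, and the two constants come from that community write-up, not from any official Bilibili documentation, so treat them as assumptions that may not hold for newer IDs. `bv_to_av` and `av_to_bv` are illustrative names:

```python
# Assumptions (community reverse-engineered values, not an official spec):
TABLE = "fZodR9XQDSUm21yCkr6zBqiveYah8bt4xsWpHnJE7jL5VG3guMTKNPAwcF"  # base-58 alphabet
INDEX = {ch: i for i, ch in enumerate(TABLE)}
POSITIONS = [11, 10, 3, 8, 4, 6]  # BV string positions holding digits, least significant first
XOR_KEY = 177451812
OFFSET = 8728348608


def bv_to_av(bvid: str) -> int:
    """Decode a 12-character BV ID into its numeric AV number."""
    value = sum(INDEX[bvid[pos]] * 58 ** i for i, pos in enumerate(POSITIONS))
    return (value - OFFSET) ^ XOR_KEY


def av_to_bv(aid: int) -> str:
    """Encode a numeric AV number as a BV ID (inverse of bv_to_av)."""
    value = (aid ^ XOR_KEY) + OFFSET
    chars = list("BV1  4 1 7  ")  # fixed characters; blanks are filled below
    for i, pos in enumerate(POSITIONS):
        chars[pos] = TABLE[value // 58 ** i % 58]
    return "".join(chars)


print(bv_to_av("BV17x411w7KC"))  # 170001
```

Under these assumptions, either form of the ID can be turned into the numeric AV number before calling an interface that expects it.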

柿子饼同学 posted on 2021-2-28 16:40:01

Daniel_Zhang posted on 2021-2-28 16:28
My guess is that it's an anti-scraping measure?

There also seems to be a cid.

Now I understand what your code does.
Thanks

柿子饼同学 posted on 2021-2-28 16:59:21

Daniel_Zhang posted on 2021-2-28 16:28
My guess is that it's an anti-scraping measure?

There also seems to be a cid.

Something went wrong. Is the video too large?
=================== RESTART: C:/Users/86177/Desktop/Billi.py ===================
请输入视频BV或AV号:BV1QW411N762
准备下载【Web前端开发】《零基础入门学习Web开发》(HTML5&CSS3).flv...
开始下载...
Traceback (most recent call last):
File "C:/Users/86177/Desktop/Billi.py", line 77, in <module>
    DBV.main()
File "C:/Users/86177/Desktop/Billi.py", line 72, in main
    self.download_video(aid, cid, info['title'] + ".flv")
File "C:/Users/86177/Desktop/Billi.py", line 60, in download_video
    video = self.open_url(url)
File "C:/Users/86177/Desktop/Billi.py", line 9, in open_url
    res.encoding = res.apparent_encoding
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\requests\models.py", line 730, in apparent_encoding
    return chardet.detect(self.content)['encoding']
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\__init__.py", line 41, in detect
    detector.feed(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\universaldetector.py", line 211, in feed
    if prober.feed(byte_str) == ProbingState.FOUND_IT:
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\charsetgroupprober.py", line 71, in feed
    state = prober.feed(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\hebrewprober.py", line 227, in feed
    byte_str = self.filter_high_byte_only(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\charsetprober.py", line 63, in filter_high_byte_only
    buf = re.sub(b'([\x00-\x7F])+', b' ', buf)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
MemoryError
>>>

柿子饼同学 posted on 2021-2-28 17:02:01

Daniel_Zhang posted on 2021-2-28 16:28
My guess is that it's an anti-scraping measure?

There also seems to be a cid.

It failed again. Is the file too large?
=================== RESTART: C:/Users/86177/Desktop/Billi.py ===================
请输入视频BV或AV号:BV1yA411H7w8
准备下载年轻人该不该留在京沪广深?网友:看完后恍然大悟!.flv...
开始下载...
Traceback (most recent call last):
File "C:/Users/86177/Desktop/Billi.py", line 77, in <module>
    DBV.main()
File "C:/Users/86177/Desktop/Billi.py", line 72, in main
    self.download_video(aid, cid, info['title'] + ".flv")
File "C:/Users/86177/Desktop/Billi.py", line 60, in download_video
    video = self.open_url(url)
File "C:/Users/86177/Desktop/Billi.py", line 9, in open_url
    res.encoding = res.apparent_encoding
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\requests\models.py", line 730, in apparent_encoding
    return chardet.detect(self.content)['encoding']
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\__init__.py", line 41, in detect
    detector.feed(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\universaldetector.py", line 211, in feed
    if prober.feed(byte_str) == ProbingState.FOUND_IT:
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\charsetgroupprober.py", line 71, in feed
    state = prober.feed(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\hebrewprober.py", line 227, in feed
    byte_str = self.filter_high_byte_only(byte_str)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\site-packages\chardet\charsetprober.py", line 63, in filter_high_byte_only
    buf = re.sub(b'([\x00-\x7F])+', b' ', buf)
File "C:\Users\86177\AppData\Local\Programs\Python\Python39-32\lib\re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
MemoryError
>>>
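
A note on both tracebacks: the failure is not in the download itself. `res.apparent_encoding` makes `requests` run `chardet` over the entire response body, which only makes sense for text. On a video-sized payload in 32-bit Python (`Python39-32`) that scan can run out of memory. One possible workaround, sketched below, is to skip encoding detection for binary responses and stream the body to disk in chunks. `write_chunks` and `download_video` are hypothetical names, not functions from `Billi.py`, and the chunk size and timeout are arbitrary choices:

```python
from typing import Dict, Iterable, Optional


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to `path` one at a time; return total bytes written."""
    total = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                fh.write(chunk)
                total += len(chunk)
    return total


def download_video(url: str, path: str,
                   headers: Optional[Dict[str, str]] = None) -> int:
    """Stream a binary response to disk without touching res.encoding."""
    import requests  # third-party; imported here so write_chunks stays stdlib-only

    # stream=True defers reading the body, so it is never held in memory whole.
    with requests.get(url, headers=headers, stream=True, timeout=30) as res:
        res.raise_for_status()
        return write_chunks(res.iter_content(chunk_size=1 << 20), path)
```

With something like this, only one chunk (1 MiB here) is in memory at a time. The text-oriented `open_url` helper can still be used for the JSON API responses, where setting the encoding is reasonable.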

qiuyouzhi posted on 2021-2-28 17:08:58

柿子饼同学 posted on 2021-2-28 16:27
Got it. But why convert the BV number to an AV number? Weren't AV numbers retired long ago?

The API requires the AV number.