FishC Forum (鱼C论坛)

Views: 3143 | Replies: 5

[Solved] [Crawler] Scraping a page that paginates with __doPostBack

Posted 2018-2-17 02:17:52

Last edited by 脑子 on 2018-2-17 02:17

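I am trying to scrape a news page: url=http://www.science-weekly.cn/MoreList.aspx?id=1

It paginates with __doPostBack:

I searched online for a solution and found this article: "[python] Using urllib+urllib2 to solve crawler pagination problems"
Link >>> http://www.360doc.com/content/17/1005/13/43284313_692385292.shtml

Following its method I found the keys that are sent in the POST, but the value of one of them looks very strange: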

[Attachment: 01.png]
  1. <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTUzMTEwMzM5Mg9kFgICAw9kFgICBQ9kFgRmDzwrAAsBAA8WCh4MRGF0YUtleUZpZWxkBQJpZB4IRGF0YUtleXMWHgK5FQKUFQLWFAKxFALcEgK5EgKSEgLrEQLqEQKkEQL5EALTEAKZDwKFDwLyDgLWDgK5DgKgDgKLDgLXDQK3DQKnDQKIDQLaDAKpDAKiDALSCwLCCwK4CwKnCx4LXyFJdGVtQ291bnQCHh4JUGFnZUNvdW50AgEeFV8hRGF0YVNvdXJjZUl0ZW1Db3VudAIeZBYCZg9kFjwCAQ9kFgJmD2QWAmYPFQMcL3NraHRtbG5ld3MvMjAxNS8xLzI3NDUuaHRtbC3lrabmnK/mnJ/liIrku5jotLnmqKHlvI/pnIDmioDmnK/luILlnLrlgJLpgLwKMjAxNS0wMS0wNWQCAg9kFgJmD2QWAmYPFQMdL3NraHRtbG5ld3MvMjAxNC8xMi8yNzA4Lmh0bWwk56CU56m25omA5YiG57G75pS56Z2p5Li65Yib5paw562R6LevCjIwMTQtMTItMDNkAgMPZBYCZg9kFgJmDxUDHS9za2h0bWxuZXdzLzIwMTQvMTAvMjY0Ni5odG1sHuenkeWtpueahOagh+WHhuS4jeiDveS4luS/l+WMlgoyMDE0LTEwLTA4ZAIED2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDE0LzkvMjYwOS5odG1sJ+i9rOWfuuWboOKAnOS4iei+k+KAneagvOWxgOS6n+W+heegtOinowoyMDE0LTA5LTAxZAIFD2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDE0LzQvMjM5Ni5odG1sJOaJk+mAmuenkeaKgOS9k+WItuKAnOS7u+edo+S6jOiEieKAnQoyMDE0LTA0LTAzZAIGD2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDE0LzMvMjM2MS5odG1sHumbvumcvumUgeWfju+8jOivpeWQrOiwgeeahO+8nwoyMDE0LTAzLTExZAIHD2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDE0LzIvMjMyMi5odG1sHueUqOWIq+S6uueahOecvOedm+WuoeinhuiHquW3sQoyMDE0LTAyLTEyZAIID2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDE0LzEvMjI4My5odG1sFeenkeWtpueahOWbveWutuWxnuaApwoyMDE0LTAxLTA2ZAIJD2QWAmYPZBYCZg8VAx0vc2todG1sbmV3cy8yMDEzLzExLzIyODIuaHRtbCfni6znibnnmoTngbXprYLmiY3og73pgKDlsLHkuIDmtYHlpKflraYKMjAxMy0xMS0yN2QCCg9kFgJmD2QWAmYPFQMcL3NraHRtbG5ld3MvMjAxMy85LzIyMTIuaHRtbDDlj5bmtojmlofnkIbliIbnp5HmmK/mj5DljYfnp5HlrabntKDlhbvnmoTlhbPplK4KMjAxMy0wOS0yOWQCCw9kFgJmD2QWAmYPFQMcL3NraHRtbG5ld3MvMjAxMy85LzIxNjkuaHRtbCHku5bku6zkuLrku4DkuYjkuI3nm7jkv6Hnp5HlrabvvJ8KMjAxMy0wOS0xMmQCDA9kFgJmD2QWAmYPFQMcL3NraHRtbG5ld3MvMjAxMy84LzIxMzEuaHRtbCLkvZXku6XnoLTop6PigJzmioDmnK/mgZDmg6fnl4figJ0gCjIwMTMtMDgtMDFkAg0PZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTMvMi8xOTQ1Lmh0bWwV6L+O5paw5LiO56eR5a2m55uY54K5CjIwMTMtMDItMDZkAg4PZBYCZg9kFgJmDxUDHS9za2h0bWxuZXdzLzIwMTIvMTIvMTk
yNS5odG1sG+KAnOWNg+S6uuKAneeahOWOhuWPsuWdkOaghwoyMDEyLTEyLTI1ZAIPD2QWAmYPZBYCZg8VAx0vc2todG1sbmV3cy8yMDEyLzExLzE5MDYuaHRtbB7pobXlsqnmsJTpnanlkb3nmoTnp5HlrabpgLvovpEKMjAxMi0xMS0xNmQCEA9kFgJmD2QWAmYPFQMdL3NraHRtbG5ld3MvMjAxMi8xMC8xODc4Lmh0bWwb6LWw5ZCR5rex5rW36aG756eR5oqA57uZ5YqbCjIwMTItMTAtMTVkAhEPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvOS8xODQ5Lmh0bWwb4oCc5Zu956eR5aSn4oCd55qE5paw5L2/5ZG9CjIwMTItMDktMjRkAhIPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvOC8xODI0Lmh0bWwn56eR5a2m5piv6Leo6LaK5YiG5q2n55qE5pyA5aSn5YWs57qm5pWwCjIwMTItMDgtMTVkAhMPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvNy8xODAzLmh0bWwb6LWw5ZCR5byA5pS+5LiO5Y2P5ZCM5Yib5pawCjIwMTItMDctMTNkAhQPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvNS8xNzUxLmh0bWwS6YeN5aGR56eR5oqA5Lym55CGCjIwMTItMDUtMTVkAhUPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvNC8xNzE5Lmh0bWwY54Of6I2J44CB56eR5a2m5LiO5pS/5rK7CjIwMTItMDQtMTdkAhYPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvMy8xNzAzLmh0bWwV5LiN6Ieq55Sx77yM5peg5Yib5pawCjIwMTItMDMtMTlkAhcPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvMi8xNjcyLmh0bWwP56eR5a2m55qE5Lu35YC8CjIwMTItMDItMTRkAhgPZBYCZg9kFgJmDxUDHC9za2h0bWxuZXdzLzIwMTIvMS8xNjI2Lmh0bWwb6YeN6L+U56eR5a2m5Ye65Y+R55qE5Zyw5pa5CjIwMTItMDEtMTVkAhkPZBYCZg9kFgJmDxUDHS9za2h0bWxuZXdzLzIwMTEvMTIvMTU3Ny5odG1sHueOr+Wig+mihuWvvOWKm+S4juekvuS8muWPkeWxlQoyMDExLTEyLTA2ZAIaD2QWAmYPZBYCZg8VAx0vc2todG1sbmV3cy8yMDExLzExLzE1NzAuaHRtbCHorqnkv6Hmga/lhazlvIDmuKDpgZPmm7TliqDpgJrnlYUKMjAxMS0xMS0xMGQCGw9kFgJmD2QWAmYPFQMdL3NraHRtbG5ld3MvMjAxMS8xMC8xNDkwLmh0bWwb6LCB5p2l55uR566h5a2m5pyv5LiN56uv77yfCjIwMTEtMTAtMTBkAhwPZBYCZg9kFgJmDxUDHS9za2h0bWxuZXdzLzIwMTEvMTAvMTQ3NC5odG1sKuS6jOWFg+e7j+a1juWvvOiHtOKAnOWvkumXqOmavuWHuui0teWtkOKAnQoyMDExLTEwLTEwZAIdD2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDExLzgvMTQ2NC5odG1sG+WKqOi9pui/veWwvuS4juWPkeWxlemAn+W6pgoyMDExLTA4LTA4ZAIeD2QWAmYPZBYCZg8VAxwvc2todG1sbmV3cy8yMDExLzcvMTQ0Ny5odG1sJOenkeeglOivmuS/oeS9k+ezu+W/hemhu+eLrOeri+e7n+S4gAoyMDExLTA3LTA4ZAIBDw8WBh4OQ3VzdG9tSW5mb1RleHQFX+iusOW9leaAu+aVsO+8mjxiPjExNjwvYj4g5oC76aG15pWw77yaPGI+NDwvYj4g5b2T5YmN5Li656ysPGZ
vbnQgY29sb3I9InJlZCI+PGI+MjwvYj48L2ZvbnQ+6aG1HhBDdXJyZW50UGFnZUluZGV4AgIeC1JlY29yZGNvdW50AnRkZGSqrrri7McHqkS66vhhSAuchQ9hsw==">

???
I am completely lost here and would really appreciate a quick answer.
What should I do with this?
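
A note on the value itself: it looks like a standard ASP.NET `__VIEWSTATE`, which is normally Base64-encoded (and usually signed) serialized page state rather than something the client has to generate. Below is a minimal sketch for peeking inside it, assuming the value is not additionally encrypted. `peek_viewstate` is a hypothetical helper name, not part of any library.

```python
import base64
import re


def peek_viewstate(value, min_len=6):
    """Base64-decode a __VIEWSTATE value and return its printable ASCII runs."""
    # Remove line breaks or spaces picked up when copying the value.
    value = "".join(value.split())
    # b64decode needs a length that is a multiple of 4, so restore lost padding.
    padded = value + "=" * (-len(value) % 4)
    raw = base64.b64decode(padded)
    # Keep only runs of printable ASCII at least min_len bytes long.
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, raw)]
```

If that assumption holds for the value posted above, the output should include article paths and dates from the news list. That would suggest the field does not need to be cracked at all and can be sent back unchanged with the next POST.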

Posted 2018-2-17 02:27:24
This has probably been encrypted.

Original poster | Posted 2018-2-17 02:32:34
°蓝鲤歌蓝 wrote on 2018-2-17 02:27:
This has probably been encrypted.

If it is encrypted, what should I do about it?

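Posted 2018-2-17 02:46:12
脑子 wrote on 2018-2-17 02:32:
If it is encrypted, what should I do about it?

Either crack the encryption, or use selenium to imitate a browser and scrape the pages that way.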

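Posted 2018-2-17 02:46:46
脑子 wrote on 2018-2-17 02:32:
If it is encrypted, what should I do about it?

Either crack the encryption, or scrape the pages by imitating a browser through selenium.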
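
The selenium route could look roughly like the sketch below. It assumes a local Chrome install with a matching driver, that the pager triggers a full page reload (not a partial AJAX update), and that the pager control is called something like `AspNetPager1` and takes the page number as its argument. None of those details are confirmed for this site, and both function names are hypothetical.

```python
import json


def postback_script(event_target, event_argument=""):
    """Build the JavaScript call that a pager link would execute."""
    # json.dumps yields properly quoted and escaped JS string literals.
    return "__doPostBack(%s, %s);" % (
        json.dumps(event_target),
        json.dumps(event_argument),
    )


def crawl_with_selenium(url, event_target, last_page):
    """Return the rendered HTML of pages 1..last_page as a list of strings."""
    # Imported here so postback_script works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    pages = []
    try:
        driver.get(url)
        pages.append(driver.page_source)
        for page in range(2, last_page + 1):
            old_root = driver.find_element(By.TAG_NAME, "html")
            driver.execute_script(postback_script(event_target, str(page)))
            # Assumes a full postback: the old <html> element goes stale on reload.
            WebDriverWait(driver, 10).until(EC.staleness_of(old_root))
            pages.append(driver.page_source)
    finally:
        driver.quit()
    return pages
```

The real target name and argument would have to be read from the `href="javascript:__doPostBack(...)"` attribute of the page's own pager links.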

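Posted 2018-2-19 13:19:01 From FishC Mobile | Marked as best answer
You could look through the site's JS files. The conversion is probably done somewhere in there, and once you find how the value is transformed, getting past it should be simple.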
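
For reference, on ASP.NET WebForms pages `__doPostBack(eventTarget, eventArgument)` normally just writes its two arguments into the hidden `__EVENTTARGET` and `__EVENTARGUMENT` fields and submits the form, so there may be no conversion to reverse. The sketch below replays that POST using only the standard library. The control name `AspNetPager1`, the use of the page number as the event argument, and the UTF-8 decoding are all assumptions, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(html, event_target, event_argument):
    """Copy the page's hidden fields (e.g. __VIEWSTATE) and set the event fields."""
    parser = HiddenInputParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


def fetch_page(url, html, event_target, page_number):
    """POST the hidden fields from the current page back to request another page."""
    payload = build_postback_payload(html, event_target, str(page_number))
    data = urlencode(payload).encode("utf-8")
    with urlopen(Request(url, data=data), timeout=30) as resp:
        # Assumed charset; adjust if the site declares a different one.
        return resp.read().decode("utf-8", errors="replace")
```

Usage would be along the lines of `fetch_page(url, first_page_html, "AspNetPager1", 2)`, passing each response back in as `html` for the next request, since the `__VIEWSTATE` value generally changes from page to page.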