FishC Forum

Views: 5276 | Replies: 29

[Solved] Crawler returns status code 521

Posted on 2020-11-18 22:21:19

This problem has had me stuck for a week, friends. Please help!
The page is https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm
So, as usual, I reached for requests:
import requests

url = "https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm"
res = requests.get(url)
print(res.status_code)  # this returns 521

print(res.text)  # prints the following:
<script>document.cookie=('_')+('_')+('j')+('s')+('l')+('_')+('c')+('l')+('e')+('a')+('r')+('a')+('n')+('c')+('e')+('_')+('s')+('=')+(-~[]+'')+(3+3+'')+(~~[]+'')+((1+[0])/[2]+'')+(3+4+'')+(~~{}+'')+(7+'')+(2+4+'')+(1+4+'')+(([2]+0>>2)+'')+('.')+(2+3+'')+(([2]+0>>2)+'')+(3+3+'')+('|')+('-')+(-~false+'')+('|')+('u')+('g')+('D')+('L')+('x')+('A')+('W')+('M')+('l')+('h')+('q')+((1<<3)+'')+('Z')+('g')+('s')+('%')+(2+'')+('B')+('D')+('f')+(-~[7]+'')+('r')+(-~1+'')+('J')+('z')+('R')+('f')+('C')+('s')+('%')+((2^1)+'')+('D')+(';')+('m')+('a')+('x')+('-')+('a')+('g')+('e')+('=')+(3+'')+(-~[5]+'')+(~~[]+'')+((+false)+'')+(';')+('p')+('a')+('t')+('h')+('=')+('/');location.href=location.pathname+location.search</script>

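Every result I found online shows a response containing var and function, so what is this thing I am getting back?
What am I supposed to do with it? (I also tried selenium and it did not work. Frustrating, I could cry.)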
Best answer
2020-11-19 17:03:27
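I had some spare time, so I analyzed this. The cookie generation is not hard. Judging from the browser, three requests are made in total: the first two only serve to generate the cookie, and only the last one returns the real response:
1. The first response is a piece of JS code; once executed, it adds a cookie to the browser: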
<script>document.cookie=('_')+('_')+('j')+('s')+('l')+('_')+('c')+('l')+('e')+('a')+('r')+('a')+('n')+('c')+('e')+('_')+('s')+('=')+(-~false+'')+([2]*(3)+'')+(~~''+'')+(([2]+0>>2)+'')+(-~[6]+'')+((1+[2])/[2]+'')+(([2]+0>>2)+'')+(1+1+'')+((2^1)+'')+(-~[7]+'')+('.')+([2]*(3)+'')+(-~0+'')+(1+3+'')+('|')+('-')+((+true)+'')+('|')+('w')+('Q')+('z')+('u')+(-~false+'')+('M')+('i')+('b')+('l')+('V')+('B')+(4+5+'')+('e')+('K')+('b')+('%')+(1+1+'')+('B')+('V')+('o')+('J')+('B')+('y')+('A')+('Q')+('A')+((2)*[2]+'')+('h')+((1<<2)+'')+('%')+((1|2)+'')+('D')+(';')+('m')+('a')+('x')+('-')+('a')+('g')+('e')+('=')+(-~[2]+'')+(-~[5]+'')+(~~[]+'')+(~~{}+'')+(';')+('p')+('a')+('t')+('h')+('=')+('/');location.href=location.pathname+location.search</script>
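Analysis and implementation: extract the JS code with a regex, then run it with the execjs module to get a cookie named __jsl_clearance_s (this is not the final cookie value). Also grab the set-cookie header of this response, to be sent along with the next request.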
# Get the cookie parameter jsluid
jsluid = response.headers.get('set-cookie').split(';')[0]
# Extract the JS code
js_clearance = re.findall('cookie=(.*?);location.href=', response.text)[0]
# Run it to get the cookie parameter js_clearance
result = execjs.eval(js_clearance).split(';')[0]
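The parsing part of step 1 can be tried in isolation, without a network connection or a JS engine. The sketch below is only an illustration: the sample header and body strings are made up, and the helper names are hypothetical. Only the regex follows the snippet above. It shows which part of Set-Cookie is kept and which substring would be handed to execjs.

```python
import re

# Hypothetical sample data shaped like the 521 response described above.
SAMPLE_SET_COOKIE = "__jsluid_s=0123456789abcdef; max-age=31536000; path=/; HttpOnly"
SAMPLE_BODY = (
    "<script>document.cookie=('_')+('_')+('j')+('s')+('l')+(-~[]+'')"
    ";location.href=location.pathname+location.search</script>"
)


def first_cookie_pair(set_cookie_header):
    """Return only the leading name=value pair of a Set-Cookie header."""
    return set_cookie_header.split(";")[0].strip()


def extract_cookie_expression(body):
    """Return the JS expression assigned to document.cookie, or None."""
    match = re.search(r"cookie=(.*?);location\.href=", body)
    return match.group(1) if match else None


jsluid = first_cookie_pair(SAMPLE_SET_COOKIE)
expression = extract_cookie_expression(SAMPLE_BODY)
print(jsluid)
print(expression)
```

Returning None instead of indexing `[0]` makes it easier to notice when the site has already served the real page and no challenge script is present.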
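2. The second request carries the two cookie parameters obtained from the previous request, and its response is a second piece of JS code. This code is obfuscated; after running it through a deobfuscation tool it looks like this: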
function hash(_0x1b66b8) {
  function _0x35c6e5(_0x268dd8, _0xea5bd4) {
    return _0x268dd8 << _0xea5bd4 | _0x268dd8 >>> 32 - _0xea5bd4;
  }

  function _0x1eaf4b(_0x31d866, _0x14e06e) {
    var _0x157f3b, _0x51ff9a, _0x2bf573, _0x434e16, _0x3f57f0;

    _0x2bf573 = _0x31d866 & 2147483648;
    _0x434e16 = _0x14e06e & 2147483648;
    _0x157f3b = _0x31d866 & 1073741824;
    _0x51ff9a = _0x14e06e & 1073741824;
    _0x3f57f0 = (_0x31d866 & 1073741823) + (_0x14e06e & 1073741823);

    if (_0x157f3b & _0x51ff9a) {
      return _0x3f57f0 ^ 2147483648 ^ _0x2bf573 ^ _0x434e16;
    }

    if (_0x157f3b | _0x51ff9a) {
      if (_0x3f57f0 & 1073741824) {
        return _0x3f57f0 ^ 3221225472 ^ _0x2bf573 ^ _0x434e16;
      } else {
        return _0x3f57f0 ^ 1073741824 ^ _0x2bf573 ^ _0x434e16;
      }
    } else {
      return _0x3f57f0 ^ _0x2bf573 ^ _0x434e16;
    }
  }

  function _0x296d1d(_0x3ec120, _0x19f2dd, _0x5c9060) {
    return _0x3ec120 & _0x19f2dd | ~_0x3ec120 & _0x5c9060;
  }

  function _0x2e22ab(_0x1b4bee, _0x5b3ded, _0x1f786e) {
    return _0x1b4bee & _0x1f786e | _0x5b3ded & ~_0x1f786e;
  }

  function _0x9c1e12(_0x583030, _0x2fb4b0, _0x1e223e) {
    return _0x583030 ^ _0x2fb4b0 ^ _0x1e223e;
  }

  function _0x21943e(_0x507d21, _0x593ceb, _0x12d837) {
    return _0x593ceb ^ (_0x507d21 | ~_0x12d837);
  }

  function _0x30c4a8(_0x11c9c5, _0x2d92d7, _0x5443b6, _0xf48f8, _0x224d79, _0x640128, _0x4788bf) {
    _0x11c9c5 = _0x1eaf4b(_0x11c9c5, _0x1eaf4b(_0x1eaf4b(_0x296d1d(_0x2d92d7, _0x5443b6, _0xf48f8), _0x224d79), _0x4788bf));
    return _0x1eaf4b(_0x35c6e5(_0x11c9c5, _0x640128), _0x2d92d7);
  }

  function _0x2145f8(_0x53d7e0, _0xf63c6, _0x1eddd0, _0x5af86a, _0x4e89ac, _0x42dfbd, _0x4e866b) {
    _0x53d7e0 = _0x1eaf4b(_0x53d7e0, _0x1eaf4b(_0x1eaf4b(_0x2e22ab(_0xf63c6, _0x1eddd0, _0x5af86a), _0x4e89ac), _0x4e866b));
    return _0x1eaf4b(_0x35c6e5(_0x53d7e0, _0x42dfbd), _0xf63c6);
  }

  function _0x311b76(_0x39b6f5, _0x5a7109, _0x3a29c6, _0x4fb375, _0xcadb59, _0x508c0e, _0x234182) {
    _0x39b6f5 = _0x1eaf4b(_0x39b6f5, _0x1eaf4b(_0x1eaf4b(_0x9c1e12(_0x5a7109, _0x3a29c6, _0x4fb375), _0xcadb59), _0x234182));
    return _0x1eaf4b(_0x35c6e5(_0x39b6f5, _0x508c0e), _0x5a7109);
  }

  function _0x361b6d(_0x3d0c62, _0x300099, _0x537e35, _0x6f09e1, _0x45e6a4, _0x1d7856, _0x2506bc) {
    _0x3d0c62 = _0x1eaf4b(_0x3d0c62, _0x1eaf4b(_0x1eaf4b(_0x21943e(_0x300099, _0x537e35, _0x6f09e1), _0x45e6a4), _0x2506bc));
    return _0x1eaf4b(_0x35c6e5(_0x3d0c62, _0x1d7856), _0x300099);
  }

  function _0x57b771(_0x3c91e6) {
    var _0x10f282;

    var _0xc362bc = _0x3c91e6["length"];

    var _0x41aff5 = _0xc362bc + 8;

    var _0x24fc0a = (_0x41aff5 - _0x41aff5 % 64) / 64;

    var _0x1c8987 = (_0x24fc0a + 1) * 16;

    var _0x281eac = Array(_0x1c8987 - 1);

    var _0x11a5cc = 0;
    var _0x3f48ef = 0;

    while (_0x3f48ef < _0xc362bc) {
      _0x10f282 = (_0x3f48ef - _0x3f48ef % 4) / 4;
      _0x11a5cc = _0x3f48ef % 4 * 8;
      _0x281eac[_0x10f282] = _0x281eac[_0x10f282] | _0x3c91e6["charCodeAt"](_0x3f48ef) << _0x11a5cc;
      _0x3f48ef++;
    }

    _0x10f282 = (_0x3f48ef - _0x3f48ef % 4) / 4;
    _0x11a5cc = _0x3f48ef % 4 * 8;
    _0x281eac[_0x10f282] = _0x281eac[_0x10f282] | 128 << _0x11a5cc;
    _0x281eac[_0x1c8987 - 2] = _0xc362bc << 3;
    _0x281eac[_0x1c8987 - 1] = _0xc362bc >>> 29;
    return _0x281eac;
  }

  function _0x1b0e3e(_0x1bc183) {
    var _0x2d342c = "",
        _0x486522 = "",
        _0x45875a,
        _0x2a3b5e;

    for (_0x2a3b5e = 0; _0x2a3b5e <= 3; _0x2a3b5e++) {
      _0x45875a = _0x1bc183 >>> _0x2a3b5e * 8 & 255;
      _0x486522 = "0" + _0x45875a["toString"](16);
      _0x2d342c = _0x2d342c + _0x486522["substr"](_0x486522["length"] - 2, 2);
    }

    return _0x2d342c;
  }

  var _0x198c42 = Array();

  var _0x556dd6, _0x3e947b, _0x217e9f, _0x8545c6, _0x3ed023, _0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7;

  var _0x5153e3 = 7,
      _0xa71763 = 12,
      _0x509ea6 = 17,
      _0x4288bc = 22;
  var _0x7ad2f4 = 5,
      _0x45d017 = 9,
      _0x41614c = 14,
      _0x354464 = 20;
  var _0x307adc = 4,
      _0x1cc902 = 11,
      _0x5bb242 = 16,
      _0x4ed1a4 = 23;
  var _0x418318 = 6,
      _0xb85eab = 10,
      _0x2a7231 = 15,
      _0x5cca29 = 21;
  _0x198c42 = _0x57b771(_0x1b66b8);
  _0x244b3a = 1732584193;
  _0x47e9c7 = 4023233417;
  _0x4f689f = 2562383102;
  _0x27bcf7 = 271733878;

  for (_0x556dd6 = 0; _0x556dd6 < _0x198c42["length"]; _0x556dd6 += 16) {
    _0x3e947b = _0x244b3a;
    _0x217e9f = _0x47e9c7;
    _0x8545c6 = _0x4f689f;
    _0x3ed023 = _0x27bcf7;
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 0], _0x5153e3, 3614090360);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 1], _0xa71763, 3905402710);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 2], _0x509ea6, 606105819);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 3], _0x4288bc, 3250441966);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 4], _0x5153e3, 4118548399);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 5], _0xa71763, 1200080426);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 6], _0x509ea6, 2821735955);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 7], _0x4288bc, 4249261313);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 8], _0x5153e3, 1770035416);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 9], _0xa71763, 2336552879);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 10], _0x509ea6, 4294925233);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 11], _0x4288bc, 2304563134);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 12], _0x5153e3, 1804603682);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 13], _0xa71763, 4254626195);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 14], _0x509ea6, 2792965006);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 15], _0x4288bc, 1236535329);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 1], _0x7ad2f4, 4129170786);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 6], _0x45d017, 3225465664);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 11], _0x41614c, 643717713);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 0], _0x354464, 3921069994);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 5], _0x7ad2f4, 3593408605);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 10], _0x45d017, 38016083);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 15], _0x41614c, 3634488961);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 4], _0x354464, 3889429448);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 9], _0x7ad2f4, 568446438);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 14], _0x45d017, 3275163606);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 3], _0x41614c, 4107603335);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 8], _0x354464, 1163531501);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 13], _0x7ad2f4, 2850285829);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 2], _0x45d017, 4243563512);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 7], _0x41614c, 1735328473);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 12], _0x354464, 2368359562);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 5], _0x307adc, 4294588738);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 8], _0x1cc902, 2272392833);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 11], _0x5bb242, 1839030562);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 14], _0x4ed1a4, 4259657740);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 1], _0x307adc, 2763975236);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 4], _0x1cc902, 1272893353);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 7], _0x5bb242, 4139469664);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 10], _0x4ed1a4, 3200236656);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 13], _0x307adc, 681279174);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 0], _0x1cc902, 3936430074);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 3], _0x5bb242, 3572445317);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 6], _0x4ed1a4, 76029189);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 9], _0x307adc, 3654602809);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 12], _0x1cc902, 3873151461);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 15], _0x5bb242, 530742520);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 2], _0x4ed1a4, 3299628645);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 0], _0x418318, 4096336452);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 7], _0xb85eab, 1126891415);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 14], _0x2a7231, 2878612391);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 5], _0x5cca29, 4237533241);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 12], _0x418318, 1700485571);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 3], _0xb85eab, 2399980690);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 10], _0x2a7231, 4293915773);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 1], _0x5cca29, 2240044497);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 8], _0x418318, 1873313359);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 15], _0xb85eab, 4264355552);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 6], _0x2a7231, 2734768916);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 13], _0x5cca29, 1309151649);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 4], _0x418318, 4149444226);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 11], _0xb85eab, 3174756917);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 2], _0x2a7231, 718787259);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 9], _0x5cca29, 3951481745);
    _0x244b3a = _0x1eaf4b(_0x244b3a, _0x3e947b);
    _0x47e9c7 = _0x1eaf4b(_0x47e9c7, _0x217e9f);
    _0x4f689f = _0x1eaf4b(_0x4f689f, _0x8545c6);
    _0x27bcf7 = _0x1eaf4b(_0x27bcf7, _0x3ed023);
  }

  var _0x35900a = _0x1b0e3e(_0x244b3a) + _0x1b0e3e(_0x47e9c7) + _0x1b0e3e(_0x4f689f) + _0x1b0e3e(_0x27bcf7);

  return _0x35900a["toLowerCase"]();
}

function go(_0x30b50d) {
  function _0x3dbf67() {
    var _0x5f1114 = window["navigator"]["userAgent"],
        _0x2ed046 = ["Phantom"];

    for (var _0x1869b0 = 0; _0x1869b0 < _0x2ed046["length"]; _0x1869b0++) {
      if (_0x5f1114["indexOf"](_0x2ed046[_0x1869b0]) != -1) {
        return true;
      }
    }

    if (window["callPhantom"] || window["_phantom"] || window["Headless"] || window["navigator"]["webdriver"] || window["navigator"]["__driver_evaluate"] || window["navigator"]["__webdriver_evaluate"]) {
      return true;
    }
  }

  if (_0x3dbf67()) {
    return;
  }

  var _0x26a47f = new Date();

  function _0x3df5bc(_0x5da4a3, _0x2d77c8) {
    var _0xad821a = _0x30b50d["chars"]["length"];

    for (var _0x42a4ac = 0; _0x42a4ac < _0xad821a; _0x42a4ac++) {
      for (var _0x250ad6 = 0; _0x250ad6 < _0xad821a; _0x250ad6++) {
        var _0x5f1c4c = _0x2d77c8[0] + _0x30b50d["chars"]["substr"](_0x42a4ac, 1) + _0x30b50d["chars"]["substr"](_0x250ad6, 1) + _0x2d77c8[1];

        if (hash(_0x5f1c4c) == _0x5da4a3) {
          return [_0x5f1c4c, new Date() - _0x26a47f];
        }
      }
    }
  }

  var _0x1d6c97 = _0x3df5bc(_0x30b50d["ct"], _0x30b50d["bts"]);

  if (_0x1d6c97) {
    var _0x5c31f9;

    if (_0x30b50d["wt"]) {
      _0x5c31f9 = parseInt(_0x30b50d["wt"]) > _0x1d6c97[1] ? parseInt(_0x30b50d["wt"]) - _0x1d6c97[1] : 500;
    } else {
      _0x5c31f9 = 1500;
    }

    setTimeout(function () {
      document["cookie"] = _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
      location["href"] = location["pathname"] + location["search"];
    }, _0x5c31f9);
  } else {
    alert("\u8BF7\u6C42\u9A8C\u8BC1\xE5\xA4\xB1\xE8\xB4\xA5");
  }
}

go({
  "bts": ["1605770555.059|0|DGK", "s4dADq0wDGWCiURT3yX7ds%3D"],
  "chars": "AdFF3xaKjaNVFXqbiTdKR4",
  "ct": "40ed0871cd9830417eda6370eef68d78",
  "ha": "md5",
  "tn": "__jsl_clearance_s",
  "vt": "3600",
  "wt": "1500"
});
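A side observation, offered as a sketch rather than as part of the original answer: the initial state and the round constants in the hash() listing match those of standard MD5, which agrees with "ha": "md5" in the parameters. Assuming hash() really is plain MD5 (the check below only compares a few constants and does not prove it), Python's hashlib could stand in for it on ASCII input. The name md5_hex is hypothetical.

```python
import hashlib
import math

# Values copied from the deobfuscated hash() listing above.
LISTING_INITIAL_STATE = [1732584193, 4023233417, 2562383102, 271733878]
LISTING_FIRST_CONSTANT = 3614090360
LISTING_LAST_CONSTANT = 3951481745

# MD5 initial state (RFC 1321) and its sine-derived round constants.
MD5_INITIAL_STATE = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
MD5_ROUND_CONSTANTS = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

looks_like_md5 = (
    LISTING_INITIAL_STATE == MD5_INITIAL_STATE
    and LISTING_FIRST_CONSTANT == MD5_ROUND_CONSTANTS[0]
    and LISTING_LAST_CONSTANT == MD5_ROUND_CONSTANTS[63]
)
print(looks_like_md5)


def md5_hex(text):
    """Python counterpart of hash() for ASCII input, assuming it is plain MD5."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```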
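Analysis and implementation: a quick read shows that this code calls the go function with a parameter object, and that object is what drives the second round of cookie generation. From here it is simple: first extract the parameters with a regex, then make a few changes to the JS code.
The following code looks like a crawler check (it tests for Phantom, Headless and webdriver markers). In my tests it can be deleted without affecting the result: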
function _0x3dbf67() {
    var _0x5f1114 = window["navigator"]["userAgent"],
        _0x2ed046 = ["Phantom"];

    for (var _0x1869b0 = 0; _0x1869b0 < _0x2ed046["length"]; _0x1869b0++) {
      if (_0x5f1114["indexOf"](_0x2ed046[_0x1869b0]) != -1) {
        return true;
      }
    }

    if (window["callPhantom"] || window["_phantom"] || window["Headless"] || window["navigator"]["webdriver"] || window["navigator"]["__driver_evaluate"] || window["navigator"]["__webdriver_evaluate"]) {
      return true;
    }
  }

  if (_0x3dbf67()) {
    return;
  }
Next, modify the code that sets the cookie so that calling go returns the cookie directly:
// Original code
setTimeout(function () {
      document["cookie"] = _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
      location["href"] = location["pathname"] + location["search"];
    }, _0x5c31f9);
  } else {
    alert("\u8BF7\u6C42\u9A8C\u8BC1\xE5\xA4\xB1\xE8\xB4\xA5");

// Modified to
return _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
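Finally, delete the go(...) call from the JS code and save the file. (Note that the site has three variants of the JS for this second cookie generation step, so all three scripts need to be modified and saved in the same way.)
Convert the extracted parameters into a dict, then check its ha field to pick the matching cookie-generation script.
Run that script with the execjs module, passing in the parameters, to get the final cookie. Send it together with the jsluid obtained earlier, and the response contains the correct content.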
import json
import re

import execjs
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning

# Suppress the warning triggered by verify=False
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
}
url = 'https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm'


def get_page():
    response = requests.get(url, headers=headers, verify=False)
    return response


def get_parameter(response):
    # Get the cookie parameter jsluid
    jsluid = response.headers.get('set-cookie').split(';')[0]
    # Extract the JS code
    js_clearance = re.findall('cookie=(.*?);location.href=', response.text)[0]
    # Run it to get the cookie parameter js_clearance
    result = execjs.eval(js_clearance).split(';')[0]
    headers.update({'cookie': jsluid + '; ' + result})
    response = get_page()
    # Extract the go(...) parameters and convert them to a dict
    parameter = json.loads(re.findall(r'};go\((.*?)\)</script>', response.text)[0])
    js_file = ''
    # Pick the cookie-generation script that matches the hash type
    if parameter['ha'] == 'sha1':
        js_file = 'sha1.js'
    elif parameter['ha'] == 'sha256':
        js_file = 'sha256.js'
    elif parameter['ha'] == 'md5':
        js_file = 'md5.js'
    return parameter, js_file, jsluid


def get_cookie(param, file):
    parameter = {
        "bts": param['bts'],
        "chars": param['chars'],
        "ct": param['ct'],
        "ha": param['ha'],
        "tn": param['tn'],
        "vt": param['vt'],
        "wt": param['wt']
    }
    with open(file, 'r') as f:
        js = f.read()
    cmp = execjs.compile(js)
    # Call go() in the saved JS file with the parameters
    clearance = cmp.call('go', parameter)
    return clearance


def run():
    response = get_page()
    parameter, js_file, jsluid = get_parameter(response)
    clearance = get_cookie(parameter, js_file)
    # Third request: send both cookies to get the real page
    headers.update({'cookie': jsluid + '; ' + clearance})
    html = requests.get(url, headers=headers, verify=False)
    print(html.content.decode())


run()
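As a possible alternative to saving three modified JS files, the search that go() performs can be sketched directly in Python. This is not the method used in the answer above. It assumes the three JS hash variants are the standard MD5, SHA-1 and SHA-256 algorithms applied to the same string, which the thread only shows for the md5 case. The function names and the go(...) regex are hypothetical. The logic mirrors the listing: try every two-character combination from chars between the two bts parts until the digest equals ct.

```python
import hashlib
import json
import re


def parse_go_parameters(body):
    """Extract the JSON object passed to go(...) in the second response."""
    match = re.search(r"go\((\{.*?\})\)", body, re.S)
    return json.loads(match.group(1)) if match else None


def solve_clearance(params):
    """Try every two-character infix from chars until the digest equals ct."""
    prefix, suffix = params["bts"]
    for first in params["chars"]:
        for second in params["chars"]:
            candidate = prefix + first + second + suffix
            digest = hashlib.new(params["ha"], candidate.encode("utf-8")).hexdigest()
            if digest == params["ct"]:
                return params["tn"] + "=" + candidate
    return None


# Parameters copied from the go({...}) call quoted in the answer above.
SAMPLE_PARAMETERS = {
    "bts": ["1605770555.059|0|DGK", "s4dADq0wDGWCiURT3yX7ds%3D"],
    "chars": "AdFF3xaKjaNVFXqbiTdKR4",
    "ct": "40ed0871cd9830417eda6370eef68d78",
    "ha": "md5",
    "tn": "__jsl_clearance_s",
    "vt": "3600",
    "wt": "1500",
}
print(solve_clearance(SAMPLE_PARAMETERS))
```

The returned string has the same name=value shape as the part before ";Max-age=" in the modified go() return value, so it could be joined with jsluid in the cookie header in the same way. The search is at most len(chars) squared hash computations, which is small enough to run on every request.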

Posted on 2020-11-18 22:25:25

The site has anti-crawler protection; the returned content is encrypted.


OP | Posted on 2020-11-18 22:41:58
Twilight6 wrote on 2020-11-18 22:25:
The site has anti-crawler protection; the returned content is encrypted.

Sigh, I know. I just don't know what to do about it.

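Posted on 2020-11-18 22:49:22
用编程搞垮道盟 wrote on 2020-11-18 22:41:
Sigh, I know. I just don't know what to do about it.

Sorry, I only know the basics of crawling. All I can tell is that the response is encrypted.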



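OP | Posted on 2020-11-18 22:50:56
Twilight6 wrote on 2020-11-18 22:49:
Sorry, I only know the basics of crawling. All I can tell is that the response is encrypted.

All right, thanks. You are the only one who bothered to reply, sigh.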

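Posted on 2020-11-18 22:53:22
用编程搞垮道盟 wrote on 2020-11-18 22:50:
All right, thanks. You are the only one who bothered to reply, sigh.

No problem. Let's wait and see whether an expert shows up.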


Posted on 2020-11-19 08:52:46
用编程搞垮道盟 wrote on 2020-11-18 22:50:
All right, thanks. You are the only one who bothered to reply, sigh.

Adding a Cookie should fix it.

Posted on 2020-11-19 08:53:50
As for getting around the encryption, in most cases a cookie or an IP pool will basically solve it.

Posted on 2020-11-19 09:25:02
This page requires a cookie containing two parameters:
__jsluid_s: this one is visible in the headers of the 521 response:
import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0'}
url = "https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm"
r = requests.get(url, headers=headers)
r.encoding = 'utf-8'
for k, v in r.cookies.items():
    print(k, '=', v)
__jsl_clearance_s: this one is mainly computed by JS. I am not familiar with JS, but I found an article you can refer to: https://blog.csdn.net/qq_39138295/article/details/100705405

If you do not want to work out how it is generated, you can also copy the values straight from the browser:
import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0', 'cookie': '__jsluid_s=b37d6e95fe6d3b0d462eb76b2ab93002; __jsl_clearance_s=1605747177.992|0|EnIyB0bctVOOAs8Tklgeghe%2Bx44%3D'}
url = "https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm"
r = requests.get(url, headers=headers)
r.encoding = 'utf-8'
print(r.status_code)
print(r.text)
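One caveat with copying cookies from the browser is that they stop working after a while. The sketch below rests on two assumptions not stated in this reply: that the number at the start of the __jsl_clearance_s value is the Unix time at which the cookie was issued, and that the cookie lives for 3600 seconds, as the max-age in the challenge scripts quoted in this thread suggests. The helper names are hypothetical.

```python
import time


def parse_cookie_header(cookie_header):
    """Split a 'name=value; name2=value2' Cookie header into a dict."""
    pairs = (item.split("=", 1) for item in cookie_header.split(";") if "=" in item)
    return {name.strip(): value.strip() for name, value in pairs}


def clearance_is_fresh(cookies, now=None, max_age=3600):
    """Guess whether __jsl_clearance_s is still inside its max-age window."""
    value = cookies.get("__jsl_clearance_s")
    if not value:
        return False
    # Assumption: the part before the first '|' is the issue time in epoch seconds.
    issued_at = float(value.split("|")[0])
    current = time.time() if now is None else now
    return current - issued_at < max_age


COOKIE_HEADER = (
    "__jsluid_s=b37d6e95fe6d3b0d462eb76b2ab93002; "
    "__jsl_clearance_s=1605747177.992|0|EnIyB0bctVOOAs8Tklgeghe%2Bx44%3D"
)
cookies = parse_cookie_header(COOKIE_HEADER)
print(clearance_is_fresh(cookies))
```

If the check returns False, the cookie most likely needs to be copied from the browser again or regenerated.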


Posted on 2020-11-19 09:37:20
None of the methods suggested above work. I tested it, and this is JS encryption.

微信图片_20201119093615.png

Posted on 2020-11-19 09:49:04
笨鸟学飞 wrote on 2020-11-19 09:37:
None of the methods suggested above work. I tested it, and this is JS encryption.

I can crawl it fine on my end. 5.png

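Posted on 2020-11-19 17:03:27 | This reply is the best answer
I had some spare time, so I analyzed this. The cookie generation is not hard. Judging from the browser, three requests are made in total: the first two only serve to generate the cookie, and only the last one returns the real response:
1. The first response is a piece of JS code; once executed, it adds a cookie to the browser: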
<script>document.cookie=('_')+('_')+('j')+('s')+('l')+('_')+('c')+('l')+('e')+('a')+('r')+('a')+('n')+('c')+('e')+('_')+('s')+('=')+(-~false+'')+([2]*(3)+'')+(~~''+'')+(([2]+0>>2)+'')+(-~[6]+'')+((1+[2])/[2]+'')+(([2]+0>>2)+'')+(1+1+'')+((2^1)+'')+(-~[7]+'')+('.')+([2]*(3)+'')+(-~0+'')+(1+3+'')+('|')+('-')+((+true)+'')+('|')+('w')+('Q')+('z')+('u')+(-~false+'')+('M')+('i')+('b')+('l')+('V')+('B')+(4+5+'')+('e')+('K')+('b')+('%')+(1+1+'')+('B')+('V')+('o')+('J')+('B')+('y')+('A')+('Q')+('A')+((2)*[2]+'')+('h')+((1<<2)+'')+('%')+((1|2)+'')+('D')+(';')+('m')+('a')+('x')+('-')+('a')+('g')+('e')+('=')+(-~[2]+'')+(-~[5]+'')+(~~[]+'')+(~~{}+'')+(';')+('p')+('a')+('t')+('h')+('=')+('/');location.href=location.pathname+location.search</script>
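Analysis and implementation: extract the JS code with a regex, then run it with the execjs module to get a cookie named __jsl_clearance_s (this is not the final cookie value). Also grab the set-cookie header of this response, to be sent along with the next request.
# Get the cookie parameter jsluid
jsluid = response.headers.get('set-cookie').split(';')[0]
# Extract the JS code
js_clearance = re.findall('cookie=(.*?);location.href=', response.text)[0]
# Run it to get the cookie parameter js_clearance
result = execjs.eval(js_clearance).split(';')[0]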
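2. The second request carries the two cookie parameters obtained from the previous request, and its response is a second piece of JS code. This code is obfuscated; after running it through a deobfuscation tool it looks like this: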
function hash(_0x1b66b8) {
  function _0x35c6e5(_0x268dd8, _0xea5bd4) {
    return _0x268dd8 << _0xea5bd4 | _0x268dd8 >>> 32 - _0xea5bd4;
  }

  function _0x1eaf4b(_0x31d866, _0x14e06e) {
    var _0x157f3b, _0x51ff9a, _0x2bf573, _0x434e16, _0x3f57f0;

    _0x2bf573 = _0x31d866 & 2147483648;
    _0x434e16 = _0x14e06e & 2147483648;
    _0x157f3b = _0x31d866 & 1073741824;
    _0x51ff9a = _0x14e06e & 1073741824;
    _0x3f57f0 = (_0x31d866 & 1073741823) + (_0x14e06e & 1073741823);

    if (_0x157f3b & _0x51ff9a) {
      return _0x3f57f0 ^ 2147483648 ^ _0x2bf573 ^ _0x434e16;
    }

    if (_0x157f3b | _0x51ff9a) {
      if (_0x3f57f0 & 1073741824) {
        return _0x3f57f0 ^ 3221225472 ^ _0x2bf573 ^ _0x434e16;
      } else {
        return _0x3f57f0 ^ 1073741824 ^ _0x2bf573 ^ _0x434e16;
      }
    } else {
      return _0x3f57f0 ^ _0x2bf573 ^ _0x434e16;
    }
  }

  function _0x296d1d(_0x3ec120, _0x19f2dd, _0x5c9060) {
    return _0x3ec120 & _0x19f2dd | ~_0x3ec120 & _0x5c9060;
  }

  function _0x2e22ab(_0x1b4bee, _0x5b3ded, _0x1f786e) {
    return _0x1b4bee & _0x1f786e | _0x5b3ded & ~_0x1f786e;
  }

  function _0x9c1e12(_0x583030, _0x2fb4b0, _0x1e223e) {
    return _0x583030 ^ _0x2fb4b0 ^ _0x1e223e;
  }

  function _0x21943e(_0x507d21, _0x593ceb, _0x12d837) {
    return _0x593ceb ^ (_0x507d21 | ~_0x12d837);
  }

  function _0x30c4a8(_0x11c9c5, _0x2d92d7, _0x5443b6, _0xf48f8, _0x224d79, _0x640128, _0x4788bf) {
    _0x11c9c5 = _0x1eaf4b(_0x11c9c5, _0x1eaf4b(_0x1eaf4b(_0x296d1d(_0x2d92d7, _0x5443b6, _0xf48f8), _0x224d79), _0x4788bf));
    return _0x1eaf4b(_0x35c6e5(_0x11c9c5, _0x640128), _0x2d92d7);
  }

  function _0x2145f8(_0x53d7e0, _0xf63c6, _0x1eddd0, _0x5af86a, _0x4e89ac, _0x42dfbd, _0x4e866b) {
    _0x53d7e0 = _0x1eaf4b(_0x53d7e0, _0x1eaf4b(_0x1eaf4b(_0x2e22ab(_0xf63c6, _0x1eddd0, _0x5af86a), _0x4e89ac), _0x4e866b));
    return _0x1eaf4b(_0x35c6e5(_0x53d7e0, _0x42dfbd), _0xf63c6);
  }

  function _0x311b76(_0x39b6f5, _0x5a7109, _0x3a29c6, _0x4fb375, _0xcadb59, _0x508c0e, _0x234182) {
    _0x39b6f5 = _0x1eaf4b(_0x39b6f5, _0x1eaf4b(_0x1eaf4b(_0x9c1e12(_0x5a7109, _0x3a29c6, _0x4fb375), _0xcadb59), _0x234182));
    return _0x1eaf4b(_0x35c6e5(_0x39b6f5, _0x508c0e), _0x5a7109);
  }

  function _0x361b6d(_0x3d0c62, _0x300099, _0x537e35, _0x6f09e1, _0x45e6a4, _0x1d7856, _0x2506bc) {
    _0x3d0c62 = _0x1eaf4b(_0x3d0c62, _0x1eaf4b(_0x1eaf4b(_0x21943e(_0x300099, _0x537e35, _0x6f09e1), _0x45e6a4), _0x2506bc));
    return _0x1eaf4b(_0x35c6e5(_0x3d0c62, _0x1d7856), _0x300099);
  }

  function _0x57b771(_0x3c91e6) {
    var _0x10f282;

    var _0xc362bc = _0x3c91e6["length"];

    var _0x41aff5 = _0xc362bc + 8;

    var _0x24fc0a = (_0x41aff5 - _0x41aff5 % 64) / 64;

    var _0x1c8987 = (_0x24fc0a + 1) * 16;

    var _0x281eac = Array(_0x1c8987 - 1);

    var _0x11a5cc = 0;
    var _0x3f48ef = 0;

    while (_0x3f48ef < _0xc362bc) {
      _0x10f282 = (_0x3f48ef - _0x3f48ef % 4) / 4;
      _0x11a5cc = _0x3f48ef % 4 * 8;
      _0x281eac[_0x10f282] = _0x281eac[_0x10f282] | _0x3c91e6["charCodeAt"](_0x3f48ef) << _0x11a5cc;
      _0x3f48ef++;
    }

    _0x10f282 = (_0x3f48ef - _0x3f48ef % 4) / 4;
    _0x11a5cc = _0x3f48ef % 4 * 8;
    _0x281eac[_0x10f282] = _0x281eac[_0x10f282] | 128 << _0x11a5cc;
    _0x281eac[_0x1c8987 - 2] = _0xc362bc << 3;
    _0x281eac[_0x1c8987 - 1] = _0xc362bc >>> 29;
    return _0x281eac;
  }

  function _0x1b0e3e(_0x1bc183) {
    var _0x2d342c = "",
        _0x486522 = "",
        _0x45875a,
        _0x2a3b5e;

    for (_0x2a3b5e = 0; _0x2a3b5e <= 3; _0x2a3b5e++) {
      _0x45875a = _0x1bc183 >>> _0x2a3b5e * 8 & 255;
      _0x486522 = "0" + _0x45875a["toString"](16);
      _0x2d342c = _0x2d342c + _0x486522["substr"](_0x486522["length"] - 2, 2);
    }

    return _0x2d342c;
  }

  var _0x198c42 = Array();

  var _0x556dd6, _0x3e947b, _0x217e9f, _0x8545c6, _0x3ed023, _0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7;

  var _0x5153e3 = 7,
      _0xa71763 = 12,
      _0x509ea6 = 17,
      _0x4288bc = 22;
  var _0x7ad2f4 = 5,
      _0x45d017 = 9,
      _0x41614c = 14,
      _0x354464 = 20;
  var _0x307adc = 4,
      _0x1cc902 = 11,
      _0x5bb242 = 16,
      _0x4ed1a4 = 23;
  var _0x418318 = 6,
      _0xb85eab = 10,
      _0x2a7231 = 15,
      _0x5cca29 = 21;
  _0x198c42 = _0x57b771(_0x1b66b8);
  _0x244b3a = 1732584193;
  _0x47e9c7 = 4023233417;
  _0x4f689f = 2562383102;
  _0x27bcf7 = 271733878;

  for (_0x556dd6 = 0; _0x556dd6 < _0x198c42["length"]; _0x556dd6 += 16) {
    _0x3e947b = _0x244b3a;
    _0x217e9f = _0x47e9c7;
    _0x8545c6 = _0x4f689f;
    _0x3ed023 = _0x27bcf7;
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 0], _0x5153e3, 3614090360);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 1], _0xa71763, 3905402710);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 2], _0x509ea6, 606105819);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 3], _0x4288bc, 3250441966);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 4], _0x5153e3, 4118548399);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 5], _0xa71763, 1200080426);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 6], _0x509ea6, 2821735955);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 7], _0x4288bc, 4249261313);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 8], _0x5153e3, 1770035416);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 9], _0xa71763, 2336552879);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 10], _0x509ea6, 4294925233);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 11], _0x4288bc, 2304563134);
    _0x244b3a = _0x30c4a8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 12], _0x5153e3, 1804603682);
    _0x27bcf7 = _0x30c4a8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 13], _0xa71763, 4254626195);
    _0x4f689f = _0x30c4a8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 14], _0x509ea6, 2792965006);
    _0x47e9c7 = _0x30c4a8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 15], _0x4288bc, 1236535329);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 1], _0x7ad2f4, 4129170786);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 6], _0x45d017, 3225465664);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 11], _0x41614c, 643717713);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 0], _0x354464, 3921069994);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 5], _0x7ad2f4, 3593408605);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 10], _0x45d017, 38016083);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 15], _0x41614c, 3634488961);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 4], _0x354464, 3889429448);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 9], _0x7ad2f4, 568446438);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 14], _0x45d017, 3275163606);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 3], _0x41614c, 4107603335);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 8], _0x354464, 1163531501);
    _0x244b3a = _0x2145f8(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 13], _0x7ad2f4, 2850285829);
    _0x27bcf7 = _0x2145f8(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 2], _0x45d017, 4243563512);
    _0x4f689f = _0x2145f8(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 7], _0x41614c, 1735328473);
    _0x47e9c7 = _0x2145f8(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 12], _0x354464, 2368359562);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 5], _0x307adc, 4294588738);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 8], _0x1cc902, 2272392833);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 11], _0x5bb242, 1839030562);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 14], _0x4ed1a4, 4259657740);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 1], _0x307adc, 2763975236);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 4], _0x1cc902, 1272893353);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 7], _0x5bb242, 4139469664);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 10], _0x4ed1a4, 3200236656);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 13], _0x307adc, 681279174);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 0], _0x1cc902, 3936430074);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 3], _0x5bb242, 3572445317);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 6], _0x4ed1a4, 76029189);
    _0x244b3a = _0x311b76(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 9], _0x307adc, 3654602809);
    _0x27bcf7 = _0x311b76(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 12], _0x1cc902, 3873151461);
    _0x4f689f = _0x311b76(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 15], _0x5bb242, 530742520);
    _0x47e9c7 = _0x311b76(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 2], _0x4ed1a4, 3299628645);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 0], _0x418318, 4096336452);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 7], _0xb85eab, 1126891415);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 14], _0x2a7231, 2878612391);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 5], _0x5cca29, 4237533241);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 12], _0x418318, 1700485571);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 3], _0xb85eab, 2399980690);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 10], _0x2a7231, 4293915773);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 1], _0x5cca29, 2240044497);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 8], _0x418318, 1873313359);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 15], _0xb85eab, 4264355552);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 6], _0x2a7231, 2734768916);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 13], _0x5cca29, 1309151649);
    _0x244b3a = _0x361b6d(_0x244b3a, _0x47e9c7, _0x4f689f, _0x27bcf7, _0x198c42[_0x556dd6 + 4], _0x418318, 4149444226);
    _0x27bcf7 = _0x361b6d(_0x27bcf7, _0x244b3a, _0x47e9c7, _0x4f689f, _0x198c42[_0x556dd6 + 11], _0xb85eab, 3174756917);
    _0x4f689f = _0x361b6d(_0x4f689f, _0x27bcf7, _0x244b3a, _0x47e9c7, _0x198c42[_0x556dd6 + 2], _0x2a7231, 718787259);
    _0x47e9c7 = _0x361b6d(_0x47e9c7, _0x4f689f, _0x27bcf7, _0x244b3a, _0x198c42[_0x556dd6 + 9], _0x5cca29, 3951481745);
    _0x244b3a = _0x1eaf4b(_0x244b3a, _0x3e947b);
    _0x47e9c7 = _0x1eaf4b(_0x47e9c7, _0x217e9f);
    _0x4f689f = _0x1eaf4b(_0x4f689f, _0x8545c6);
    _0x27bcf7 = _0x1eaf4b(_0x27bcf7, _0x3ed023);
  }

  var _0x35900a = _0x1b0e3e(_0x244b3a) + _0x1b0e3e(_0x47e9c7) + _0x1b0e3e(_0x4f689f) + _0x1b0e3e(_0x27bcf7);

  return _0x35900a["toLowerCase"]();
}

function go(_0x30b50d) {
  function _0x3dbf67() {
    var _0x5f1114 = window["navigator"]["userAgent"],
        _0x2ed046 = ["Phantom"];

    for (var _0x1869b0 = 0; _0x1869b0 < _0x2ed046["length"]; _0x1869b0++) {
      if (_0x5f1114["indexOf"](_0x2ed046[_0x1869b0]) != -1) {
        return true;
      }
    }

    if (window["callPhantom"] || window["_phantom"] || window["Headless"] || window["navigator"]["webdriver"] || window["navigator"]["__driver_evaluate"] || window["navigator"]["__webdriver_evaluate"]) {
      return true;
    }
  }

  if (_0x3dbf67()) {
    return;
  }

  var _0x26a47f = new Date();

  function _0x3df5bc(_0x5da4a3, _0x2d77c8) {
    var _0xad821a = _0x30b50d["chars"]["length"];

    for (var _0x42a4ac = 0; _0x42a4ac < _0xad821a; _0x42a4ac++) {
      for (var _0x250ad6 = 0; _0x250ad6 < _0xad821a; _0x250ad6++) {
        var _0x5f1c4c = _0x2d77c8[0] + _0x30b50d["chars"]["substr"](_0x42a4ac, 1) + _0x30b50d["chars"]["substr"](_0x250ad6, 1) + _0x2d77c8[1];

        if (hash(_0x5f1c4c) == _0x5da4a3) {
          return [_0x5f1c4c, new Date() - _0x26a47f];
        }
      }
    }
  }

  var _0x1d6c97 = _0x3df5bc(_0x30b50d["ct"], _0x30b50d["bts"]);

  if (_0x1d6c97) {
    var _0x5c31f9;

    if (_0x30b50d["wt"]) {
      _0x5c31f9 = parseInt(_0x30b50d["wt"]) > _0x1d6c97[1] ? parseInt(_0x30b50d["wt"]) - _0x1d6c97[1] : 500;
    } else {
      _0x5c31f9 = 1500;
    }

    setTimeout(function () {
      document["cookie"] = _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
      location["href"] = location["pathname"] + location["search"];
    }, _0x5c31f9);
  } else {
    alert("\u8BF7\u6C42\u9A8C\u8BC1\xE5\xA4\xB1\xE8\xB4\xA5");
  }
}

go({
  "bts": ["1605770555.059|0|DGK", "s4dADq0wDGWCiURT3yX7ds%3D"],
  "chars": "AdFF3xaKjaNVFXqbiTdKR4",
  "ct": "40ed0871cd9830417eda6370eef68d78",
  "ha": "md5",
  "tn": "__jsl_clearance_s",
  "vt": "3600",
  "wt": "1500"
});
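Analysis and implementation: a quick read shows that this code calls the go function and passes in one parameter object, whose fields are used to generate the cookie the second time. From here it is straightforward: first extract the parameters with a regular expression, then modify the JS code.
The following snippet appears to check whether the client is a crawler. In my testing it can be deleted without affecting the result: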
function _0x3dbf67() {
    var _0x5f1114 = window["navigator"]["userAgent"],
        _0x2ed046 = ["Phantom"];

    for (var _0x1869b0 = 0; _0x1869b0 < _0x2ed046["length"]; _0x1869b0++) {
      if (_0x5f1114["indexOf"](_0x2ed046[_0x1869b0]) != -1) {
        return true;
      }
    }

    if (window["callPhantom"] || window["_phantom"] || window["Headless"] || window["navigator"]["webdriver"] || window["navigator"]["__driver_evaluate"] || window["navigator"]["__webdriver_evaluate"]) {
      return true;
    }
  }

  if (_0x3dbf67()) {
    return;
  }
Next, modify the code that sets the cookie so that calling go returns the cookie directly:
// original code
setTimeout(function () {
      document["cookie"] = _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
      location["href"] = location["pathname"] + location["search"];
    }, _0x5c31f9);
  } else {
    alert("\u8BF7\u6C42\u9A8C\u8BC1\xE5\xA4\xB1\xE8\xB4\xA5");

// changed to
return _0x30b50d["tn"] + "=" + _0x1d6c97[0] + ";Max-age=" + _0x30b50d["vt"] + "; path = /";
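Finally, delete the go call at the end of the JS code and save the file. (Note that the site has three variants of the JS that generates the cookie the second time, so all three scripts need to be modified and saved in the same way.)
Convert the parameters obtained earlier into a dict, then check its ha field to choose the matching cookie-generation script.
Use the execjs module to run the JS with the parameters and get the final cookie, then send the request with the jsluid obtained earlier plus this cookie to receive the correct response.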
import re
import execjs
import requests
import json
from requests.packages.urllib3.exceptions import InsecureRequestWarning
# Suppress the warning raised because SSL verification is disabled
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
}
url = 'https://www.yidaiyilu.gov.cn/xwzx/gnxw/87373.htm'


def get_page():
    response = requests.get(url, headers=headers, verify=False)
    return response


def get_parameter(response):
    # Get the jsluid cookie from the Set-Cookie header
    jsluid = response.headers.get('set-cookie').split(';')[0]
    # Extract the JS expression that builds the first cookie
    js_clearance = re.findall('cookie=(.*?);location.href=', response.text)[0]
    # Evaluate it to get the first js_clearance cookie
    result = execjs.eval(js_clearance).split(';')[0]
    global headers
    headers.update({'cookie': jsluid + '; ' + result})
    response = get_page()
    # Extract the go() parameters and convert them to a dict
    parameter = json.loads(re.findall(r'};go\((.*?)\)</script>', response.text)[0])
    js_file = ''
    # Choose the script that matches the cookie-generation method
    if parameter['ha'] == 'sha1':
        js_file = 'sha1.js'
    elif parameter['ha'] == 'sha256':
        js_file = 'sha256.js'
    elif parameter['ha'] == 'md5':
        js_file = 'md5.js'
    return parameter, js_file, jsluid


def get_cookie(param, file):
    parameter = {
        "bts": param['bts'],
        "chars": param['chars'],
        "ct": param['ct'],
        "ha": param['ha'],
        "tn": param['tn'],
        "vt": param['vt'],
        "wt": param['wt']
    }
    with open(file, 'r') as f:
        js = f.read()
    cmp = execjs.compile(js)
    # Call go() in the JS code with the parameters
    clearance = cmp.call('go', parameter)
    return clearance


def run():
    response = get_page()
    parameter, js_file, jsluid = get_parameter(response)
    clearance = get_cookie(parameter, js_file)
    global headers
    headers.update({'cookie': jsluid + '; ' + clearance})
    html = requests.get(url, headers=headers, verify=False)
    print(html.content.decode())


run()
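As an alternative to editing the three JS files, the loop inside the go function shown earlier can be sketched directly in Python. This is only a sketch under an assumption: that the site's hash() functions are the standard md5/sha1/sha256 digests with lowercase hex output, which the thread does not confirm. The function name solve_clearance and the demo values are made up for illustration; the demo builds its own target digest so that it runs offline.

```python
import hashlib
from itertools import product


def solve_clearance(params):
    """Try every two-character pair from params["chars"], like the JS go() loop."""
    prefix, suffix = params["bts"]
    for first, second in product(params["chars"], repeat=2):
        candidate = prefix + first + second + suffix
        # Assumption: "ha" names a standard digest that hashlib knows
        digest = hashlib.new(params["ha"], candidate.encode("utf-8")).hexdigest()
        if digest == params["ct"]:
            return params["tn"], candidate
    return None  # no pair matched (the alert() branch in the JS)


# Synthetic challenge: the target digest is built from a known pair ("q7")
demo = {
    "bts": ["prefix|", "|suffix"],
    "chars": "abq7Z",
    "ha": "sha1",
    "tn": "__jsl_clearance_s",
}
demo["ct"] = hashlib.sha1(b"prefix|q7|suffix").hexdigest()
result = solve_clearance(demo)
print(result)
```

If this assumption holds for the real site, the returned name and value could be set on a requests.Session with session.cookies.set(name, value) instead of building the cookie header by hand.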

 OP | Posted on 2020-11-19 23:04:17

?????? How did you get this to work? Amazing.

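 OP | Posted on 2020-11-19 23:14:27
YunGuo posted on 2020-11-19 17:03
I had some free time and analyzed this. The cookie generation is not hard; in the browser there are three requests in total, and the first two only serve to generate ...

Really impressive, thank you! I don't quite follow the second-stage JS code yet, though, so I still need to work through it.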

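 OP | Posted on 2020-11-19 23:39:14
Last edited by 用编程搞垮道盟 on 2020-11-19 23:40
YunGuo posted on 2020-11-19 17:03
I had some free time and analyzed this. The cookie generation is not hard; in the browser there are three requests in total, and the first two only serve to generate ...


What are those JS files (md5.js, sha1.js, sha256.js)? Also, how do you know that the argument go should be called with is whatever appears inside the final go(...) call at the end of the JS code? I'd appreciate an explanation if you don't mind.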

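Posted on 2020-11-19 23:42:28
用编程搞垮道盟 posted on 2020-11-19 23:14
Really impressive, thank you! I don't quite follow the second-stage JS code yet, though, so I still need to work through it.

This JS code isn't very complicated, but you have to deobfuscate it first or it is unreadable. Since we know the code exists to generate a cookie, it must contain a statement that sets the cookie in the browser (you can search the code for "cookie"). Once you find that statement you are about halfway there. You don't need to understand exactly how the JS computes the value; just change the cookie-setting code so that the function returns the result when called. A little JS knowledge is enough. The other thing to watch for is that the site has three cookie-generation variants, so you need to request the page several times to collect all three scripts. To tell them apart, look at the ha value in the argument passed to go at the end of the code: it is one of md5, sha1, or sha256, and each corresponds to a different cookie-generation script.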
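The ha-to-script choice described here can be written as a small lookup that fails loudly on an unexpected value, instead of leaving the file name empty. A minimal sketch: the file names md5.js, sha1.js and sha256.js come from the script posted above, while the names JS_FILES and select_js_file are hypothetical.

```python
# File names follow the script posted earlier in the thread
JS_FILES = {"md5": "md5.js", "sha1": "sha1.js", "sha256": "sha256.js"}


def select_js_file(parameter):
    """Return the saved script for this challenge, or raise on an unknown ha."""
    ha = parameter.get("ha")
    try:
        return JS_FILES[ha]
    except KeyError:
        raise ValueError(f"unexpected ha value: {ha!r}") from None


print(select_js_file({"ha": "md5"}))
```

Raising here makes a fourth, unseen variant show up as a clear error rather than as a failed open('') later on.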

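 OP | Posted on 2020-11-19 23:54:20
YunGuo posted on 2020-11-19 23:42
This JS code isn't very complicated, but you have to deobfuscate it first or it is unreadable. Since we know the code exists to generate a cookie, ...

Which deobfuscation tool did you use? I can't get it to deobfuscate; on my end there is no recognizable word like "cookie" at all, just a pile of unreadable code.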

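Posted on 2020-11-20 00:01:08
用编程搞垮道盟 posted on 2020-11-19 23:54
Which deobfuscation tool did you use? I can't get it to deobfuscate; on my end there is no recognizable word like "cookie" at all, just a pile of unreadable code.

OB deobfuscator (beta V0.1)
http://tool.yuanrenxue.com/decode_obfuscator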

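Posted on 2020-11-20 00:26:38
用编程搞垮道盟 posted on 2020-11-19 23:54
Which deobfuscation tool did you use? I can't get it to deobfuscate; on my end there is no recognizable word like "cookie" at all, just a pile of unreadable code.

It's an online deobfuscator; I posted the URL but it is still awaiting moderation. As for how to know that the argument is what sits inside go(...): anyone who knows a little JS can see that the go(...) at the very end calls the go function with one argument (in JS this argument is an object). If you are still unsure, look at the go function above. It takes a single parameter, _0x30b50d, and searching for _0x30b50d shows its first use: var _0xad821a = _0x30b50d["chars"]["length"]; This defines _0xad821a as the length of chars in _0x30b50d. Go back to the end of the code and you will see that the argument passed in does contain chars. Keep looking and you will find the others too: the code that finally sets the cookie also uses _0x30b50d["tn"] and _0x30b50d["vt"]. So the argument must be the set of parameters used to generate the cookie.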
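To make the point about the go argument concrete, here is a rough sketch of pulling that object out of the page text and reading it as JSON. It assumes the argument is valid JSON, as in the deobfuscated listing earlier in the thread; the pattern and the helper name extract_go_params are illustrative, not the exact regex used in the posted script.

```python
import json
import re

# Matches go({...}) but not "function go(_0x...)", which has no "{" after "("
GO_CALL = re.compile(r"go\((\{.*?\})\)", re.S)


def extract_go_params(page_text):
    """Return the object passed to go() as a dict."""
    match = GO_CALL.search(page_text)
    if match is None:
        raise ValueError("no go({...}) call found in the response")
    return json.loads(match.group(1))


# Values copied from the go() call shown earlier in the thread
sample = '''function go(_0x30b50d) { }
go({
  "bts": ["1605770555.059|0|DGK", "s4dADq0wDGWCiURT3yX7ds%3D"],
  "chars": "AdFF3xaKjaNVFXqbiTdKR4",
  "ct": "40ed0871cd9830417eda6370eef68d78",
  "ha": "md5",
  "tn": "__jsl_clearance_s",
  "vt": "3600",
  "wt": "1500"
});'''
params = extract_go_params(sample)
print(sorted(params))
```

The keys that come back (bts, chars, ct, ha, tn, vt, wt) are the same ones the go function reads from _0x30b50d.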

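Posted on 2020-11-20 00:34:43
YunGuo posted on 2020-11-20 00:26
It's an online deobfuscator; I posted the URL but it is still awaiting moderation. As for how to know that the argument is what sits inside go(...): anyone who knows a little JS ...

This deobfuscation is actually incomplete and the code is still partly obfuscated, but that doesn't get in the way of reading the cookie-setting code at the end. Only the variable and function names remain obfuscated (and they differ each time you fetch the code).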
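Because the identifiers change on every fetch, one way to find the statement to patch is to search for the stable property name rather than for any variable name. A small sketch, assuming the script has already been deobfuscated to the form shown earlier in the thread; find_cookie_lines is a made-up helper name.

```python
def find_cookie_lines(js_source):
    """Return (line number, text) for every line that assigns document cookie."""
    hits = []
    for number, line in enumerate(js_source.splitlines(), start=1):
        # Cover both bracket and dot notation
        if 'document["cookie"]' in line or "document.cookie" in line:
            hits.append((number, line.strip()))
    return hits


js = '''setTimeout(function () {
  document["cookie"] = _0x30b50d["tn"] + "=" + _0x1d6c97[0];
  location["href"] = location["pathname"] + location["search"];
}, _0x5c31f9);'''
hits = find_cookie_lines(js)
print(hits)
```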