凌绝顶 posted on 2020-5-24 06:12:49

How do I extract information from a web page?

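Could any fellow forum members tell me how to extract information from a web page?
I don't have permission to post images, so I've pasted the page content here instead: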
({"singer":[{"gsid":"67087","singer":"\u8521\u5f90\u5764","gspic":"http:\/\/aliyunimg.9ku.com\/9kuimg\/geshou\/20180320\/ffdec79af504b516.jpg?x-oss-process=image\/resize,m_fill,w_150,h_150,limit_0\/auto-orient,0"}],
"music":[{"id":"998960","mname":"\u91cd\u751f","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"886000","mname":"\u6ca1\u6709\u610f\u5916","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"877020","mname":"I Wanna Get Love","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"881707","mname":"Wait Wait Wait","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"1003660","mname":"Home","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"1003003","mname":"\u5c71\u6cb3\u65e0\u6059\u5728\u6211\u80f8","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"890578","mname":"Hard To Get","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"888794","mname":"Bigger","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"881332","mname":"You Can Be My GirlFriend","gsid":"67087","singer":"\u8521\u5f90\u5764"}],
"so":["\u8521\u5f90\u5764","vitas","star","\u561f\u554a\u561f\u554a","\u6d3b\u51fa\u57fa\u7763\u6b4c","\u6f02\u6d0b\u8fc7\u6d77\u6765\u770b\u4f60","\u535c\u5366","\u7231\u60c5\u4e70\u5356","\u4e16\u754c\u7b2c\u4e00\u7b49","\u542c\u5988\u5988\u7684\u8bdd","\u51b0\u6cb3\u65f6\u4ee3"]})
How can I get all of the "id" values?

Twilight6 posted on 2020-5-24 07:20:03

Last edited by Twilight6 on 2020-5-24 07:31

A regular expression is enough:
import re

str1 = r"""
({"singer":[{"gsid":"67087","singer":"\u8521\u5f90\u5764","gspic":"http:\/\/aliyunimg.9ku.com\/9kuimg\/geshou\/20180320\/ffdec79af504b516.jpg?x-oss-process=image\/resize,m_fill,w_150,h_150,limit_0\/auto-orient,0"}],
"music":[{"id":"998960","mname":"\u91cd\u751f","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"886000","mname":"\u6ca1\u6709\u610f\u5916","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"877020","mname":"I Wanna Get Love","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"881707","mname":"Wait Wait Wait","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"1003660","mname":"Home","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"1003003","mname":"\u5c71\u6cb3\u65e0\u6059\u5728\u6211\u80f8","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"890578","mname":"Hard To Get","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"888794","mname":"Bigger","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"881332","mname":"You Can Be My GirlFriend","gsid":"67087","singer":"\u8521\u5f90\u5764"}],
"so":["\u8521\u5f90\u5764","vitas","star","\u561f\u554a\u561f\u554a","\u6d3b\u51fa\u57fa\u7763\u6b4c","\u6f02\u6d0b\u8fc7\u6d77\u6765\u770b\u4f60","\u535c\u5366","\u7231\u60c5\u4e70\u5356","\u4e16\u754c\u7b2c\u4e00\u7b49","\u542c\u5988\u5988\u7684\u8bdd","\u51b0\u6cb3\u65f6\u4ee3"]})
"""

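# Capture the value of every "id" key. "gsid" is not matched, because the
# pattern requires a double quote immediately before id.
ids = re.findall(r'"id":"(.+?)"', str1)  # named ids so the built-in id() is not shadowed
print(ids)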

xiaosi4081 posted on 2020-5-24 07:57:18

Using json is simple and clear.
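
A minimal sketch of what the json approach might look like, assuming the response always arrives wrapped in one pair of parentheses as in the question. The sample string is shortened from the data in the question, and `extract_music_ids` is a hypothetical helper name, not something from the original posts.

```python
import json

# Shortened copy of the response from the question (assumption: the real
# page returns the same structure with more entries).
raw = r'''({"singer":[{"gsid":"67087","singer":"\u8521\u5f90\u5764"}],
"music":[{"id":"998960","mname":"\u91cd\u751f","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"886000","mname":"\u6ca1\u6709\u610f\u5916","gsid":"67087","singer":"\u8521\u5f90\u5764"},
{"id":"877020","mname":"I Wanna Get Love","gsid":"67087","singer":"\u8521\u5f90\u5764"}],
"so":["\u8521\u5f90\u5764","vitas"]})'''


def extract_music_ids(text):
    """Return the "id" of every entry under "music" (hypothetical helper)."""
    payload = text.strip()
    # The outer "(" and ")" are not valid JSON, so drop them before parsing.
    if payload.startswith("(") and payload.endswith(")"):
        payload = payload[1:-1]
    data = json.loads(payload)  # also decodes the \uXXXX and \/ escapes
    return [song["id"] for song in data["music"]]


ids = extract_music_ids(raw)
print(ids)
```

Compared with the regex, this only reads "id" from the entries under "music", and the song titles come back as ordinary decoded strings if you need them later.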