re.findall() matching problem
Last edited by Py与C。。。 on 2021-8-4 11:14
I scraped the following source with requests:
{"filters":[],"red":"花","ret_type":"poemline-multi","ret_array":[{"dynasty":["唐"],"type":["poemline"],"literature_author":["李白"],"name":["杨花落尽子规啼,闻道龙标过五溪"],"source_poem":["闻王昌龄左迁龙标遥有此寄"],"display_name":["杨花落尽子规啼,闻道龙标过五溪"],"sid":["40ef283d7ad94206b9fb18a11049b1b2"],"source_poem_sid":["d6f4091694a544d085602c9c4a08ab23"],"source_poem_body":["杨花落尽子规啼,闻道龙标过五溪。 我寄愁心与明月,随君直到夜郎西。"]},{"dynasty":["唐"],"type":["poemline"],"literature_author":["王维"],"name":["人闲桂花落,夜静春山空"],"source_poem":["鸟鸣涧"],"display_name":["人闲桂花落,夜静春山空"],"sid":["d252c73ce28144bcb9876da1129e81d3"],"source_poem_sid":["d6f4091694a544d085602c9c4a08ab23"],"source_poem_body":["杨花落尽子规啼,闻道龙标过五溪。 我寄愁心与明月,随君直到夜郎西。"]}],"extra":{"entity-num":62857,"return-num":20,"total-page":200},"ad":[],"highlight":"蹄"}
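I want the contents of display_name, so I tried re.findall(r'"display_name":["(.*?)"]',html), but nothing matches: it returns an empty list, and adding re.S makes no difference.
Does anyone know what is going on? Any help would be appreciated.
Last edited by 大马强 on 2021-8-4 11:20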
https://static01.imgkr.com/temp/df0ac88cb50e45029aa5b012e8612574.jpg
Square brackets seem to have a special meaning in regex; use .*? to skip over them.
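To make the bracket point concrete, here is a minimal sketch. In a pattern, [ ... ] opens a character class, so the original expression asks for "display_name": followed by a single character out of " ( . * ? ) and contains no capturing group at all. The html string below is a shortened stand-in for the response in the first post, and the .*? pattern is only one possible reading of the suggestion above, not necessarily what the screenshot showed.

```python
import re

# Shortened stand-in for the response text in the first post (assumption).
html = ('{"ret_array":[{"display_name":["杨花落尽子规啼,闻道龙标过五溪"]},'
        '{"display_name":["人闲桂花落,夜静春山空"]}]}')

# Original pattern: ["(.*?)"] is parsed as a character class, so this looks for
# '"display_name":' followed by ONE of the characters " ( . * ? ) and has no group.
broken = re.findall(r'"display_name":["(.*?)"]', html)

# Escaping the brackets makes them literal characters.
escaped = re.findall(r'"display_name":\["(.*?)"\]', html)

# One reading of the ".*?" suggestion: lazily skip everything up to the next quote.
skipped = re.findall(r'"display_name":.*?"(.*?)"', html)

print(broken)
print(escaped)
print(skipped)
```

The first call finds nothing because the character after the colon is [, which is not in the class; the other two return the two display names.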
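A regex site. Here is one more regex link: https://regex101.com/r/fNsbUH/1/
大马强 posted on 2021-8-4 11:18
Square brackets seem to have a special meaning in regex; use .*? to skip over them.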
Just escape them with \, like this: re.findall(r'\"display_name\":\[\"(.*?)\"\]',html)
data = {"filters":[],"red":"花","ret_type":"poemline-multi","ret_array":[{"dynasty":["唐"],"type":["poemline"],"literature_author":["李白"],"name":["杨花落尽子规啼,闻道龙标过五溪"],"source_poem":["闻王昌龄左迁龙标遥有此寄"],"display_name":["杨花落尽子规啼,闻道龙标过五溪"],"sid":["40ef283d7ad94206b9fb18a11049b1b2"],"source_poem_sid":["d6f4091694a544d085602c9c4a08ab23"],"source_poem_body":["杨花落尽子规啼,闻道龙标过五溪。 我寄愁心与明月,随君直到夜郎西。"]},{"dynasty":["唐"],"type":["poemline"],"literature_author":["王维"],"name":["人闲桂花落,夜静春山空"],"source_poem":["鸟鸣涧"],"display_name":["人闲桂花落,夜静春山空"],"sid":["d252c73ce28144bcb9876da1129e81d3"],"source_poem_sid":["d6f4091694a544d085602c9c4a08ab23"],"source_poem_body":["杨花落尽子规啼,闻道龙标过五溪。 我寄愁心与明月,随君直到夜郎西。"]}],"extra":{"entity-num":62857,"return-num":20,"total-page":200},"ad":[],"highlight":"蹄"}
# ret_array is a list of dicts, so index each item rather than the list itself
print([item["display_name"][0] for item in data["ret_array"]])
Can't this be done directly with dictionary access?
1q23w31 posted on 2021-8-4 11:25
Can't this be done directly with dictionary access?
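The data was presumably fetched with requests, so it has to be converted with json first, for example:
import json
import requests
# named resp so that it does not shadow the re module
resp = requests.post('https://abcdefg', json=data)
data = json.loads(resp.content)
# or like this:
# data = resp.json()
2012277033 posted on 2021-8-4 11:23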
Just escape them with \, like this
That raised an error.
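For comparison, a minimal sketch of the dictionary route suggested earlier in the thread, with no regex involved. It assumes the response body is JSON shaped like the text in the first post; the html string here is a shortened stand-in, and in real use it would be the text returned by requests.

```python
import json

# Shortened stand-in for the response text in the first post (assumption).
html = ('{"ret_array":[{"display_name":["杨花落尽子规啼,闻道龙标过五溪"]},'
        '{"display_name":["人闲桂花落,夜静春山空"]}],'
        '"extra":{"return-num":20}}')

# Parse the JSON text into Python dicts and lists.
data = json.loads(html)

# "ret_array" is a list of dicts and each "display_name" is a one-element list,
# so take element 0 of every entry.
names = [item["display_name"][0] for item in data["ret_array"]]
print(names)
```

Parsing the structure avoids escaping questions entirely, at the cost of depending on the key names staying the same.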