Original poster | Posted on 2023-2-5 19:43:03
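I roughly understand the first two questions now, but I still don't really understand the third one.
Here is the example I'm working with: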
import json
import requests
from lxml import html
name = input("Enter a keyword: ")
a = input("Enter the search start date, e.g. 2022-10-10: ")
startTime = a + ' 00:00:00'
b = input("Enter the search end date, e.g. 2022-12-10: ")
endTime = b + ' 23:59:59'
print("Searching, please wait...")
url = 'https://www.cqggzy.com/interface/rest/esinteligentsearch/getFullTextDataNew'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.70',
           'Cookie': 'cookie_www=36802747; __jsluid_s=712d8591293852446a2d196d57a069a2; Hm_lvt_3b83938a8721dadef0b185225769572a=1674978329,1675256803; Hm_lpvt_3b83938a8721dadef0b185225769572a=1675256803',
           'Host': 'www.cqggzy.com',
           'Referer': 'https://www.cqggzy.com/jyxx/transaction_detail.html'}
# "sort" is a JSON string nested inside the payload. In a plain Python string
# literal, \" collapses to ", which breaks the outer JSON, so serialize it instead.
sort = json.dumps({"istop": "0", "ordernum": "0", "webdate": "0", "rowid": "0"}, separators=(",", ":"))
# Build the payload as a dict so the keyword and dates are escaped correctly.
payload = {
    "token": "", "pn": 0, "rn": 9999, "sdt": "", "edt": "", "wd": "", "inc_wd": "", "exc_wd": "",
    "fields": "", "cnum": "001", "sort": sort, "ssort": "", "cl": 10000, "terminal": "",
    "condition": [
        {"fieldName": "categorynum", "equal": "004", "notEqual": None, "equalList": None,
         "notEqualList": ["014001018", "004002005", "014001015", "014005014", "014008011"],
         "isLike": True, "likeType": 2},
        {"fieldName": "titlenew", "equal": name, "notEqual": None, "equalList": None,
         "notEqualList": None, "isLike": True, "likeType": 0},
    ],
    "time": [{"fieldName": "webdate", "startTime": startTime, "endTime": endTime}],
    "highlights": "", "statistics": None, "unionCondition": [], "accuracy": "",
    "noParticiple": "1", "searchRange": None, "noWd": True,
}
data = json.dumps(payload, ensure_ascii=False)
response = requests.post(url=url, data=data.encode('utf-8'), headers=headers)
print(response)  # prints the status, e.g. <Response [200]>
#etree=html.etree
#etree.HTML(response)
#print(etree)
data_id = response.json()  # parsed JSON as Python dicts and lists
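The original page is https://www.cqggzy.com/jyxx/transaction_detail.html. The data is JSON returned by an Ajax request, and the content I need is inside data_id, but I don't know how to process the JSON. Everything I have scraped so far used XPath on page elements fetched directly with a GET request. This is the first time I've run into a POST request, and I can't just feed the response into etree.HTML the way I do with a GET request.
I only started learning this recently, so I'm not sure whether I've explained it clearly.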
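For reference, here is a rough sketch of what I think handling the JSON might look like. response.json() returns ordinary Python dicts and lists, so in principle the values can be reached by key and index rather than XPath. The sample data and the field names used below ("result", "records", "title", "linkurl", "webdate") are assumptions for illustration only, not confirmed from the real response, and extract_records is a hypothetical helper.

```python
import json

# Hypothetical sample standing in for response.json(). The real field names
# may differ, so inspect the actual response before relying on these keys.
sample_text = '''
{"result": {"totalcount": 2, "records": [
  {"title": "Project A tender notice", "linkurl": "/jyxx/a.html", "webdate": "2022-10-11 09:00:00"},
  {"title": "Project B tender notice", "linkurl": "/jyxx/b.html", "webdate": "2022-11-02 14:30:00"}
]}}
'''

def extract_records(data_id):
    """Return (title, link, date) tuples from a parsed JSON dict."""
    # Fall back to empty containers if a key is missing or null.
    records = (data_id.get("result") or {}).get("records") or []
    return [(r.get("title"), r.get("linkurl"), r.get("webdate")) for r in records]

data_id = json.loads(sample_text)  # with requests: data_id = response.json()
rows = extract_records(data_id)
for title, link, date in rows:
    print(date, title, link)
```

To see which keys the real response actually contains, printing json.dumps(data_id, ensure_ascii=False, indent=2) shows the whole structure in readable form.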