Posted on 2021-5-12 16:22:11
I wrote you a version that saves to a txt file. If you need an xls version, you'll have to work that part out yourself.
- import re
- import requests
- from lxml import etree
- BASE_URL = "https://www.findchips.com/parametric/Resistors/Fixed%20Resistors?sort=false&page="
- with open("record.txt", "w", encoding="utf-8") as f:
-     for page in range(1, 51):
-         print(f"...............Processing page {page}............")
-         res = requests.get(BASE_URL + str(page))
-         tree = etree.HTML(res.text)
-         rows = tree.xpath("//table[@id='j-parametric-table']/tbody/tr")
-         for row in rows:
-             col = row.xpath("./td")
-             # Column 3: text of the div inside the link, then the span text if present
-             try:
-                 three = col[2].xpath("./a/div/text()")[0].strip() + " "
-             except IndexError:
-                 three = ""
-             try:
-                 three += col[2].xpath("./span/text()")[0].strip()
-             except IndexError:
-                 pass
-             # Column 5: link text
-             try:
-                 five = col[4].xpath("./a/text()")[0].strip()
-             except IndexError:
-                 five = ""
-             # Column 7: second text node of the cell
-             try:
-                 seven = col[6].xpath("./text()")[1].strip()
-             except IndexError:
-                 seven = ""
-             fields = [three, five, seven]
-             # Columns 8 onward: cell text with newlines removed and runs of spaces collapsed
-             for cell in col[7:]:
-                 # cell.text is None for an empty cell, so fall back to ""
-                 text = (cell.text or "").strip().replace("\n", "")
-                 fields.append(re.sub(" +", " ", text))
-             f.write(",".join(fields) + "\n")
- print("ok")
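As a starting point for the spreadsheet version: the script joins fields with plain commas, so a value that itself contains a comma would spill into an extra column. Below is a rough sketch that writes rows with Python's standard `csv` module instead, which quotes such fields. It produces CSV, which Excel can open, not a real .xls file. The `write_rows` helper and the sample row are made up for illustration and are not real scraped data.

```python
import csv
import io


def write_rows(rows, out):
    """Write each row (a list of strings) to `out` as one CSV line.

    csv.writer quotes any field that contains a comma or a quote,
    so the column layout survives values such as "1,000".
    """
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(row)


# Hypothetical sample row, only to show the quoting behaviour.
sample = [["ACME R-100 10k", "ACME", "1,000", "0.25 W"]]

buf = io.StringIO()
write_rows(sample, buf)
print(buf.getvalue(), end="")
```

To write a file instead of a buffer, something like `open("record.csv", "w", newline="", encoding="utf-8-sig")` should work. The `utf-8-sig` encoding adds a byte order mark, which usually helps Excel detect UTF-8.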