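Last edited by 我叫学得会 on 2021-12-19 11:50
Hello everyone. I have been teaching myself Python and tried to write a small tool, but I ran into a problem. Any pointers would be much appreciated, thanks!
Requirement: from the contents of txt1, pick out the lines whose status code is 200 (that is, lines that start with 200), and write only the URLs from those lines to a new file, txt2.
The contents of txt1 are: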
    200 147KB http://192.168.1.12:80/;admin/;admin/admin-post.html
    200 147KB http://192.168.1.12:80/;admin/;admin/admin-logout.html
    200 147KB http://192.168.1.12:80/;admin/;admin/admin-login.html
    400 173B http://192.168.1.12:80/;admin/;admin/cgi-bin/test
    200 147KB http://192.168.1.12:80/;admin/;admin/comment-admin.html
    200 147KB http://192.168.1.12:80/;admin/;admin/list
    200 736B http://192.168.1.12:80/;admin/;admin/robots.txt
    200 69B http://192.168.1.12:80/;admin/;admin/swagger-resources
    200 147KB http://192.168.1.12:80/;admin/;admin/swagger-ui.html
    200 147KB http://192.168.1.12:80/;admin/;admin/thank-you.html
    200 147KB http://192.168.1.12:80/;admin/;admin/tiki-admin.html
    400 173B http://192.168.1.12:80/;admin/;login/test
    400 0B http://192.168.1.12:80/;admin/;login/login.htm
    400 0B http://192.168.1.12:80/;admin/;login/a%5c.aspx
    200 147KB http://192.168.1.12:80/;admin/;login/admin-header.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-logout.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-post.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-footer.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-login.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-functions.html
    200 147KB http://192.168.1.12:80/;admin/;login/admin-odkazy.html
    400 173B http://192.168.1.12:80/;admin/;login/cgi-bin/test
    200 147KB http://192.168.1.12:80/;admin/;login/comment-admin.html
    200 147KB http://192.168.1.12:80/;admin/;login/list
    200 736B http://192.168.1.12:80/;admin/;login/robots.txt
    200 69B http://192.168.1.12:80/;admin/;login/swagger-resources
    200 147KB http://192.168.1.12:80/;admin/;login/swagger-ui.html
    200 147KB http://192.168.1.12:80/;admin/;login/thank-you.html
    200 147KB http://192.168.1.12:80/;admin/;login/tiki-admin.html
My attempts with search() and match() keep raising errors. What is the best way to do this?
    def Durl(rurl):
        findurl = re.compile(r"^200.*(?P<TEST>([a-zA-z]+://[^\s]*))$",re.S)
        resulin = findurl.search(rurl)
        resulin2 = resulin.group("TEST")
        print(resulin2)
        #print(resulin)

    # def Durl(rurl):
    #     findurl = re.compile(r"^200(.*B)(?P<url>.*)$",re.S)
    #     resulin = findurl.match(rurl)
    #     FPurl = resulin.group("url")
    #     print(FPurl)
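A note on why the attempt above probably fails, depending on how Durl is called. If it is fed one line at a time, search() and match() return None for every line that does not start with 200, and calling .group() on None raises AttributeError. Two smaller issues: the greedy `.*` in front of the named group swallows most of the scheme, so a per-line match would capture `p://...` rather than `http://...`, and `[a-zA-z]` also accepts the ASCII punctuation that sits between `Z` and `a`. Below is a minimal per-line sketch that checks for None first. The names `LINE_RE` and `extract_url` are made up for illustration, and the line layout `<status> <size> <url>` is assumed from the sample data.

```python
import re

# Assumed layout of each line: "<status> <size> <url>", as in the sample data.
# The columns are matched explicitly so nothing greedy can eat into the URL.
LINE_RE = re.compile(r"^200\s+\S+\s+(?P<url>[a-zA-Z]+://\S+)\s*$")


def extract_url(line):
    """Return the URL if the line starts with status 200, otherwise None."""
    m = LINE_RE.match(line)
    # match() returns None when the line does not fit, so test before .group().
    return m.group("url") if m else None
```

Calling `extract_url` on each line of txt1 and keeping only the results that are not None gives the list of URLs to write to txt2.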
Last edited by WaitOtherCutely on 2021-12-19 20:03
    import re

    # Path of the file to read, for example "D:\\a.txt"
    with open("D:\\a.txt") as readed_f:
        text = readed_f.read()

    # Path of the file to write the results to, for example "D:\\b.txt"
    with open("D:\\b.txt", 'w') as writed_f:
        writed_f.write("\n".join(re.findall(r"200.*\s(\S*://.*)", text)))
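One caveat about the pattern above: it is not anchored to the start of a line, so a 200 that shows up in the size column would also match. A line such as `400 200B http://...` does not occur in the sample data, but it would slip through. A possible tightening, offered only as a sketch, is to anchor each line with re.MULTILINE. The names `URL_200_RE`, `filter_200_urls` and `copy_200_urls` are hypothetical, and the `<status> <size> <url>` layout and UTF-8 encoding are assumptions.

```python
import re

# ^ and $ match at every line boundary because of re.MULTILINE.
# [ \t]+ is used instead of \s+ so a match can never span two lines.
URL_200_RE = re.compile(r"^200[ \t]+\S+[ \t]+(\S+://\S+)[ \t]*$", re.MULTILINE)


def filter_200_urls(text):
    """Return the URLs of all lines whose first column is exactly 200."""
    return URL_200_RE.findall(text)


def copy_200_urls(src_path, dst_path):
    """Read src_path and write one matching URL per line to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        urls = filter_200_urls(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(urls))
    return len(urls)
```

With the paths from the answer, the call would be `copy_200_urls("D:\\a.txt", "D:\\b.txt")`; it returns the number of URLs written.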
If this answer solved your problem, don't forget to mark it as the best answer or as accepted~