[Solved] Help! How can I recognize an image CAPTCHA?

飘花飞雪 · Posted 2020-7-19 14:41:01


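For CAPTCHAs like this one, Tesseract-OCR fails to recognize the text no matter how I adjust the image.

Could someone point me to a good way to recognize image CAPTCHAs? The CAPTCHA on the site I need is very simple, so the simpler the method, the better.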


Hello. · Posted 2020-7-19 14:45:42

Link 1
Link 2
Link 3 (recommended)

飘花飞雪 · Posted 2020-7-19 15:02:56

Hello. posted on 2020-7-19 14:45
Link 1
Link 2
Link 3 (recommended)

Hey, the recommended one can't recognize this numeric CAPTCHA.

Hello. · Posted 2020-7-19 15:03:26

飘花飞雪 posted on 2020-7-19 15:02
Hey, the recommended one can't recognize this numeric CAPTCHA.

There are two more links to try.

飘花飞雪 · Posted 2020-7-19 15:06:26

Hello. posted on 2020-7-19 14:45
Link 1
Link 2
Link 3 (recommended)

After adjusting the threshold the image is already very clear, but it still isn't recognized.

飘花飞雪 · Posted 2020-7-19 15:07:18

from urllib.request import urlretrieve

import pytesseract
from PIL import Image

# Path to the local Tesseract executable (adjust for your machine)
pytesseract.pytesseract.tesseract_cmd = r"F:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

# Download the CAPTCHA and convert it to grayscale
urlretrieve("http://wt.yygjj.com.cn/captcha.jpg", "test.jpg")
image = Image.open("test.jpg")
image = image.convert("L")

# Lookup table for binarization: below the threshold maps to 0, everything else to 1
threshold = 135
table = []
for i in range(256):
    if i < threshold:
        table.append(0)
    else:
        table.append(1)

# Apply the table and convert to a 1-bit image
image = image.point(table, "1")
image.show()

# Run OCR on the binarized image and print whatever Tesseract returns
text = pytesseract.image_to_string(image)
print(text)
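
A possible next step for the script above, offered as a minimal sketch rather than a confirmed fix for this particular CAPTCHA: Tesseract's default page segmentation expects a page of text, so a tiny single-line image often comes back empty. Passing `--psm 7` (treat the image as one text line) and a digit whitelist through the `config` argument of `pytesseract.image_to_string`, plus enlarging the image first, is a common thing to try. The helper names `build_tesseract_config` and `read_captcha`, the `scale=3` factor, and the assumption that the CAPTCHA contains only digits are mine, not from the thread.

```python
def build_tesseract_config(psm=7, whitelist="0123456789"):
    """Build a Tesseract option string for a short, single-line CAPTCHA.

    --psm 7 asks Tesseract to treat the image as one text line;
    tessedit_char_whitelist restricts the output to the given characters.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_captcha(path, threshold=135, scale=3):
    """Binarize and upscale the image at `path`, then run OCR on it.

    Assumes Pillow, pytesseract and the Tesseract binary are installed;
    the imports are local so build_tesseract_config works without them.
    """
    from PIL import Image
    import pytesseract

    image = Image.open(path).convert("L")
    # Larger glyphs are often easier for Tesseract, so enlarge first.
    image = image.resize((image.width * scale, image.height * scale))
    # Same thresholding idea as the script above: dark -> 0, light -> 255.
    image = image.point([0 if i < threshold else 255 for i in range(256)])
    text = pytesseract.image_to_string(image, config=build_tesseract_config())
    return text.strip()


if __name__ == "__main__":
    # Only shows the option string; call read_captcha("test.jpg") to run OCR.
    print(build_tesseract_config())
```

With the downloaded file from the script above, `read_captcha("test.jpg")` would be the call to try. The threshold and scale values are guesses that would need tuning against real samples.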


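Hello. · Posted 2020-7-19 15:07:38

飘花飞雪 posted on 2020-7-19 15:06
After adjusting the threshold the image is already very clear, but it still isn't recognized.

It doesn't even print anything?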

飘花飞雪 · Posted 2020-7-19 15:07:53

Hello. posted on 2020-7-19 14:45
Link 1
Link 2
Link 3 (recommended)

I've pasted my code. Could you try running it?

飘花飞雪 · Posted 2020-7-19 15:08:32

Hello. posted on 2020-7-19 15:07
It doesn't even print anything?

Nothing is printed, and running Tesseract directly from CMD doesn't recognize it either.

Hello. · Posted 2020-7-19 15:13:46

No luck, haha. Take a look at the second article.

飘花飞雪 · Posted 2020-7-19 15:16:42

Hello. posted on 2020-7-19 15:13
No luck, haha. Take a look at the second article.

I'll give it a try.

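飘花飞雪 · Posted 2020-7-19 15:17:51

Hello. posted on 2020-7-19 15:13
No luck, haha. Take a look at the second article.

It's about the same as the third one. I've already tried all of those methods.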

Hello. · Posted 2020-7-19 15:19:48

Last edited by Hello. on 2020-7-19 15:22

It looks like you didn't binarize the image.

img = binarizing(imgry, 130)  # binarize

Take another look at this one.

飘花飞雪 · Posted 2020-7-19 22:56:50

Hello. posted on 2020-7-19 15:19
It looks like you didn't binarize the image.

Isn't .point also binarization? Which module is your function from?
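
To make the point in this reply concrete, here is a small sketch in plain Python, with no image library involved. It compares a lookup-table mapping (the idea behind calling `.point` with a 256-entry table) against a per-pixel comparison loop. As far as I can tell, `binarizing` is not a Pillow or OpenCV function; it is presumably a helper defined in one of the linked articles, so `binarize_with_loop` below is only a hypothetical stand-in for it, and the pixel values are made up.

```python
def build_table(threshold):
    """256-entry lookup table: values below threshold -> 0, others -> 1."""
    return [0 if value < threshold else 1 for value in range(256)]


def binarize_with_table(pixels, threshold):
    """Binarize by table lookup, the way a point-style mapping works."""
    table = build_table(threshold)
    return [table[p] for p in pixels]


def binarize_with_loop(pixels, threshold):
    """Binarize by comparing each pixel directly (hypothetical helper)."""
    result = []
    for p in pixels:
        result.append(0 if p < threshold else 1)
    return result


if __name__ == "__main__":
    # Hypothetical grayscale values (0-255) standing in for one image row.
    row = [12, 134, 135, 200, 255, 90]
    print(binarize_with_table(row, 135))  # [0, 0, 1, 1, 1, 0]
    print(binarize_with_loop(row, 135))   # [0, 0, 1, 1, 1, 0]
```

Both functions give the same result for any threshold, which suggests the script posted earlier already binarizes the image and that a missing binarization step is unlikely to be what stops the OCR.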

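Hello. · Posted 2020-7-19 22:57:48

This best answer was given by Hello. Thanks to Hello. for the reply.

飘花飞雪 posted on 2020-7-19 22:56
Isn't .point also binarization? Which module is your function from?

OK, I probably missed that. I'm out of ideas.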

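飘花飞雪 · Posted 2020-7-19 22:59:20

Hello. posted on 2020-7-19 15:19
It looks like you didn't binarize the image.

I've read that one too. The test site's CAPTCHA can be read even without binarization.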

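Hello. · Posted 2020-7-19 23:00:36

飘花飞雪 posted on 2020-7-19 22:59
I've read that one too. The test site's CAPTCHA can be read even without binarization.

Haha, binarization makes the image clearer.
I have some image-processing code. Want to take a look? (It may not be much use, though.)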

飘花飞雪 · Posted 2020-7-19 23:01:42

OK, I'll use it for reference.

Hello. · Posted 2020-7-19 23:01:53

# -*- coding:utf-8 -*-
import cv2
import numpy as np
from tkinter import filedialog, Tk
from os import getcwd
from os.path import basename, join


def open_path():
    # Ask the user to pick an image file and return its path
    root = Tk()
    root.withdraw()
    file_path = filedialog.askopenfilename(title='Select an image file', filetypes=[('All Files', '*')])
    return file_path


def dodgeNaive(image, mask):
    # determine the shape of the input image (rows, columns)
    height, width = image.shape[:2]
    # prepare output argument with same size as image
    blend = np.zeros((height, width), np.uint8)
    for row in range(height):
        for col in range(width):
            # do for every pixel
            if mask[row, col] == 255:
                # avoid division by zero
                blend[row, col] = 255
            else:
                # shift image pixel value by 8 bits and divide by the
                # inverse of the mask pixel (plain ints avoid uint8 overflow)
                tmp = (int(image[row, col]) << 8) // (255 - int(mask[row, col]))
                # make sure resulting value stays within bounds
                blend[row, col] = min(tmp, 255)
    return blend


def dodgeV2(image, mask):
    return cv2.divide(image, 255 - mask, scale=256)


def burnV2(image, mask):
    return 255 - cv2.divide(255 - image, 255 - mask, scale=256)


def rgb_to_sketch(src_image_name, dst_image_name):
    print('Converting......')
    img_rgb = cv2.imread(src_image_name)
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    # Alternatively, convert to grayscale while reading the file:
    # img_gray = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)
    img_gray_inv = 255 - img_gray
    img_blur = cv2.GaussianBlur(img_gray_inv, ksize=(21, 21),
                                sigmaX=0, sigmaY=0)
    img_blend = dodgeV2(img_gray, img_blur)
    # cv2.imshow('original', img_rgb)
    # cv2.imshow('gray', img_gray)
    # cv2.imshow('gray_inv', img_gray_inv)
    # cv2.imshow('gray_blur', img_blur)
    cv2.imwrite(dst_image_name, img_blend)
    save_path = join(getcwd(), dst_image_name)  # output path
    print('Done!!!\n')
    print('Saved to: ' + save_path)
    cv2.imshow(save_path, img_blend)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


if __name__ == '__main__':
    print('Please select an image (the path must not contain Chinese characters):')
    src_image_name = open_path()  # source file path
    print(src_image_name + '\n')
    image_name = basename(src_image_name)  # file name without the directory
    dst_image_name = 'Sketch_' + image_name
    rgb_to_sketch(src_image_name, dst_image_name)


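飘花飞雪 · Posted 2020-7-19 23:02:30

Hello. posted on 2020-7-19 23:00
Haha, binarization makes the image clearer.
I have some image-processing code. Want to take a look? (It may not be much use, though.)

Thanks, I'll use it for reference.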
