The code is as follows:
# coding: utf-8
from PIL import Image
import pytesseract

def test():
    # Open the CAPTCHA image and run OCR on it
    im = Image.open(r"pic.gif")
    vcode = pytesseract.image_to_string(im)
    print(vcode)
Running the code above to recognize a simple CAPTCHA raises an exception:
Traceback (most recent call last):
  File "D:\test\vcode.py", line 15, in <module>
    main()
  File "D:\test\vcode.py", line 9, in main
    test()
  File "D:\test\test.py", line 8, in test
    vcode = pytesseract.image_to_string(im)
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 143, in image_to_string
  File "D:\Program Files (x86)\Python\Python27\lib\site-packages\PIL\Image.py", line 1749, in split
    self.load()
  File "D:\Program Files (x86)\Python\Python27\lib\site-packages\PIL\ImageFile.py", line 232, in load
    "(%d bytes not processed)" % len(b))
IOError: image file is truncated (5 bytes not processed)
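The message says the decoder ran out of data before it finished the image, so it can be worth checking whether pic.gif really is incomplete (for example, a CAPTCHA download that was cut short). Below is a rough sketch of such a check using only the standard library. The names `looks_complete_gif` and `check_file` are hypothetical helpers, not part of PIL or pytesseract, and the test is only a heuristic: it looks for the GIF signature at the start and the GIF trailer byte 0x3B at the end, so a file with extra bytes after the trailer, or a truncated file that happens to end in 0x3B, will be misjudged.

```python
# Hypothetical helpers for illustration; not part of PIL or pytesseract.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")
GIF_TRAILER = b"\x3b"  # a GIF data stream is terminated by this single byte


def looks_complete_gif(data):
    """Heuristic: data starts with a GIF signature and ends with the trailer byte."""
    return data[:6] in GIF_SIGNATURES and data[-1:] == GIF_TRAILER


def check_file(path):
    """Read a file from disk and apply the heuristic above."""
    with open(path, "rb") as f:
        return looks_complete_gif(f.read())
```

If `check_file(r"pic.gif")` returns False, the image was probably saved or downloaded incompletely, and fetching it again may be a better cure than suppressing the error.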
The fix is to add the following two lines of code:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
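The final, complete code is:
# coding: utf-8
from PIL import Image
import pytesseract
from PIL import ImageFile

# Tell PIL to load images even when the file data is truncated
ImageFile.LOAD_TRUNCATED_IMAGES = True

def test():
    # Open the CAPTCHA image and run OCR on it
    im = Image.open(r"pic.gif")
    vcode = pytesseract.image_to_string(im)
    print(vcode)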
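One thing to keep in mind is that LOAD_TRUNCATED_IMAGES is a module-level flag, so once it is set, every image loaded afterwards in the same process is affected. Here is a minimal sketch of limiting it to a single block of code. `allow_truncated_images`, `make_truncated_bmp` and `can_load` are hypothetical names introduced for this illustration, and the demo uses a small in-memory BMP rather than the GIF from this post, only because a truncated BMP is easy to produce without any file on disk.

```python
import io
from contextlib import contextmanager

from PIL import Image, ImageFile


@contextmanager
def allow_truncated_images():
    """Hypothetical helper: enable LOAD_TRUNCATED_IMAGES only inside the with block."""
    previous = ImageFile.LOAD_TRUNCATED_IMAGES
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    try:
        yield
    finally:
        # Restore whatever value the flag had before, even if loading failed
        ImageFile.LOAD_TRUNCATED_IMAGES = previous


def make_truncated_bmp():
    """Build a small BMP in memory and cut off its last 500 bytes."""
    buf = io.BytesIO()
    Image.new("RGB", (32, 32), (255, 0, 0)).save(buf, format="BMP")
    return buf.getvalue()[:-500]


def can_load(data):
    """Return True if PIL decodes the image data without raising IOError."""
    try:
        Image.open(io.BytesIO(data)).load()
        return True
    except IOError:
        return False


data = make_truncated_bmp()
print(can_load(data))          # flag off: loading the truncated data
with allow_truncated_images():
    print(can_load(data))      # flag on: same data, inside the block
```

In the code from this post, the same idea would mean wrapping only the `pytesseract.image_to_string(im)` call in `with allow_truncated_images():`, so that other images in the program still report truncation as an error.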
Related articles:
Some thoughts on CAPTCHA recognition with Python: http://www.cnblogs.com/xiaowuyi/archive/2012/09/10/2675286.html
Recognizing text in images with Python's pytesser module: http://www.jinglingshu.org/?p=9281
Two Python approaches to recognizing characters in CAPTCHA images: http://vipscu.blog.163.com/blog/static/18180837220134234528457/
Simulating login with Python, part 1: keeping the CAPTCHA and cookies in sync: http://www.dabu.info/python-login-crawler-captcha-cookies.html
Original post: http://www.cnblogs.com/hongfei/p/4436767.html