FishC Forum (鱼C论坛)


Error reading a PDF file: AttributeError: 'PDFDocument' object has no attribute 'seek'

Posted on 2020-11-29 17:42:07

from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfdevice import PDFDevice
from pdfminer.pdfinterp import PDFResourceManager,PDFPageInterpreter
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LTTextBoxHorizontal,LAParams
from pdfminer.pdfpage import PDFTextExtractionNotAllowed

pdf0 = open('E:\\齐大胖\\参考文献\\仅由两条供气管线驱动的气动双腔细管尺蠖机构.pdf','rb')

parser = PDFParser(pdf0)
doc = PDFDocument(parser)

parser.set_document(doc)


resources = PDFResourceManager()
laparam = LAParams()

device = PDFPageAggregator(resources,laparam)

interpreter = PDFPageInterpreter(resources,device)

for i,page in PDFPage.get_pages(doc):
    interpreter.process_page(page)
    layout = device.get_result()

    for out in layout:
        if hasattr(out,'get_text'):
            print(out.get_text())


Traceback (most recent call last):
  File "E:\齐大胖\try.py", line 25, in <module>
    for i,page in PDFPage.get_pages(doc):
  File "E:\PYTHON\lib\site-packages\pdfminer\pdfpage.py", line 120, in get_pages
    parser = PDFParser(fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\pdfparser.py", line 43, in __init__
    PSStackParser.__init__(self, fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 515, in __init__
    PSBaseParser.__init__(self, fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 169, in __init__
    self.seek(0)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 527, in seek
    PSBaseParser.seek(self, pos)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 199, in seek
    self.fp.seek(pos)
AttributeError: 'PDFDocument' object has no attribute 'seek'
Want to know what 小甲鱼 has been up to lately? Visit ilovefishc.com
Posted on 2020-11-29 19:35:00
Traceback (most recent call last):
  File "E:\齐大胖\try.py", line 25, in <module>        # the error is on line 25
    for i,page in PDFPage.get_pages(doc):           # the offending code
  File "E:\PYTHON\lib\site-packages\pdfminer\pdfpage.py", line 120, in get_pages
    parser = PDFParser(fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\pdfparser.py", line 43, in __init__
    PSStackParser.__init__(self, fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 515, in __init__
    PSBaseParser.__init__(self, fp)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 169, in __init__
    self.seek(0)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 527, in seek
    PSBaseParser.seek(self, pos)
  File "E:\PYTHON\lib\site-packages\pdfminer\psparser.py", line 199, in seek
    self.fp.seek(pos)
AttributeError: 'PDFDocument' object has no attribute 'seek'  # attribute error: the 'PDFDocument' object has no 'seek' attribute
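
Reading the traceback from the bottom up: PDFPage.get_pages() builds its own PDFParser from whatever it is given (pdfpage.py, line 120), and that parser immediately calls seek(0) on it. So get_pages() appears to expect the open file object (pdf0), not the PDFDocument. Below is a minimal sketch of one possible fix, assuming the installed package is pdfminer.six. It uses PDFPage.create_pages(doc), which takes a PDFDocument. Two other details in the posted code would probably fail next: the loop unpacks `i,page` although pages are yielded one at a time, and LAParams is passed positionally to PDFPageAggregator, where the second positional parameter is pageno. The function name iter_text_boxes and the path example.pdf are made up for illustration.

```python
def iter_text_boxes(pdf_path):
    """Yield the text of each horizontal text box in the PDF at pdf_path."""
    # Imported inside the function so this sketch can be loaded even where
    # pdfminer.six is not installed (an assumption about the environment).
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.layout import LAParams, LTTextBoxHorizontal
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage
    from pdfminer.pdfparser import PDFParser

    with open(pdf_path, "rb") as fp:
        parser = PDFParser(fp)
        doc = PDFDocument(parser)

        resources = PDFResourceManager()
        # Pass laparams by keyword: the second positional parameter is pageno.
        device = PDFPageAggregator(resources, laparams=LAParams())
        interpreter = PDFPageInterpreter(resources, device)

        # create_pages() accepts a PDFDocument; get_pages() expects the open file.
        for page in PDFPage.create_pages(doc):
            interpreter.process_page(page)
            layout = device.get_result()
            for element in layout:
                if isinstance(element, LTTextBoxHorizontal):
                    yield element.get_text()
```

To use it, loop over the generator with a real file, for example `for text in iter_text_boxes("example.pdf"): print(text)`. If you would rather keep get_pages(), passing the file object should also work: `for page in PDFPage.get_pages(pdf0):`.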
Original poster | Posted on 2020-12-10 09:26:53