I created a file named hello.html:
- <div>
- <ul>
- <li class="item-0"><a href="link1.html">first item</a></li>
- <li class="item-1"><a href="link2.html">second item</a></li>
- <li class="item-inactive"><a href="link3.html">third item</a></li>
- <li class="item-1"><a href="link4.html">fourth item</a></li>
- <li class="item-0"><a href="link5.html">fifth item</a>
- </ul>
- </div
The code that reads the file:
- from lxml import etree
- html = etree.parse("hello.html")
- result = etree.tostring(html,pretty_print = ture)
- print(result)
Why do I get the following error?
- Traceback (most recent call last):
- File "C:/Users/Administrator/AppData/Local/Programs/Python/Python38/爬虫学习5.py", line 2, in <module>
- html = etree.parse("hello.html")
- File "src\lxml\etree.pyx", line 3521, in lxml.etree.parse
- File "src\lxml\parser.pxi", line 1839, in lxml.etree._parseDocument
- File "src\lxml\parser.pxi", line 1865, in lxml.etree._parseDocumentFromURL
- File "src\lxml\parser.pxi", line 1769, in lxml.etree._parseDocFromFile
- File "src\lxml\parser.pxi", line 1163, in lxml.etree._BaseParser._parseDocFromFile
- File "src\lxml\parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
- File "src\lxml\parser.pxi", line 711, in lxml.etree._handleParseResult
- File "src\lxml\parser.pxi", line 640, in lxml.etree._raiseParseError
- File "hello.html", line 8
- lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: li line 7 and ul, line 8, column 11
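This post was last edited by Twilight6 on 2020-7-4 10:21
The opening and closing tags do not match: line 07 is missing its closing </li>, and line 09 is missing the final >. The corrected file: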
- <div>
- <ul>
- <li class="item-0"><a href="link1.html">first item</a></li>
- <li class="item-1"><a href="link2.html">second item</a></li>
- <li class="item-inactive"><a href="link3.html">third item</a></li>
- <li class="item-1"><a href="link4.html">fourth item</a></li>
- <li class="item-0"><a href="link5.html">fifth item</a></li>
- </ul>
- </div>
Also, the ture in your code should be changed to True, right?
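Putting both fixes together, a minimal sketch might look like the following. It assumes lxml is installed and, to stay self-contained, parses the markup from strings instead of reading hello.html; the names FIXED_HTML and BROKEN_HTML are only for illustration. It also shows one possible alternative that the thread does not mention: lxml's HTML parser (etree.HTML) is lenient and should accept the original broken markup, whereas the default XML parser rejects it.

```python
from lxml import etree

# Corrected markup: the fifth <li> is closed and </div> ends with '>'.
FIXED_HTML = """<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a></li>
</ul>
</div>"""

# Original markup from the question, deliberately left broken.
BROKEN_HTML = """<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a>
</ul>
</div"""

# The default (strict XML) parser accepts the well-formed version.
root = etree.fromstring(FIXED_HTML)
# tostring() returns bytes, so decode before printing; note True, not ture.
result = etree.tostring(root, pretty_print=True).decode("utf-8")
print(result)

# The same strict parser rejects the broken version.
xml_error = None
try:
    etree.fromstring(BROKEN_HTML)
except etree.XMLSyntaxError as exc:
    xml_error = exc
    print("XML parser failed:", exc)

# The HTML parser tolerates malformed input and repairs the tree.
recovered = etree.HTML(BROKEN_HTML)
items = recovered.xpath("//li/a/text()")
print(items)
```

When reading from a file instead, the equivalent of the lenient path would be etree.parse("hello.html", etree.HTMLParser()). Fixing the file itself is still the cleaner option if you control it.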