File system: after-class exercise

Tihool · posted on 2022-3-2 16:20:06

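Why do both my code and the code from 小甲鱼's after-class exercise raise an error? 小甲鱼's code does manage to read one more file than mine, though.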

Tihool · posted on 2022-3-2 16:20:39

As soon as I pass encoding, it raises an error immediately; the file can't even be opened.

python爱好者. · posted on 2022-3-2 16:25:28

Tihool posted on 2022-3-2 16:20
As soon as I pass encoding, it raises an error immediately; the file can't even be opened.

The py file can't be decoded as utf8.
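To make that failure concrete, here is a minimal sketch. It assumes the file was saved as GBK, which the thread does not confirm; `try_decode` is a hypothetical helper, not part of the exercise code.

```python
# Assumption: the text was saved as GBK, as many Windows editors do by default.
gbk_bytes = "关键字".encode("gbk")


def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in this encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(gbk_bytes, "utf-8")  # None: these bytes are not valid UTF-8
as_gbk = try_decode(gbk_bytes, "gbk")     # round-trips to the original text
print(as_utf8 is None, as_gbk == "关键字")
```

Passing `encoding='utf-8'` to `open()` performs the same decoding step when the file is read, which is why the read fails when the bytes on disk are in a different encoding.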

Tihool · posted on 2022-3-2 16:45:21

python爱好者. posted on 2022-3-2 16:25
The py file can't be decoded as utf8.

Could you tell me what is wrong with the code in the screenshot?

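python爱好者. · posted on 2022-3-2 17:30:22

Tihool posted on 2022-3-2 16:45
Could you tell me what is wrong with the code in the screenshot?

Sometimes gbk cannot decode every file. Have a look at the note on line 17 of the code for hands-on exercise 4; you can handle it that way!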
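The note referred to above is not reproduced in this thread, so the following is only a sketch of one common workaround and not necessarily what the exercise suggests: try several candidate encodings in order. The helper name `decode_with_fallback` and the candidate list are assumptions.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gb18030")):
    """Try each candidate encoding in order; return (text, encoding_used)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first candidate and mark bad bytes with U+FFFD.
    return raw.decode(encodings[0], errors="replace"), None


text, used = decode_with_fallback("关键字".encode("gbk"))
print(used)  # the first candidate that accepted these bytes
```

UTF-8 goes first because it is strict: non-UTF-8 bytes usually fail quickly, whereas a GB-family codec may accept UTF-8 bytes and silently produce the wrong characters.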

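isdkz · posted on 2022-3-2 18:33:10

The code itself is not wrong. The error occurs because the files in your directory are not necessarily all saved in the same encoding.

You can take a look at my earlier post: https://fishc.com.cn/forum.php?m ... 057&pid=5736828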
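One way to check whether mixed encodings are really the cause is to list which files fail under a single fixed encoding. This is a diagnostic sketch only; `files_failing` and the temporary demo directory are hypothetical and not taken from the linked post.

```python
import os
import tempfile


def files_failing(root, encoding, suffix=".txt"):
    """Return the paths under root whose bytes are not valid in `encoding`."""
    failing = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            if os.path.splitext(name)[1] != suffix:
                continue
            path = os.path.join(folder, name)
            with open(path, "rb") as f:
                raw = f.read()
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                failing.append(path)
    return sorted(failing)


# Demo: two files with the same text, saved in different encodings.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.txt"), "wb") as f:
        f.write("关键字".encode("utf-8"))
    with open(os.path.join(demo_dir, "b.txt"), "wb") as f:
        f.write("关键字".encode("gbk"))
    bad_for_gbk = [os.path.basename(p) for p in files_failing(demo_dir, "gbk")]
    bad_for_utf8 = [os.path.basename(p) for p in files_failing(demo_dir, "utf-8")]

print(bad_for_gbk, bad_for_utf8)
```

If both lists are non-empty for your own folder, no single `encoding=` argument can work for every file, and the encoding has to be chosen per file.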

python爱好者. · posted on 2022-3-2 18:51:36

Right.

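Tihool · posted on 2022-3-2 19:45:09

isdkz posted on 2022-3-2 18:33
The code itself is not wrong. The error occurs because the files in your directory are not necessarily all saved in the same encoding.

You can take a look at my earlier post: https://fi ...

UnicodeDecodeError: 'gb2312' codec can't decode byte 0xf4 in position 126: illegal multibyte sequence
Both the code from that post and my own modified code produce this error message.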

isdkz · posted on 2022-3-3 07:13:31

Tihool posted on 2022-3-2 19:45
UnicodeDecodeError: 'gb2312' codec can't decode byte 0xf4 in position 126: illegal multibyte seque ...

Please post the traceback you get from the code in that post so I can take a look.

Tihool · posted on 2022-3-4 20:52:16

isdkz posted on 2022-3-3 07:13
Please post the traceback you get from the code in that post so I can take a look.

Traceback (most recent call last):
  File "F:\python作业库\文件\TEST (2).py", line 102, in <module>
    search_files(key, detail)
  File "F:\python作业库\文件\TEST (2).py", line 92, in search_files
    key_dict = search_in_file(each_txt_file, key)
  File "F:\python作业库\文件\TEST (2).py", line 65, in search_in_file
    text = raw.decode(chardet.detect(raw)['encoding'])  # add this line
UnicodeDecodeError: 'gb2312' codec can't decode byte 0xf4 in position 126: illegal multibyte sequence
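One possible cause of this traceback (an assumption, since the file itself is not shown) is that the detector reported the narrow gb2312 while the file contains characters that only exist in its supersets GBK or GB18030. The sketch below widens the detected name before decoding; `SUPERSETS` and `decode_detected` are hypothetical names, and no detection library is needed to run it.

```python
# Hypothetical mapping: widen legacy Chinese codec names to their superset.
SUPERSETS = {"gb2312": "gb18030", "gbk": "gb18030"}


def decode_detected(raw, detected, default="utf-8"):
    """Decode raw bytes using a detector's guess, widened where possible."""
    name = (detected or default).lower()  # detectors may return None
    name = SUPERSETS.get(name, name)
    return raw.decode(name, errors="replace")


raw = "龍".encode("gbk")  # a traditional character that GBK has and GB2312 lacks
try:
    raw.decode("gb2312")
    strict_gb2312_ok = True
except UnicodeDecodeError:
    strict_gb2312_ok = False

text = decode_detected(raw, "GB2312")
print(strict_gb2312_ok, text == "龍")
```

Because GB18030 extends GBK, which in turn extends GB2312, widening the name does not change how valid GB2312 text decodes; it only accepts more byte sequences.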

isdkz · posted on 2022-3-4 20:56:43

Last edited by isdkz on 2022-3-4 20:59

Tihool posted on 2022-3-4 20:52
Traceback (most recent call last):
File "F:\python作业库\文件\TEST (2).py", line 102, in
...

Try this code. The cchardet library detects encodings both more accurately and faster than the chardet library.

import os
import cchardet  # add this line


def print_pos(key_dict):
    keys = key_dict.keys()
    keys = sorted(keys)  # sort the line numbers so they print in ascending order
    for each_key in keys:
        print('Keyword found on line %s, at position(s) %s.' % (each_key, str(key_dict[each_key])))


def pos_in_line(line, key):
    pos = []
    begin = line.find(key)
    while begin != -1:
        pos.append(begin + 1)  # users count from 1
        begin = line.find(key, begin + 1)  # continue searching from the next position
    return pos


def search_in_file(file_name, key):
    with open(file_name, 'rb') as f:  # change here: read raw bytes
        raw = f.read()  # add this line
    if not raw:  # add this line: skip empty files
        return {}  # add this line
    # cchardet can return None as the encoding, and its guess can be too narrow,
    # so fall back to utf-8 and replace undecodable bytes instead of raising
    encoding = cchardet.detect(raw)['encoding'] or 'utf-8'
    text = raw.decode(encoding, errors='replace')  # add this line
    lines = text.splitlines()
    count = 0  # line counter
    key_dict = dict()  # maps each line number to the positions of key on that line
    for each_line in lines:  # iterate over lines instead of f
        count += 1
        if key in each_line:
            pos = pos_in_line(each_line, key)  # positions of key on this line
            key_dict[count] = pos
    # f.close()  # delete this line; the with block closes the file
    return key_dict


def search_files(key, detail):
    all_files = os.walk(os.getcwd())
    txt_files = []
    for i in all_files:
        for each_file in i[2]:
            if os.path.splitext(each_file)[1] == '.txt':  # treat .txt files as text files
                each_file = os.path.join(i[0], each_file)
                txt_files.append(each_file)
    for each_txt_file in txt_files:
        key_dict = search_in_file(each_txt_file, key)
        if key_dict:
            print('================================================================')
            print('Found keyword [%s] in file [%s]' % (key, each_txt_file))
            if detail.lower() == 'yes':
                print_pos(key_dict)


key = input('Place this script in the folder to search, then enter a keyword: ')
detail = input('Print the exact positions of keyword [%s] in each file (YES/NO)? ' % key)
search_files(key, detail)


Tihool · posted on 2022-3-4 21:24:21

isdkz posted on 2022-3-4 20:56
Try this code. The cchardet library detects encodings both more accurately and faster than the chardet library.

This one works, thanks.

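isdkz · posted on 2022-3-4 21:26:51

Tihool posted on 2022-3-4 21:24
This one works, thanks.

You're welcome. If you can, a rating for the answer would be a nice bit of encouragement.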
