[Solved] What is the default encoding in Python 3?

一模一样早 · Posted 2017-8-4 02:17:20


Is the default encoding of strings Unicode?


小甲鱼 · Posted 2017-8-4 02:37:21

By international convention, utf-8.

Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
>>>
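
As a rough illustration of what this value governs: in Python 3, `str.encode()` and `bytes.decode()` fall back to UTF-8 when no codec is passed. A minimal sketch; the variable names are only for illustration and the sample text is borrowed from later in this thread.

```python
import sys

text = "字符串默认编码"

# With no argument, str.encode() and bytes.decode() use UTF-8 in Python 3.
default_bytes = text.encode()
explicit_bytes = text.encode("utf-8")

print(sys.getdefaultencoding())        # 'utf-8'
print(default_bytes == explicit_bytes) # True
print(default_bytes.decode() == text)  # True
```

Note that this is about converting between str and bytes, not about how a str is laid out in memory.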


新手·ing · Posted 2017-8-4 06:44:45

小甲鱼 posted 2017-8-4 02:37
By international convention, utf-8.

You're up this early?

冬雪雪冬 · Posted 2017-8-4 08:13:42

新手·ing posted 2017-8-4 06:44
You're up this early?

At that hour he hadn't gone to bed yet.

新手·ing · Posted 2017-8-4 08:16:49

冬雪雪冬 posted 2017-8-4 08:13
At that hour he hadn't gone to bed yet.

He stays up that late every day!?

冬雪雪冬 · Posted 2017-8-4 08:19:05

新手·ing posted 2017-8-4 08:16
He stays up that late every day!?

The experts all stay up late.

新手·ing · Posted 2017-8-4 08:19:51

冬雪雪冬 posted 2017-8-4 08:19
The experts all stay up late.

That's bad for your health...

SixPy · Posted 2017-8-4 11:35:09 (best answer)

Strings
A string is a sequence of values that represent Unicode code points. All the code points in the range U+0000 - U+10FFFF can be represented in a string. Python doesn’t have a char type; instead, every code point in the string is represented as a string object with length 1. The built-in function ord() converts a code point from its string form to an integer in the range 0 - 10FFFF; chr() converts an integer in the range 0 - 10FFFF to the corresponding length 1 string object. str.encode() can be used to convert a str to bytes using the given text encoding, and bytes.decode() can be used to achieve the opposite.
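
The conversions described in that passage can be tried directly. A minimal sketch, using one character from this thread as sample data; the variable names are only illustrative.

```python
# Round trip between a character, its code point, and its bytes.
ch = "字"
code_point = ord(ch)             # 23383, i.e. 0x5B57
assert chr(code_point) == ch

# The top of the range quoted above is still a length-1 string.
last = chr(0x10FFFF)
assert len(last) == 1

# str -> bytes -> str with an explicit codec.
data = ch.encode("utf-8")        # b'\xe5\xad\x97', three bytes
assert data.decode("utf-8") == ch
```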


新手·ing · Posted 2017-8-4 11:39:44

SixPy posted 2017-8-4 11:35

Isn't it UTF-8?

SixPy · Posted 2017-8-4 12:16:24

新手·ing posted 2017-8-4 11:39
Isn't it UTF-8?

>>> import sys;sys.getdefaultencoding()
'utf-8'
>>> s='字符串默认编码'
>>> bui=bytes(s,'unicode_internal')  # codec deprecated since 3.3, removed in Python 3.8
>>> bui
b'W[&{2N\xd8\x9e\xa4\x8b\x16\x7f\x01x'
>>> u16le=bytes(s,'utf_16_le')
>>> u16le
b'W[&{2N\xd8\x9e\xa4\x8b\x16\x7f\x01x'
>>> bui == u16le
True
>>> s[0]
'字'
>>> bui[0],bui[1]
(87, 91)
>>> chr(bui[1]*256 + bui[0])
'字'
>>>
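
The `unicode_internal` codec no longer exists in Python 3.8 and later, so the session above will not run there. A sketch of the same byte-pair arithmetic using only the `utf-16-le` codec; it assumes every character is in the Basic Multilingual Plane (true for this sample string), since characters above U+FFFF would be encoded as surrogate pairs.

```python
s = "字符串默认编码"

# UTF-16-LE stores each of these characters as two bytes, low byte first.
u16le = s.encode("utf-16-le")

# Rebuild each code point from its little-endian byte pair.
rebuilt = "".join(
    chr(int.from_bytes(u16le[i:i + 2], "little"))
    for i in range(0, len(u16le), 2)
)

print(u16le[:2])      # b'W[' (0x57, 0x5B), i.e. U+5B57
print(rebuilt == s)   # True
```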


SixPy · Posted 2017-8-4 12:21:54

小甲鱼 posted 2017-8-4 02:37
By international convention, utf-8.

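getdefaultencoding() does not tell you how a str is stored. It returns the default codec that str.encode() and bytes.decode() use when no encoding is given. Python 3 also reads .py source files as UTF-8 by default, but that is a separate setting.

A Python 3 str is a sequence of Unicode code points rather than bytes in one fixed encoding. Since CPython 3.3 (PEP 393) it is stored in memory with 1, 2, or 4 bytes per code point, depending on the widest character in the string. The UTF-16-LE match in my example is what the unicode_internal codec returned on that build, not a fixed storage format.

See the example in post #10.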
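
The in-memory width can be probed roughly with `sys.getsizeof`. This is a sketch that assumes CPython 3.3 or later, where PEP 393 picks 1, 2, or 4 bytes per code point from the widest character in the string; `bytes_per_char` is a hypothetical helper written for this post, and other implementations may report different numbers.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two string sizes."""
    # Subtracting the one-character size cancels the fixed object header.
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n

ascii_width = bytes_per_char("a")            # ASCII only
bmp_width = bytes_per_char("字")              # needs 16 bits
astral_width = bytes_per_char("\U0001F600")  # above U+FFFF

print(ascii_width, bmp_width, astral_width)  # 1 2 4 on CPython 3.3+
```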
