[SOLVED] Converting from locale encoding in Python3

AaronBP · 2012-08-10 17:49:06

Is this the correct approach to this?

#!/usr/bin/python
import subprocess
import locale

locale.setlocale(locale.LC_ALL, '')
test = subprocess.check_output(['uname', '-m'])
test = test.decode()
print(test)

Will test now be UTF-8 if this is run on some machine with some strange encoding?

Last edited by AaronBP (2012-08-11 15:43:20)

lunar · 2012-08-10 18:26:28

At no point is "test" guaranteed to be "UTF-8".

At first, it's bound to a byte string in some arbitrary, locale-specific encoding. The next line tries to decode that byte string assuming UTF-8 (the default for bytes.decode() in Python 3), and binds "test" to a unicode string (which is not the same as a UTF-8 encoded byte string). Since you're not actually using the locale's encoding to decode the byte string, this is likely to fail on systems with “strange” locale configurations, and even on systems with a sane but non-UTF-8 configuration (e.g. Japanese 8-bit encodings or something like that).
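To make the difference between a byte string and a unicode string concrete, here is a small sketch. The sample character and the Latin-1 encoding are just assumptions picked for illustration; they stand in for whatever a non-UTF-8 locale might produce.

```python
# Hypothetical sample: U+00E9 (e with acute accent) encoded as Latin-1
# is the single byte 0xE9, which is not valid UTF-8 on its own.
raw = "\u00e9".encode("latin-1")

try:
    raw.decode()  # bytes.decode() defaults to UTF-8 in Python 3
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Decoding with the encoding the bytes were actually written in works
# and gives a str (a unicode string), not "UTF-8".
text = raw.decode("latin-1")

# UTF-8 only comes into play when the str is encoded back to bytes.
utf8_bytes = text.encode("utf-8")

print(decoded_ok, type(text).__name__, utf8_bytes)
```

The str in the middle has no encoding attached from the programmer's point of view; only the bytes on either side do.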

You have to use ".decode(locale.getpreferredencoding())" in order to correctly convert a locale-specific byte string into a unicode string.
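Applied to the script from the first post, that could look roughly like the sketch below. The helper name check_output_text is made up for this example and is not part of subprocess; the only real calls are subprocess.check_output() and locale.getpreferredencoding().

```python
import locale
import subprocess


def check_output_text(cmd):
    """Run cmd and return its output as a str, decoded with the locale's encoding.

    Hypothetical helper for illustration only.
    """
    raw = subprocess.check_output(cmd)  # bytes in the locale's encoding
    # By default getpreferredencoding() consults the user's locale settings.
    encoding = locale.getpreferredencoding()
    return raw.decode(encoding)


if __name__ == "__main__":
    # Same command as in the original script; assumes uname is available.
    print(check_output_text(['uname', '-m']).strip())
```

If the output may contain bytes that are invalid in the locale's encoding, passing an errors argument such as "replace" to decode() is one way to avoid an exception, at the cost of losing those characters.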

Last edited by lunar (2012-08-10 18:26:49)

AaronBP · 2012-08-10 18:47:13

Ah, thanks for the clarification. I thought I was missing something along those lines. Also, I thought Python 3 used UTF-8 internally? I must have misread. I'm not going to lie and say this Unicode thing doesn't confuse the heck out of me.

lunar · 2012-08-10 20:50:19

Read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets". This article will help you to really understand Unicode from a programmer's point of view.

Mr.Elendig · 2012-08-11 11:59:25

Pragmatic Unicode, or, How do I stop the pain?

AaronBP · 2012-08-11 15:43:08

Very informative links, thanks.

Arch Linux
