Is this the correct approach to this?
#!/usr/bin/python
import subprocess
import locale
locale.setlocale(locale.LC_ALL, '')
test = subprocess.check_output(['uname', '-m'])
test = test.decode()
print(test)
Will test now be UTF-8 if this is run on some machine with some strange encoding?
Last edited by AaronBP (2012-08-11 15:43:20)
At no point is "test" guaranteed to be "UTF-8".
At first, it's bound to a byte string in some arbitrary, locale-specific encoding. The next line tries to decode that byte string assuming UTF-8, and binds "test" to a unicode string (which is not the same as a UTF-8 encoded byte string). Since you're not actually using the locale's encoding to decode the byte string, this is likely to fail on systems with “strange” locale configurations, and even on systems with a perfectly sane non-UTF-8 configuration (e.g. Japanese 8-bit encodings or something like that).
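To make the difference between a unicode string and an encoded byte string concrete, here is a minimal sketch for Python 3. The sample text and the choice of EUC-JP as the "other" encoding are just assumptions for illustration, not something from the original script.

```python
# A unicode string (str in Python 3): a sequence of code points, no encoding attached.
text = "日本語"

# The same text as byte strings under two different encodings.
as_utf8 = text.encode("utf-8")
as_eucjp = text.encode("euc_jp")  # illustrative stand-in for a non-UTF-8 locale

# Same text, different bytes.
same_bytes = (as_utf8 == as_eucjp)

# Decoding with the wrong codec fails (or, in unluckier cases, yields garbage).
try:
    as_eucjp.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Decoding with the codec that produced the bytes recovers the original string.
roundtrip = as_eucjp.decode("euc_jp")

print(same_bytes, decode_failed, roundtrip == text)
```

The point is that bytes only become text once you know which encoding produced them; guessing UTF-8 works only when the producer really used UTF-8.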
You have to use ".decode(locale.getpreferredencoding())" in order to correctly convert a locale-specific byte string into a unicode string.
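Roughly, the corrected script could look like the sketch below. The helper name run_decoded is made up for this example. The explicit setlocale() call from the question is left out on the assumption that getpreferredencoding() consults the user's locale settings itself by default; keep the call if the rest of your program needs the locale set anyway.

```python
#!/usr/bin/python
import locale
import subprocess


def run_decoded(cmd):
    """Run cmd and return its output as a unicode string (hypothetical helper)."""
    # check_output returns raw bytes in whatever encoding the program wrote.
    raw = subprocess.check_output(cmd)
    # Decode with the locale's preferred encoding instead of assuming UTF-8.
    encoding = locale.getpreferredencoding()
    return raw.decode(encoding)


machine = run_decoded(['uname', '-m']).strip()
print(machine)
```

This still assumes the child process writes its output in the locale's encoding, which is the usual convention but not something Python can verify for you.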
Last edited by lunar (2012-08-10 18:26:49)
Ah, thanks for the clarification. I thought I was missing something along those lines. Also, I thought Python 3 used UTF-8 internally? I must have misread. I'm not going to lie and say this Unicode thing doesn't confuse the heck out of me.
Read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets". This article will help you to really understand Unicode from a programmer's point of view.