Is this the correct approach to this?
#!/usr/bin/python
import subprocess
import locale
locale.setlocale(locale.LC_ALL, '')
test = subprocess.check_output(['uname', '-m'])
test = test.decode()
print(test)
Will test now be UTF-8 if this is run on some machine with some strange encoding?
Last edited by AaronBP (2012-08-11 15:43:20)
At no point is "test" guaranteed to be "UTF-8".
At first, it's bound to a byte string in some arbitrary, locale-specific encoding. The next line tries to decode that byte string assuming UTF-8 (the default for ".decode()" in Python 3), and binds "test" to a unicode string (which is not the same as a UTF-8 encoded byte string). Since you're not actually using the locale's encoding to decode the byte string, this is likely to fail on systems with “strange” locale configurations, and even on systems with a sane but non-UTF-8 encoding configuration (e.g. Japanese 8-bit encodings or something like that).
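To make the failure mode concrete, here is a small sketch. The byte string is made up for illustration (it is not real "uname" output); it stands in for what a program might emit under a Latin-1 locale.

```python
# Hypothetical input: "é" as a Latin-1 locale would encode it (the single byte 0xE9).
raw = "\u00e9".encode("latin-1")

try:
    raw.decode()              # no argument: Python 3 assumes UTF-8
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False           # 0xE9 on its own is not valid UTF-8

text = raw.decode("latin-1")  # decoding with the encoding actually used works

print(utf8_ok)                          # False
print(text == "\u00e9")                 # True
print(type(text))                       # <class 'str'>: a unicode string
print(type(text.encode("utf-8")))       # <class 'bytes'>: a UTF-8 encoded byte string
```

Note that "text" is a str, with no encoding attached; only the result of ".encode("utf-8")" is a UTF-8 byte string.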
You have to use ".decode(locale.getpreferredencoding())" in order to correctly convert a locale-specific byte string into a unicode string.
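A minimal sketch of what that could look like for the script above. The helper names "decode_output" and "uname_machine" are made up for this example, and it assumes a system where "uname" is available.

```python
import locale
import subprocess


def decode_output(raw, encoding=None):
    """Decode a byte string from a subprocess into a unicode string.

    Falls back to the locale's preferred encoding when none is given.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    return raw.decode(encoding)


def uname_machine():
    """Return the output of `uname -m` as a unicode string."""
    raw = subprocess.check_output(['uname', '-m'])  # bytes in Python 3
    return decode_output(raw)


if __name__ == "__main__":
    print(uname_machine())
```

If the output might contain bytes that are invalid in the locale's encoding, passing an error handler such as "raw.decode(encoding, errors='replace')" is one way to avoid an exception, at the cost of losing those characters.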
Last edited by lunar (2012-08-10 18:26:49)
Ah, thanks for the clarification. I thought I was missing something along those lines. Also, I thought Python 3 used UTF-8 internally? I must have misread. I'm not going to lie and say this Unicode thing doesn't confuse the heck out of me.
Read Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets". This article will help you to really understand Unicode from a programmer's point of view.