[Py] Odd Unicode characters appearing?

linkmaster03 · 2009-06-07 23:03:40

OK, I have a script like this:

#!/usr/bin/env python
# -*- coding: iso-8859-15 -*-

bold = '\033[1m'
unbold = '\033[0;0m'
word = 'mo⋅tor⋅cy⋅cle'
print bold + word.replace("\xe2", "'") + unbold

I just want it to show mo'tor'cy'cle. This works as expected in urxvt:

[screenshot: 95107-Jun-Sun_19:06.png]

...but not in xterm:

[screenshot: 89907-Jun-Sun_19:06-411345273.png]

Why are those odd Unicode characters appearing, and how can I get rid of them?

Peasantoid · 2009-06-07 23:25:58

Because xterm doesn't have Unicode support?

Those bullet/whatever characters are multibyte (three bytes each, to be exact). Your replace() only swaps out the first byte of each one. The remaining two bytes don't show up in urxvt for whatever reason, but they do in xterm.
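
For illustration, here is a small sketch of what is probably going on, assuming the separator is U+22C5 (DOT OPERATOR) and the file is actually saved as UTF-8. The variable names are just for this example:

```python
# -*- coding: utf-8 -*-
# Assumption: the separator is U+22C5 (DOT OPERATOR), stored as UTF-8.
dot = u'\u22c5'
encoded = dot.encode('utf-8')  # one character, three bytes

word = u'mo\u22c5tor\u22c5cy\u22c5cle'.encode('utf-8')

# Replacing only the first byte leaves the other two bytes of each
# character behind in the string.
partial = word.replace(b'\xe2', b"'")

print(len(encoded))
print(repr(partial))
```

The leftover bytes are what a terminal may or may not choose to draw.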

Procyon · 2009-06-07 23:42:10

Try

word.replace("\xe2", "'").decode("ascii","ignore")
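
A rough sketch of why this should work, assuming the word is a UTF-8 byte string with U+22C5 as the separator: decoding as ASCII with the "ignore" error handler drops every byte that is not valid ASCII, so the stray bytes left over after the replace() are discarded.

```python
# Assumption: the word is UTF-8 encoded, with U+22C5 as the separator.
word = u'mo\u22c5tor\u22c5cy\u22c5cle'.encode('utf-8')

# Swap the first byte of each separator for an apostrophe ...
partial = word.replace(b'\xe2', b"'")

# ... then drop the leftover non-ASCII bytes while decoding.
cleaned = partial.decode('ascii', 'ignore')

print(cleaned)
```

Note that "ignore" silently drops any other non-ASCII characters too, which may or may not be what you want.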

linkmaster03 · 2009-06-07 23:58:04

Peasantoid: Ah, I didn't know about multibyte characters until now. I went into the interpreter and found that the full code is \xe2\x8b\x85.

$ python
Python 2.6.2 (r262:71600, Jun  6 2009, 10:55:16) 
[GCC 4.4.0 20090526 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str = "⋅"
>>> str
'\xe2\x8b\x85'

It works now.
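
Roughly, the fix amounts to replacing the whole three-byte sequence instead of just the first byte. A minimal sketch, assuming the word is a UTF-8 byte string (format_word is a made-up helper name for this example):

```python
BOLD = '\033[1m'
UNBOLD = '\033[0;0m'

def format_word(raw):
    """Replace each UTF-8 encoded dot operator with an apostrophe."""
    return raw.replace(b'\xe2\x8b\x85', b"'")

# Assumption: the word arrives as UTF-8 bytes, as in the session above.
word = u'mo\u22c5tor\u22c5cy\u22c5cle'.encode('utf-8')

# Only ASCII is left after the replacement, so a strict decode is safe.
fixed = format_word(word).decode('ascii')

print(BOLD + fixed + UNBOLD)
```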

Procyon: This works too. Thanks for letting me know about the decode() function; it will probably come in handy for a later project.

Thanks guys!
