#1 2009-06-07 23:03:40

linkmaster03
Member
Registered: 2008-12-27
Posts: 269

[Py] Odd Unicode characters appearing?

OK I have a script like this:

#!/usr/bin/env python
# -*- coding: iso-8859-15 -*-

bold = '\033[1m'
unbold = '\033[0;0m'
word = 'mo⋅tor⋅cy⋅cle'
print bold + word.replace("\xe2", "'") + unbold

I just want it to show mo'tor'cy'cle. This works as expected in urxvt:

[Screenshot: 95107-Jun-Sun_19:06.png]

...but not in xterm:

[Screenshot: 89907-Jun-Sun_19:06-411345273.png]

Why are those odd Unicode characters appearing, and how can I get rid of them?

#2 2009-06-07 23:25:58

Peasantoid
Member
Registered: 2009-04-26
Posts: 928

Re: [Py] Odd Unicode characters appearing?

Because xterm doesn't have Unicode support?

Those bullet/whatever characters are multibyte (three bytes each, to be exact). Your replace() only swaps out the first byte, \xe2. The remaining two don't show up in urxvt for whatever reason, but they do in xterm.
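
To illustrate the point about multibyte characters, here is a minimal sketch, assuming the source file is saved as UTF-8 and that the separator is U+22C5 (which matches the bytes reported later in the thread). It is written to run on both Python 2.6+ and Python 3; the variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch: U+22C5 occupies three bytes in UTF-8, so replacing only the
# first byte leaves two stray bytes behind each apostrophe.

dot = u'\u22c5'.encode('utf-8')   # the three bytes \xe2 \x8b \x85
print(len(dot))                   # number of bytes in one separator

# Build the word as a byte string, the way Python 2 stores a plain '...' literal.
word = b'mo' + dot + b'tor' + dot + b'cy' + dot + b'cle'

# This is what the original script does: only \xe2 is swapped out.
partial = word.replace(b'\xe2', b"'")
print(repr(partial))              # \x8b\x85 is still present after each '
```

How the leftover bytes are drawn (or silently skipped) is up to the terminal, which would explain why the two terminals differ.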

#3 2009-06-07 23:42:10

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: [Py] Odd Unicode characters appearing?

Try

word.replace("\xe2", "'").decode("ascii","ignore")
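
As a rough sketch of why this works: decoding as ASCII with the "ignore" error handler drops every byte that is not valid ASCII, so the two leftover bytes after each apostrophe disappear. The byte string below is an assumed stand-in for the result of the partial replace(), and the snippet runs on Python 2.6+ and Python 3.

```python
# Sketch: decode("ascii", "ignore") discards non-ASCII bytes.

# Assumed state of the word after replacing only \xe2 with an apostrophe.
partial = b"mo'\x8b\x85tor'\x8b\x85cy'\x8b\x85cle"

# Bytes >= 0x80 cannot be ASCII, so the "ignore" handler skips them.
cleaned = partial.decode('ascii', 'ignore')
print(cleaned)
```

Note that this also throws away any other non-ASCII characters in the string, so it only fits cases where the rest of the text is plain ASCII.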

#4 2009-06-07 23:58:04

linkmaster03
Member
Registered: 2008-12-27
Posts: 269

Re: [Py] Odd Unicode characters appearing?

Peasantoid: Ah, I didn't know about multibyte characters until now. I went into the interpreter and found that the full byte sequence is \xe2\x8b\x85.

$ python
Python 2.6.2 (r262:71600, Jun  6 2009, 10:55:16) 
[GCC 4.4.0 20090526 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str = "⋅"
>>> str
'\xe2\x8b\x85'

It works now. :)
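
For reference, a minimal sketch of what the fix might look like, assuming the whole three-byte sequence is replaced instead of only the first byte. The second variant, decoding to Unicode first, is an alternative not mentioned in the thread. Both run on Python 2.6+ and Python 3.

```python
# Sketch: replace the complete UTF-8 sequence for U+22C5, not just \xe2.

word = b'mo\xe2\x8b\x85tor\xe2\x8b\x85cy\xe2\x8b\x85cle'

# Variant 1: work on bytes and replace all three bytes at once.
fixed_bytes = word.replace(b'\xe2\x8b\x85', b"'")

# Variant 2 (alternative): decode to Unicode and replace the single code point.
fixed_text = word.decode('utf-8').replace(u'\u22c5', u"'")

print(fixed_text)
```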

Procyon: This works too. Thanks for letting me know about the decode() function; it will probably come in handy for a later project.

Thanks guys!
