#1 2014-11-06 14:15:02

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Special characters in files issue (Solved)

Hi all! When I download certain files that have special characters (like áàñç) and were made in Windows, the characters show as '?' in xterm and as a question mark in a rhombus with "(invalid encoding)" in Thunar. If I create a file with those special characters myself, there's no problem. I have googled this and found that it is related to a difference in the encoding used by Windows. So my question is: is there a solution? Could there be a missing package that I have to install?

I've tried implementing the "solutions" I found by googling and by searching threads on this forum, but none of them resolved my problem.

Here is the output of locale -a.

$ locale -a
C
es_AR
es_AR.iso88591
es_AR.utf8
es_ES
es_ES@euro
es_ES.iso88591
es_ES.iso885915@euro
es_ES.utf8
POSIX
spanish

I would really appreciate your help on this. Thank you.

Last edited by gallardopablo (2014-11-06 15:32:11)

#2 2014-11-06 14:19:51

runical
Member
From: The Netherlands
Registered: 2012-03-03
Posts: 896

Re: Special characters in files issue (Solved)

What kind of files are we talking about? If I am not mistaken, Windows does not use UTF-8 for text files, for example. This can cause problems with character encodings on UNIX systems, where pretty much everything is encoded in UTF-8.

#3 2014-11-06 14:22:08

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,456

Re: Special characters in files issue (Solved)

Doesn't Windows generally use UTF-16 encoding?  Your locale (and perhaps most common *nix programs) generally uses UTF-8 for Unicode.  You can use iconv to convert between these (`man iconv`).

EDIT: `file` will tell you what the current encoding is; then use iconv to convert to UTF-8.
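
For reference, a minimal sketch of that `file` plus iconv workflow for file contents. The file names and the ISO-8859-1 source encoding are assumptions made up for this example; `file` only guesses the charset, so the encoding passed to iconv still has to be chosen by hand.

```shell
# Sketch: detect a text file's encoding, then convert its contents to UTF-8.
# All file names here are made up for illustration.
workdir=$(mktemp -d)
src="$workdir/notas-latin1.txt"
dst="$workdir/notas-utf8.txt"

# Create a sample file whose contents are ISO-8859-1 (octal 361 is the byte for ñ).
printf 'Dise\361o\n' > "$src"

# `file -bi` prints a MIME type plus a charset guess; it is only a heuristic.
file -bi "$src" || true

# Convert the contents; the source encoding has to be given explicitly.
iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst"

# Dump the result as hex: ñ is now the two bytes c3 b1.
od -An -tx1 "$dst"
```

Note that this only touches what is inside the file, not the file's name.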


"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman

#4 2014-11-06 14:27:58

nomorewindows
Member
Registered: 2010-04-03
Posts: 3,527

Re: Special characters in files issue (Solved)

Bash keeps giving me an accented a on some file operations and I'm not even using any fancy locale.  It's on regular files with no weird symbols.

Last edited by nomorewindows (2014-11-06 14:29:15)


I may have to CONSOLE you about your usage of ridiculously easy graphical interfaces...
Look ma, no mouse.

#5 2014-11-06 14:28:13

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Re: Special characters in files issue (Solved)

Thank you for the reply. My problem is just with the name, not the content. I've tried iconv:

Downloads]$ ls
Principios_de_Dise?o_OO.pdf
Downloads]$ iconv -f utf16 -t utf8 Principios_de_Dise?o_OO.pdf
Downloads]$ ls
Principios_de_Dise?o_OO.pdf

The actual name of the file is: Principios_de_Diseño_OO.pdf
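
Since the terminal only shows '?', one way to see what is really stored is to dump the bytes of the name. This is a rough sketch; `name_hex` is a hypothetical helper, and the idea that the ñ is stored as a single byte is an assumption, not something confirmed in this thread.

```shell
# Sketch: print the raw bytes of file names to help guess their encoding.
# name_hex is a hypothetical helper name for this example.
name_hex() {
    # Dump the argument as contiguous lowercase hex, one pair per byte.
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

for f in ./*.pdf; do
    [ -e "$f" ] || continue
    # A lone "f1" where the ñ should be suggests ISO-8859-1 or CP1252;
    # "c3b1" would be ñ correctly encoded as UTF-8.
    printf '%s\n' "$(name_hex "${f##*/}")"
done
```

iconv with a file argument converts the file's contents and writes them to standard output; it never renames the file, which is why `ls` shows the same name afterwards.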

#6 2014-11-06 14:29:32

nomorewindows
Member
Registered: 2010-04-03
Posts: 3,527

Re: Special characters in files issue (Solved)

What does it do in midnight commander?


#7 2014-11-06 14:30:43

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,456

Re: Special characters in files issue (Solved)

Ah, so you need to convert the text of the filename.  I don't know any automatic way to do that, but you could write a script or alias like the following untested code:

newname=$(printf '%s' "$1" | iconv -f UTF-16 -t UTF-8)
mv -- "$1" "$newname"

This assumes it is UTF-16; you could expand this code a bit to do more checking first.
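
The "more checking" could look roughly like the sketch below. `to_utf8_name` and its arguments are hypothetical names, the source encoding is passed in by the caller rather than detected, and it assumes a Linux filesystem that accepts arbitrary bytes in names.

```shell
# Sketch: rename one file from an assumed source encoding to UTF-8.
# to_utf8_name and its arguments are hypothetical names for this example.
to_utf8_name() {
    local from="$1" path="$2"
    local dir base new
    dir=$(dirname -- "$path")
    base=$(basename -- "$path")

    # Names that already decode as valid UTF-8 are left alone.
    if printf '%s' "$base" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        return 0
    fi

    # Convert only the name; give up if iconv rejects the input.
    new=$(printf '%s' "$base" | iconv -f "$from" -t UTF-8) || return 1
    [ -n "$new" ] || return 1

    # -n: never overwrite an existing file that already has the new name.
    mv -n -- "$path" "$dir/$new"
}

# Possible usage (the encoding is a guess that you have to supply):
#   for f in ./*.pdf; do to_utf8_name ISO-8859-1 "$f"; done
```

Passing the encoding as an argument keeps the guess explicit, so a wrong guess can be retried with another value instead of being baked into the script. A dedicated tool such as convmv also exists for this kind of rename.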


#8 2014-11-06 14:32:00

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Re: Special characters in files issue (Solved)

Midnight commander shows the name with a funny question mark that I can't copy here.

#9 2014-11-06 14:34:26

nomorewindows
Member
Registered: 2010-04-03
Posts: 3,527

Re: Special characters in files issue (Solved)

Usually mc shows characters that bash doesn't show, and it will be able to handle the funny names that bash doesn't like.


#10 2014-11-06 14:37:27

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Re: Special characters in files issue (Solved)

Trilby wrote:

Ah, so you need to convert the text of the filename.  I don't know any automatic way to do that, but you could write a script or alias like the following untested code:

newname=$(printf '%s' "$1" | iconv -f UTF-16 -t UTF-8)
mv -- "$1" "$newname"

This assumes it is UTF-16; you could expand this code a bit to do more checking first.

I've tried the script, but it doesn't change the character to ñ. Hmm.

#11 2014-11-06 14:46:16

runical
Member
From: The Netherlands
Registered: 2012-03-03
Posts: 896

Re: Special characters in files issue (Solved)

Can you post the output of

locale

#12 2014-11-06 14:47:47

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Re: Special characters in files issue (Solved)

runical wrote:

Can you post the output of

locale

sure.

$ locale
LANG=es_AR.UTF-8
LC_CTYPE="es_AR.UTF-8"
LC_NUMERIC="es_AR.UTF-8"
LC_TIME="es_AR.UTF-8"
LC_COLLATE="es_AR.UTF-8"
LC_MONETARY="es_AR.UTF-8"
LC_MESSAGES="es_AR.UTF-8"
LC_PAPER="es_AR.UTF-8"
LC_NAME="es_AR.UTF-8"
LC_ADDRESS="es_AR.UTF-8"
LC_TELEPHONE="es_AR.UTF-8"
LC_MEASUREMENT="es_AR.UTF-8"
LC_IDENTIFICATION="es_AR.UTF-8"
LC_ALL=

#13 2014-11-06 14:58:44

runical
Member
From: The Netherlands
Registered: 2012-03-03
Posts: 896

Re: Special characters in files issue (Solved)

Hmm, that should be ok (with Trilby's script). You can try UTF-32 instead of UTF-16.

This is all assuming that Trilby's script is correct :P
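
Rather than swapping encodings in the script one at a time, a quick way to compare guesses is to decode the same bytes under each candidate. A rough sketch follows; `decode_as` is a hypothetical helper, and the sample assumes the ñ is stored as the single byte 0xF1, which nobody in this thread has verified.

```shell
# Sketch: decode the same bytes under several candidate encodings.
# decode_as is a hypothetical helper name for this example.
decode_as() {
    # $1 = candidate source encoding, $2 = raw string to decode
    printf '%s' "$2" | iconv -f "$1" -t UTF-8 2>/dev/null
}

# Assumed sample: "Dise" + byte 0xF1 (octal 361) + "o".
sample=$(printf 'Dise\361o')

for enc in ISO-8859-1 CP1252 UTF-16 UTF-32; do
    if decoded=$(decode_as "$enc" "$sample"); then
        printf '%-10s -> %s\n' "$enc" "$decoded"
    else
        printf '%-10s -> (iconv rejected the input)\n' "$enc"
    fi
done
```

Under that assumption, the single-byte encodings are the ones that produce "Diseño"; the multi-byte guesses either fail or give unrelated characters.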

Last edited by runical (2014-11-06 15:00:21)

#14 2014-11-06 15:31:20

gallardopablo
Member
Registered: 2014-11-06
Posts: 6

Re: Special characters in files issue (Solved)

Resolved
Thank you guys for your support. It seems that the problem was already resolved before I posted here (sorry about that). Before posting, I had uncommented these lines in /etc/locale.gen:

es_AR.ISO-8859-1
es_ES.ISO-8859-1

and then run:

locale-gen

What I missed
I thought that running locale-gen would resolve the problem immediately, or after a reboot (which I did), but it didn't. So I downloaded the files again and found that the name is the right one now.

Downloads]$ ls
Principios_de_Diseño_OO.pdf
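
To double-check that the uncommented locales were really generated, something like the sketch below might help. `has_locale` is a hypothetical helper, and the names it looks for are taken from the `locale -a` output earlier in this thread.

```shell
# Sketch: confirm that the locales enabled in /etc/locale.gen were generated.
# has_locale is a hypothetical helper name for this example.
has_locale() {
    # `locale -a` lists names in a normalised form such as es_AR.iso88591.
    # -F: fixed string, -x: whole line, -i: ignore case, -q: quiet.
    locale -a 2>/dev/null | grep -qixF -- "$1"
}

for loc in es_AR.iso88591 es_ES.iso88591 es_AR.utf8; do
    if has_locale "$loc"; then
        printf '%s: available\n' "$loc"
    else
        printf '%s: missing (uncomment it in /etc/locale.gen, then run locale-gen as root)\n' "$loc"
    fi
done

# The character set of the current session, for example UTF-8.
locale charmap 2>/dev/null || true
```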

Thank you again for your time and patience.

Last edited by gallardopablo (2014-11-06 15:37:14)
