#1 2008-05-02 04:21:26

em4r1z
Member
Registered: 2008-04-27
Posts: 5

Regarding Language, Locale and Encoding

I installed Arch Linux 2008.3-1 core on a laptop and I want Spanish as the main system language. I chose es_ES.utf8 (and ran locale-gen), but there are encoding errors in the system.
a) While the system is in Spanish, a square is shown instead of accented letters (this is 'solved' by enabling the es_ES locale).
b) Many man-pages are still in English, but the apostrophe (') cannot be displayed. Among the translated man-pages, some display Spanish characters but not the apostrophe (like the page for man), while others display neither (like the passwd page).

Now my questions:
1. Why does one language need more than one locale (i.e. es_ES, es_ES.utf8 and es_ES@euro)?
2. I guess that the man-pages that cannot display Spanish characters are old pages and that an updated version of them would solve the problem. Where can I find the latest translations of the ones used in Arch Linux?
3. If only some man-pages are translated to Spanish, is it safe to delete them and use only the English ones? While I want the whole environment in Spanish, I don't need the man-pages (or the command line dialogues) in Spanish.

Last edited by em4r1z (2008-05-02 04:25:59)

#2 2008-05-02 12:46:19

berbae
Member
From: France
Registered: 2007-02-12
Posts: 1,302

Re: Regarding Language, Locale and Encoding

em4r1z wrote:

Many man-pages are still in English, but the apostrophe (') cannot be displayed.

For this issue, try editing the '/usr/share/groff/site-tmac/man.local' file so that it contains:

.\" This file is loaded after an-old.tmac.
.\" Put any local modifications to an-old.tmac here.
.if '\*[.T]'utf8' \{\
.  char \- \N'45'
.  char - \N'45'
.  char ' \N'39'
.  char \' \N'39'
.\}

.if '\*[.T]'ps' \{\
.  char \- \N'45'
.  char - \N'45'
.  char ' \N'39'
.  char \' \N'39'
.\}

em4r1z wrote:

While I want the whole environment in Spanish, I don't need the man-pages (or the command line dialogues) in Spanish.

You can force man to use the English pages with the command:
LANG=C man <something>
And you can create an alias in your ~/.bashrc file:
alias manc='LANG=C man'
To get the English pages, run 'manc <something>'.
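
A shell function is a possible alternative to the alias. This is only a minimal sketch: the names `with_c_locale` and `man_en` are made up for illustration, and setting LC_ALL in addition to LANG is an added assumption, since LANG alone is ignored when LC_ALL or LC_MESSAGES is already set in the session.

```shell
# Hypothetical helper (not from the post): run one command under the C locale
# without touching the locale of the calling shell.
with_c_locale() {
    # env sets the variables only for the command it launches.
    env LANG=C LC_ALL=C "$@"
}

# Hypothetical counterpart of the 'manc' alias, built on the helper.
man_en() {
    with_c_locale man "$@"
}
```

With this in ~/.bashrc, 'man_en <something>' should show the English page, and the locale of the session itself stays as it was.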

As for your other encoding problems, it may be that some manual pages are UTF-8 encoded while others are ISO-8859-15 encoded; a workaround is possible for that too.
Greetings.

#3 2008-05-02 20:03:56

em4r1z
Member
Registered: 2008-04-27
Posts: 5

Re: Regarding Language, Locale and Encoding

Thanks for your reply. I edited the man.local file (while only having the es_ES.utf8 enabled and set as default in rc.conf), restarted and the 'problem' persists. As I said, all Spanish characters are correctly displayed (except those in some differently encoded man-pages) if I set es_ES.iso885915@euro as the default. What I want to know is why I need more than one locale for one language; shouldn't UTF-8 be the only one enabled?

If the Spanish man-pages aren't complete and some are encoded differently, can I delete them and only use the English ones? I read your alias workaround, but I don't see the point of having translated pages that I'm not going to use.

#4 2008-05-02 22:21:01

berbae
Member
From: France
Registered: 2007-02-12
Posts: 1,302

Re: Regarding Language, Locale and Encoding

em4r1z wrote:

I edited the man.local file (while only having the es_ES.utf8 enabled and set as default in rc.conf), restarted and the 'problem' persists.

This trick only addresses the apostrophe not being displayed. It is completely separate from the encoding problem.
Can you confirm that you still don't get the apostrophe with it, and can you give an example of a manual page where this happens?
It's not clear to me whether you put all the lines from my previous post into that file, as required, and not only the 'utf8'-related lines.

em4r1z wrote:

As I said, all Spanish characters are correctly displayed (except those in some differently encoded man-pages) if I set the es_ES.iso885915@euro as default. What I want to know is why I need more than one locale for one language, shouldn't UTF-8 be the only one enabled?

I agree that would be better, but in practice both encodings are used for the translated manual pages shipped by the upstream sources, so that cannot easily be changed.
Again, it's not clear to me what the value of your LOCALE= line in /etc/rc.conf is. Do you change it to get Spanish characters in the manual pages?
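
The mismatch between the two encodings can be seen at the byte level. The sketch below is an illustration only, not taken from any package: it assumes `iconv` and `od` are installed, and the helper names `hex_bytes` and `utf8_a_acute` are invented here. It shows that 'á' is two bytes in UTF-8 but a single byte in ISO-8859-15, which is why a page written in one encoding shows odd symbols when the terminal expects the other.

```shell
# Hypothetical helper: print stdin as lowercase hex bytes, with no separators.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# 'á' written as explicit UTF-8 bytes (octal 303 241 = hex c3 a1),
# so the example does not depend on the encoding of the file it is saved in.
utf8_a_acute() {
    printf '\303\241'
}

utf8_a_acute | hex_bytes; echo                                  # prints c3a1
utf8_a_acute | iconv -f UTF-8 -t ISO-8859-15 | hex_bytes; echo  # prints e1
```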

em4r1z wrote:

If the Spanish man-pages aren't complete and some are encoded differently, can I delete them and only use the English ones?

The translated manual pages are shipped inside the Arch packages with their original encoding, so deleting them directly doesn't seem like a good idea. The alternative is to rebuild the packages without them, using the ABS tree.

#5 2008-05-03 10:53:10

em4r1z
Member
Registered: 2008-04-27
Posts: 5

Re: Regarding Language, Locale and Encoding

berbae wrote:

Can you confirm that you don't get the apostrophe with it, and can you give a manual page example where this appears?

If es_ES.utf8 is the only locale (the only one enabled in locale.gen and then generated by locale-gen) and it's the default locale in /etc/rc.conf, the apostrophe doesn't appear in the man-page of "man" after adding all those lines to the man.local file.

berbae wrote:

Again, it's not clear to me what the value of your LOCALE= line in /etc/rc.conf is. Do you change it to get Spanish characters in the manual pages?

Right now it is es_ES.iso885915@euro, which is also the only locale enabled in locale.gen (and generated by locale-gen). I chose it because I can see Spanish characters in the shell (an easy way to test it is #man <something-weird>, as the error will produce the word "página") and all characters within the man-pages (bar pages like "passwd", which are always displayed with odd symbols regardless of the chosen Spanish locale).
I wanted to use the UTF-8 one, but if one or both ISO-8859-x locales are required anyway and they display all the characters, I don't see the point of enabling the UTF-8 one. This may change after installing a desktop environment or a window manager, but right now the UTF-8 Spanish locale is useless.

Last edited by em4r1z (2008-05-03 10:55:28)

#6 2008-05-03 13:29:56

berbae
Member
From: France
Registered: 2007-02-12
Posts: 1,302

Re: Regarding Language, Locale and Encoding

I tried the Spanish translated manual page on my machine, and I got the apostrophe when I ran:
man /usr/share/man/es/man1/man.1.gz
All the Spanish characters are displayed correctly too.
And I have in /etc/rc.conf:
LOCALE="fr_FR@euro"
I didn't change that in any of the tests I made!
But in this case it doesn't seem to be related to the man.local trick: I tried with and without the modifications in the file, and I got the apostrophe (single quote) in both cases in the virtual console.
So I suspect something else about character display in the virtual console (I didn't realize you work only in the console):
In the /etc/profile.d folder, do you have a 'unicode_stop.sh' script?
If not, create one which contains:

if [ "$CONSOLE" = "" -a "$TERM" = "linux" -a -t 1 ]; then /usr/bin/unicode_stop; fi

Don't forget to make it executable.
Then log out and log back in,
and try 'man man' again. Do you get the apostrophe now?

Concerning 'man passwd', this is one of the translated manual pages which are UTF-8 encoded. Here is what I did to get the right display.
I created a file '/etc/manu.conf', identical to /etc/man.conf except for the NROFF line:

diff man.conf manu.conf
96c96
< NROFF         /usr/bin/nroff  -mandoc -c
---
> NROFF         iconv -cs -f UTF-8 -t ISO-8859-15|/usr/bin/nroff  -mandoc -c

To get the correct display, I run:
man -C /etc/manu.conf passwd
In fact, I created an alias in ~/.bashrc:
alias manu='man -C /etc/manu.conf'
and I run:
manu passwd
I use 'manu' for the UTF-8 encoded pages, 'man' for the others, and 'manc' for an English display.
It took me hours to find these tricks; you may have to adapt them to your case.
I hope one of these ideas will do it for you.
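
Choosing between 'man' and 'manu' by hand could in principle be automated. The following is a rough sketch under stated assumptions, not a tested setup: the function names `page_is_utf8` and `automan` are invented for illustration, `iconv` and `gzip` are assumed to be installed, and the installed man is assumed to support '-w' to print the path of the page it would display. The /etc/manu.conf file is the one described above.

```shell
# Hypothetical helper: succeed if the (optionally gzip-compressed) file
# decodes cleanly as UTF-8. Pure ASCII pages also pass, which is harmless
# because ASCII converts to ISO-8859-15 unchanged.
page_is_utf8() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical wrapper: use manu.conf only when the page is UTF-8 encoded.
# Assumes 'man -w' prints the path of the page it would display.
automan() {
    page=$(man -w "$@" 2>/dev/null | head -n 1)
    [ -n "$page" ] || { echo "automan: no page found for: $*" >&2; return 1; }
    if page_is_utf8 "$page"; then
        man -C /etc/manu.conf "$@"
    else
        man "$@"
    fi
}
```

The check relies on the fact that an ISO-8859-15 page containing accented letters is very unlikely to be valid UTF-8, so a failed decode is treated as "not UTF-8".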

Last edited by berbae (2008-05-03 21:33:38)

#7 2008-05-03 19:01:17

inXistant
Member
From: Montreal, Canada
Registered: 2008-04-11
Posts: 51

Re: Regarding Language, Locale and Encoding

That's funny... With the fr_CA.utf8 locale I also have problems with apostrophes; however, these are not in the man pages but on the internet or in mutt... Any ideas?


Nothing is sacred

#8 2008-05-03 21:49:14

berbae
Member
From: France
Registered: 2007-02-12
Posts: 1,302

Re: Regarding Language, Locale and Encoding

To inXistant
If you use Firefox, you can change the encoding in:
Edit/Preferences/Content/Fonts & Colors/Advanced.../Character Encoding
You can try choosing a different default there.

I don't know about other browsers or mutt, but I expect you can find out how to change their default configuration too.

#9 2008-05-03 22:23:39

inXistant
Member
From: Montreal, Canada
Registered: 2008-04-11
Posts: 51

Re: Regarding Language, Locale and Encoding

I know that about Firefox and I tried it. However, I only have problems with the apostrophe; it does not seem to be an encoding problem but rather an oddity...


Nothing is sacred
