I know that this has been discussed already, and that there are solutions to some of the problems, but as the default locale is now a UTF-8 one, it seems important to me that console apps also work with UTF-8. Could we maybe construct a clear list of currently non-working apps (and solutions, if possible), so that the devs are encouraged to do the necessary updates?
Those I know of so far are:
'dialog' - which is quite easily fixed, it needs a change of configure options (--with-ncursesw), though I am not sure about the nls bit, I haven't tried that.
'mc' - for which there are patches which are not quite perfect, but at least mc is then usable in a unicode console. I have been using a version with the gentoo patch for a while and it has been good enough for my purposes.
larch: http://larch.berlios.de
Offline
At least mc, nano, coreutils, ncurses, id3lib, taglib etc. should be fixed.
Some links:
http://bbs.archlinux.org/viewtopic.php?p=194235#194235 (edit: oh, that is not for UTF-8, now I posted it as http://bugs.archlinux.org/task/5487)
http://bugs.archlinux.org/task/4652
http://bugs.archlinux.org/task/4418
http://bugs.archlinux.org/task/4756 (see also taglib-rcc and id3lib-rcc in AUR)
to live is to die
Offline
The development version of nano (nano-1.9.99pre1) seems to work, though I only tested it briefly. It needs one extra configure option: --enable-utf8
larch: http://larch.berlios.de
Offline
Go for it, guys! You deserve better support!
Offline
The development version of nano (nano-1.9.99pre1) seems to work, though I only tested it briefly. It needs one extra configure option: --enable-utf8
Yes, it works. There are also a bunch of patches for MC. Ncurses also can be patched, haven't tried this however. And I'm sure I've seen patches for coreutils also.
I use uk_UA.KOI8-U, but want to switch to UTF-8.
I'll try to manage my time and test UTF-8 extensively. Will add one or two VMware machines to my testing collection.
It would be very nice to have full support for both UTF-8 and non-UTF-8 systems in 0.8.
Currently even non-UTF-8 locale support has some bugs (http://bugs.archlinux.org/task/5487, for example).
to live is to die
Offline
Here's a modified PKGBUILD for lynx:
pkgname=lynx
pkgver=2.8.5
pkgrel=5
pkgdesc="A text browser for the World Wide Web"
arch=(i686 x86_64)
depends=('ncurses' 'openssl')
source=(http://lynx.isc.org/release/${pkgname}${pkgver}.tar.gz)
url="http://lynx.isc.org"
md5sums=('5f516a10596bd52c677f9bfd9579bc28')

build() {
  cd $startdir/src/${pkgname}2-8-5
  # Build against wide-character ncurses and take the charset from the locale
  ./configure --prefix=/usr --with-ssl --with-screen=ncursesw --enable-locale-charset
  make || return 1
  make DESTDIR=$startdir/pkg install
  # Uncomment and set the two charset options in the installed lynx.cfg
  sed -i "s|^#LOCALE_CHARSET.*|LOCALE_CHARSET:TRUE|" $startdir/pkg/usr/lib/lynx.cfg
  sed -i "s|^#ASSUME_CHARSET.*|ASSUME_CHARSET:utf-8|" $startdir/pkg/usr/lib/lynx.cfg
}
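For anyone who wants to check the two sed substitutions without building the whole package, here is a minimal sketch that applies them to a scratch copy of a config file. The function name enable_lynx_utf8 is made up for illustration; only the two substitution patterns come from the PKGBUILD above. It writes through a temporary file rather than using sed -i, so it should not depend on GNU sed.

```shell
# Hypothetical helper (not part of the PKGBUILD): apply the two lynx.cfg
# substitutions from build() to the file given as $1.
enable_lynx_utf8() {
    cfg=$1
    tmp=$cfg.tmp
    # Same patterns as in build(): replace the commented-out defaults.
    sed -e 's|^#LOCALE_CHARSET.*|LOCALE_CHARSET:TRUE|' \
        -e 's|^#ASSUME_CHARSET.*|ASSUME_CHARSET:utf-8|' \
        "$cfg" > "$tmp" && mv "$tmp" "$cfg"
}
```

Something like "cp /usr/lib/lynx.cfg /tmp/lynx.cfg; enable_lynx_utf8 /tmp/lynx.cfg" followed by a diff against the original should show exactly two changed lines, assuming the stock file still has both options commented out.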
larch: http://larch.berlios.de
Offline
Nice. I'll try it. Post it to bugtracker too.
to live is to die
Offline
Ok, few things: with the dialog compile switch, does this break non-utf8 setups at all?
Please post here the packages that have problems, and any way to reproduce them (for silly Americans like me who don't speak moon-languages); I will get to them on a case-by-case basis.
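One possible way to reproduce problems without a non-English keyboard is to generate a small UTF-8 test line and open it in the app under test. This is only a sketch: the function name utf8_sample and the choice of characters (German umlauts plus a short Cyrillic word) are assumptions for illustration, not something from the bug reports. The bytes are written as octal escapes so the script itself stays pure ASCII.

```shell
# Hypothetical helper: print one line of UTF-8 encoded test text.
# The octal escapes encode "a-umlaut o-umlaut u-umlaut", a space,
# and three Cyrillic letters (U+0436, U+0443, U+043A).
utf8_sample() {
    printf '\303\244\303\266\303\274 \320\266\321\203\320\272\n'
}
```

For example, "utf8_sample > /tmp/utf8-test.txt" and then opening that file in mc, nano or lynx on a UTF-8 console should make display glitches visible. The line is 14 bytes long, so an app that counts bytes instead of characters will tend to misplace the cursor on it.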
Offline
Good to know that you are with us, phrakture! ;-)
I'll get to my Linux box tomorrow. Vmware, patch & makepkg will be my friends.
to live is to die
Offline
Ok, few things: with the dialog compile switch, does this break non-utf8 setups at all?
I'm not absolutely sure, but pretty sure it does, also the mc patch. These console apps don't seem to be very flexible. I think the lynx mod can cope via its option menu with various encodings.
But even if there are such breakages, shouldn't the utf8 versions be the standard ones and the non-utf8 ones be the ones hanging out in AUR?
larch: http://larch.berlios.de
Offline
But even if there are such breakages, shouldn't the utf8 versions be the standard ones and the non-utf8 ones be the ones hanging out in AUR?
I'd agree with that, but then some people might not. In the case of breakages, we may have to figure out if it's worth providing two versions or something goofy. I'd say switch to UTF8 for now and wait for complaints.... /shrug
Offline
The real issue is that utf-8 is not the default locale on someone's machine. You have to switch to it. So I don't see where breaking everyone's terminal on install is a good idea.
That being said, all these packages should be fixed, and hopefully none break
Offline
The real issue is that utf-8 is not the default locale on someone's machine. You have to switch to it. So I don't see where breaking everyone's terminal on install is a good idea.
What do you mean? If I do a fresh install, I get LOCALE="en_US.UTF-8" in rc.conf until I change it. The result is that I can't use (standard) mc at all in a console and as soon as I use non-ASCII characters some of the other apps make a mess. Do you mean 'default' in some other way?
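To make the "which locale am I actually on" question checkable, here is a small sketch that pulls the encoding part out of a locale name such as the LOCALE value from rc.conf. The function names locale_codeset and is_utf8_codeset are hypothetical. It accepts both the "UTF-8" and the "utf8" spelling, since both show up in this thread.

```shell
# Hypothetical helper: print the codeset part of a locale name,
# e.g. "en_US.UTF-8" -> "UTF-8". Prints nothing if there is no codeset.
locale_codeset() {
    name=${1%%@*}                 # drop an optional @modifier
    case $name in
        *.*) printf '%s\n' "${name#*.}" ;;
    esac
}

# Hypothetical helper: succeed if the codeset name means UTF-8.
is_utf8_codeset() {
    case $1 in
        [Uu][Tt][Ff]-8|[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage could look like: is_utf8_codeset "$(locale_codeset "$LANG")" && echo "UTF-8 locale". On a running system, "locale charmap" reports the encoding of the current locale directly.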
larch: http://larch.berlios.de
Offline
Very nice discussion about UTF, guys. I should be using UTF already, but I have been too lazy to look up how to do it. It would be sweet to have UTF wiki pages, like Gentoo has (I just googled).
My vote for wiki
Offline
gradgrind wrote: But even if there are such breakages, shouldn't the utf8 versions be the standard ones and the non-utf8 ones be the ones hanging out in AUR?
I'd agree with that, but then some people might not. In the case of breakages, we may have to figure out if it's worth providing two versions or something goofy. I'd say switch to UTF8 for now and wait for complaints.... /shrug
There shouldn't even be a discussion about moving non-UTF-8 packages to AUR. This will break systems for many users who use non-Latin alphabets.
Applications that don't have UTF-8 support (or have it, but it is broken) should be patched. And there should always be a choice of which locale and character encoding to use.
I don't see UTF-8 becoming a well-established standard on most users' systems in the near future (a few years at least).
Yes, UTF-8 solves many problems, but for this to be true (and not cause other problems) all applications should support it! Until that happens, it is wise to keep support for non-UTF-8 encodings too.
The good news is that applications based on Qt or GTK+ should already support UTF-8. See http://bugs.archlinux.org/task/5487, however. BTW, can any of the devs reading this thread fix this bug?
There are also problems with GTK1 (http://bugs.archlinux.org/task/4652), but I don't think they will be easy to fix, and I'm not sure it's worth fixing them, because we have had GTK2 for a long time.
The bad news: UTF-8 is hard to support in console apps due to their nature. For example, while it's easy to support UTF-8 font rendering in an X terminal, it's hard to do that in the text console. That's why many applications have broken display of non-Latin chars in text mode (especially Cyrillic chars). That's why there are patches for ncurses and mc (patches for slang; slang2 already supports Unicode).
mc-utf8 in Community still has some display glitches.
See http://bugs.archlinux.org/task/4418 for patches for coreutils and a better way of patching mc.
BTW, coreutils 6.3 is out, but I don't know if it supports Unicode better; I haven't seen anything about this in the changelog.
to live is to die
Offline
The real issue is that utf-8 is not the default locale on someone's machine. You have to switch to it. So I don't see where breaking everyone's terminal on install is a good idea.
Unless we force the default, which is, IMO, a good idea.
Offline
codemac wrote: The real issue is that utf-8 is not the default locale on someone's machine. You have to switch to it. So I don't see where breaking everyone's terminal on install is a good idea.
Unless we force the default, which is, IMO, a good idea.
IMO forcing the default to UTF-8 is a bad idea, at least in the current situation.
There is LOCALE="en_US.utf8" already in the default rc.conf. Isn't that enough?
BTW, does empty CONSOLEFONT= work fine with UTF-8? I remember older Arch versions used LatArCyrHeb16 (not sure if I named it correctly).
to live is to die
Offline
BTW, does empty CONSOLEFONT= work fine with UTF-8? I remember older Arch versions used LatArCyrHeb16 (not sure if I named it correctly).
I think it's missing quite a lot of glyphs (I mean the 'default' font), but it's ok for some of us West/Central Europeans (the few non-ASCII German characters are ok).
larch: http://larch.berlios.de
Offline
I must say UTF-8 works great for me.
The only console application I use is vim and it supports UTF-8 locales just fine.
Some of the rare, other console applications I use, that are dialog based, like "make menuconfig" usually don't need to work with anything but ASCII anyway.
And the only other problem is GTK+1 ... but I don't use applications based on it either... It's unfortunate that I need to have it installed because of a stupid dependency in kdeutils for xmms.
Offline
I get a few funky characters with UTF-8 with xcalc. I take it xcalc can't handle UTF-8?
Offline
I get issues with lynx (I'm guessing this is ncurses?), mc, and id3lib. I also find that many central European (French) symbols don't appear at all.
edit: I do recall id3lib has a utf8 patch available. That's what the error dialog says at least
Offline
It appears a recent update has fixed the UTF-8 support for MC.
On a system that hasn't been updated since late November, I need to use mc -a.
On my up-to-date desktop and laptop I can start mc without the -a and get everything in place as it should be.
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
Offline
It appears a recent update has fixed the UTF-8 support for MC.
On a system that hasn't been updated since late November, I need to use mc -a.
On my up-to-date desktop and laptop I can start mc without the -a and get everything in place as it should be.
Indeed, that does seem to be the case - I wonder what did it!
Of course, if you want to actually deal with utf8 files, you'll still need to use mc-utf8 from [community], which seems to work pretty well.
larch: http://larch.berlios.de
Offline
I was fiddling with mc-mp (an mc spin-off), since it boasts of cleaner code and a smaller memory footprint.
mc-mp seems not to handle utf-8 (I don't think it's a slang2 problem; can anyone verify this?). By contrast, a properly patched mc supports utf-8 well.
Sadly I had to move away from mc-mp...
Offline
In order to make aspell work correctly with UTF-8, the PKGBUILD for aspell should be improved. Namely, the line
./configure --prefix=/usr
should be at least changed to
Code:
./configure --prefix=/usr --enable-curses=/usr/lib/libcursesw.so
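Since that configure line hard-codes a library path, it may be worth checking that the wide-character curses library is really there before building. The sketch below is only an illustration: the function name find_widec_curses is made up, and trying libncursesw.so as a second candidate name is an assumption on my part, not something the aspell build requires.

```shell
# Hypothetical helper: print the path of a wide-character curses library
# found in the directory given as $1 (default /usr/lib), or fail.
find_widec_curses() {
    libdir=${1:-/usr/lib}
    # libcursesw.so is the name used in the configure line above;
    # libncursesw.so is tried as a fallback (assumption).
    for name in libcursesw.so libncursesw.so; do
        if [ -e "$libdir/$name" ]; then
            printf '%s\n' "$libdir/$name"
            return 0
        fi
    done
    return 1
}
```

A PKGBUILD could then use something like: lib=$(find_widec_curses) || return 1, and pass "$lib" to --enable-curses instead of the fixed path.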
Offline