My terminal, for some reason, is not displaying CJK characters properly. When pasted directly into the terminal, 汤 is rendered as an empty square. Running the following command, taken from a similar topic here, gives this output:
Command
FC_DEBUG=4 pango-view --font="Noto Sans Mono CJK SC" -t "汤" | grep family:
Output
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s) "sans-serif"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "sans-serif"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "DejaVu Sans"(w) "Bitstream Vera Sans"(w) "WenQuanYi Zen Hei"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "DejaVu Sans"(w) "Bitstream Vera Sans"(w) "WenQuanYi Zen Hei"(w) "Noto Sans"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "DejaVu Sans"(w) "Bitstream Vera Sans"(w) "WenQuanYi Zen Hei"(w) "Noto Sans"(w) "FreeSans"(w) "Arial Unicode MS"(w) "Arial Unicode"(w) "Code2000"(w) "Code2001"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s) "Noto Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "DejaVu Sans"(w) "Bitstream Vera Sans"(w) "WenQuanYi Zen Hei"(w) "Noto Sans"(w) "FreeSans"(w) "Arial Unicode MS"(w) "Arial Unicode"(w) "Code2000"(w) "Code2001"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w)
family: "Noto Sans Mono CJK SC"(s)
family: "Noto Sans Mono CJK SC"(s)
In addition, a window to the right displays the glyph correctly, so I suppose the font is installed correctly. I have also been testing with another locale (Hindi), which I thought might work, but this does not display correctly in the terminal either (though it does work with the command above).
locale -a produces:
C
C.UTF-8
en_GB.utf8
en_US.utf8
hi_IN
hi_IN.utf8
ja_JP.utf8
POSIX
To add: having just now tested the kanji 湯 in urxvt, it does display properly, but a word in Hindi like इनसान does not.
Any help would be greatly appreciated, thank you.
Last edited by 山猿 (2023-07-27 21:20:02)
urxvt doesn't use fontconfig
xrdb -q | grep -iE 'rxvt.*font'
urxvt -fn "xft:Noto Sans Mono:size=10,xft:Noto Sans Mono CJK SC:size=8" # nb. the deliberate size difference - for now
Thank you for your help, and my apologies for my mistakes.
xrdb -q | grep -iE 'rxvt.*font'
produces:
xft:mononoki:size=12
And
urxvt -fn "xft:Noto Sans Mono:size=10,xft:Noto Sans Mono CJK SC:size=8" # nb. the deliberate size difference - for now
produces a new urxvt window, and when pasting in the 汤 glyph, it does show correctly in the terminal (albeit with the size difference you mention).
Last edited by 山猿 (2023-07-27 15:35:51)
https://wiki.archlinux.org/title/Rxvt-u … on_methods
https://wiki.archlinux.org/title/Rxvt-u … Xresources
You can test whether you can use the same font size or try https://aur.archlinux.org/packages/rxvt … wideglyphs (notice the patch I posted there)
Thank you very much for these links; they helped me figure out what was causing the problem and solve it.
I found that the font I was using, mononoki, supports only Latin characters and not Japanese glyphs. After doing some searching, I found that IBM Plex supports these glyphs (and many other scripts, which is useful for me) in a single mono font, and so I installed it locally.
I then changed .Xresources to have the following line:
...
URxvt.font: xft:IBM Plex Mono,IBM Plex Mono ExtLt:size=12
...
After restarting the system, the terminal displays the font correctly (confirmed by running neofetch).
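As a side note, the working urxvt -fn example earlier in the thread gives every entry in the comma-separated list its own xft: prefix and its own size, whereas in the line above only the first entry carries xft: and only the last carries a size. A minimal Python sketch that builds a line in the first style from a list of family names follows. The helper name urxvt_font_resource is made up for illustration, and whether one uniform size looks right depends on the fonts involved.

```python
def urxvt_font_resource(families, size):
    """Build a URxvt.font line with an explicit xft: prefix and size per entry."""
    entries = [f"xft:{family}:size={size}" for family in families]
    return "URxvt.font: " + ",".join(entries)


if __name__ == "__main__":
    # Family names are taken from this thread; the helper itself is hypothetical.
    print(urxvt_font_resource(["IBM Plex Mono", "Noto Sans Mono CJK SC"], 12))
```

The printed line can be pasted into .Xresources and loaded with xrdb before starting a new terminal.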
Now, when pasting in the 汤 glyph, or a kanji like 山, or even Hangeul like 한글, this all works correctly. I also managed to configure ibus and mozc such that I can change input methods between Japanese and English with a key combination; this works on all applications I have tested on, including urxvt (I do have a problem relating to starting ibus-daemon at boot, but I will continue further research and perhaps make a separate topic on this later).
I do have one outstanding problem, unfortunately.
Importantly, given IBM Plex has support for a number of scripts (which can be tested here), I wanted to also be able to display the ones I may require. For instance, the website cited can display ภาษาไทย without any rendering issues, but pasting this set of Thai characters into urxvt renders empty squares.
I also tested this by pasting the Devanagari characters मानक हिन्दी given that locale -a displays the following output:
C
C.UTF-8
en_GB.utf8
en_US.utf8
hi_IN
hi_IN.utf8
ja_JP.utf8
POSIX
On IBM's type tester, मानक हिन्दी works without issue but urxvt only renders empty squares. Given the font declaration in .Xresources appears to work (and show Japanese glyphs), I am not sure why characters from, for instance, Hindi, are not rendering. If you could provide any pointers with respect to this, I would really appreciate it.
Last edited by 山猿 (2023-07-27 19:15:32)
Your locale is irrelevant to all of this. A non-UTF-8 locale would lead to encoding issues, but those look like a mangled "ü", not like empty squares.
Your font config adds "IBM Plex Mono ExtLt" (which looks like it's just a light variant) to "IBM Plex Mono", and quite frankly it does not look like either of them would provide any CJK glyphs. You probably want to add the Devanagari font to that.
Also the Devanagari stuff seems to be some ligatures? That's not gonna work.
You can search for installed fonts supporting the glyph you're interested in:
fc-list :charset=<unicode codepoint in hex here>
Eg.
fc-list :charset=0e20 # ภ
fc-list :charset=092e # म - I have no idea whether that actually relates to the ligatures you posted
Last edited by seth (2023-07-27 20:24:28)
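For anyone following along: the value after :charset= in the examples above is the character's Unicode code point written in hexadecimal. A small Python sketch, assuming you just want to turn pasted text into ready-made fc-list commands, could look like this. The function names charset_arg and fc_list_queries are hypothetical, and the script only prints the commands, it does not run them.

```python
def charset_arg(ch):
    """Return the code point of a single character as zero-padded hex."""
    return f"{ord(ch):04x}"


def fc_list_queries(text):
    """Return one fc-list command per distinct non-space character in text."""
    seen = []
    for ch in text:
        if not ch.isspace() and ch not in seen:
            seen.append(ch)
    return [f"fc-list :charset={charset_arg(ch)}" for ch in seen]


if __name__ == "__main__":
    # Characters taken from the examples in this thread.
    for command in fc_list_queries("ภ म い"):
        print(command)
```

Running each printed command in a shell then lists the installed fonts that cover that character.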
Thank you very much for your reply and help thus far.
IBM Plex Mono ExtLt is a light variant of IBM Plex Mono and you are correct that when I run, for instance:
fc-list :charset=3044
which is the code point for い, it does produce a number of fonts, but seemingly not the IBM one. I double-checked this by running the following (which yielded no results):
fc-list :charset=3044 | grep ".*IBM.*"
Interestingly, though, when I paste in or input い via the ibus/mozc IME, it does render in the terminal. I suspect a different font is being used to do so, given that it renders in a different colour and weight to everything else in the terminal.
Devanagari, being a script which relies heavily on ligatures, will then presumably not render in the terminal, though the above command with an Indic (Devanagari) code point does list a number of installed fonts. Just to check, I did try a few ways to force rendering, like:
printf '\xE0\xA4\x84'
But this simply renders a half-width empty square, though copying and pasting this into, say, a browser of course yields the actual character ऄ.
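To relate the bytes given to printf back to the hexadecimal code point that fc-list expects, the UTF-8 sequence can be decoded. This is only a sketch using the Python standard library; the function name describe_utf8 is made up for illustration.

```python
import unicodedata


def describe_utf8(hex_bytes):
    """Decode a hex string of UTF-8 bytes and describe each resulting character."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [(ch, f"{ord(ch):04x}", unicodedata.name(ch, "<unnamed>")) for ch in text]


if __name__ == "__main__":
    # E0 A4 84 are the three bytes passed to printf above.
    for ch, codepoint, name in describe_utf8("e0a484"):
        print(ch, codepoint, name)
```

The second field of each tuple is the value that could be passed to fc-list :charset= to see which installed fonts cover the character.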
I will now test with one of the mono fonts which did appear to be able to handle CJK glyphs and update this thread once I have done so. Thank you again very much for your help with this.
Last edited by 山猿 (2023-07-27 20:57:54)
This is just to confirm that the terminal now displays glyphs correctly, without (I suspect) substituting glyphs from another installed font that happens to cover the code point. I can both input and paste CJK characters correctly, without any empty squares.
Thank you once again for your help with this, seth; it was very valuable to me and much appreciated.
Ligatures kinda conflict w/ the concept of "monospace" but if you care, I copypasted "ऄ" into https://archlinux.org/packages/extra/x86_64/alacritty/ and that looks "ok" to me.
xterm eg. otoh seems to render only the first part of it.
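To illustrate why Devanagari is awkward for a grid of fixed-width cells, here is a rough Python sketch that splits the word हिन्दी from earlier in the thread into base letters and combining marks. It assumes the usual six-code-point spelling of that word, and the helper name split_marks is hypothetical. A terminal that draws roughly one code point per cell has no obvious place for the marks, which may be part of what xterm is showing when it renders only the first part.

```python
import unicodedata


def split_marks(word):
    """Return (base characters, combining marks) found in word."""
    bases, marks = [], []
    for ch in word:
        # Unicode general categories starting with "M" are combining marks.
        if unicodedata.category(ch).startswith("M"):
            marks.append(ch)
        else:
            bases.append(ch)
    return bases, marks


if __name__ == "__main__":
    # Assumed spelling of the Hindi word quoted earlier in the thread.
    word = "\u0939\u093f\u0928\u094d\u0926\u0940"
    for ch in word:
        print(f"{ord(ch):04x}", unicodedata.category(ch), unicodedata.name(ch))
    bases, marks = split_marks(word)
    print(len(bases), "base letters,", len(marks), "combining marks")
```

Rendering such a word properly needs a text shaping step that reorders and joins these pieces, which goes beyond picking a font that contains each code point.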
Thank you, that is actually quite valuable to know, and I may experiment further with alacritty for this purpose. I only need to render certain text which uses scripts like this, so anything which provides a semblance of the character is very useful. I was unaware of the conceptual conflict between monospace and ligatures, though it does make sense now that you have mentioned it.
Thank you again.