#1 2024-11-18 05:05:05

dimich
Member
From: Kharkiv, Ukraine
Registered: 2009-11-03
Posts: 263

xterm-395-1 and utf-8 in window title

After upgrading xterm to 395-1, I noticed that non-ASCII working directory names are displayed incorrectly in the xterm window title:
Xd16.png

The window title is updated by PROMPT_COMMAND from the default /etc/bash.bashrc, so I unset PROMPT_COMMAND and tried to update the title directly:

$ printf "\033]2;Тест\007"

and observed the same issue: the 8th bit is stripped from every byte of the UTF-8 characters.
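
To make the effect concrete, here is a minimal sketch that clears the eighth bit of every byte on stdin and prints the result in hex. The function name strip_high_bit is made up for illustration; this only mimics the symptom and is not xterm's actual code.

```shell
# Hypothetical helper: clear bit 7 of each input byte and print the bytes as hex.
strip_high_bit() {
    # od prints one two-digit hex value per input byte, separated by whitespace.
    for byte in $(od -An -v -tx1); do
        printf '%02x' "$(( 0x$byte & 0x7f ))"
    done
    printf '\n'
}

# "Тест" is d0a2 d0b5 d181 d182 in UTF-8 (assuming this file is saved as UTF-8).
printf 'Тест' | strip_high_bit
# prints 5022503551015102
```

What remains is a mix of plain ASCII letters and control characters, which would explain garbage in the title.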

This seems to be an effect of a change mentioned in the changelog:

minor fixes to work with vttest 20240929's 7-bit parsing test.

I tried to get utf-8 back by setting these resources in ~/.Xresources:
xterm*utf8Title: always
xterm*titleModes: 12
This has no effect. Of course, I reloaded the resources with xrdb after each change.

However, "xterm*titleModes: 3" allows setting a non-ASCII title if the UTF-8 string is converted to hex, e.g. with "xxd -p -c0".

Specifically for bash, I managed to get non-ASCII titles displayed correctly with the following changes:
In ~/.Xresources, set "xterm*titleModes: 3".
In /etc/bash.bashrc, change the PROMPT_COMMAND line to

PROMPT_COMMAND+=('printf "\033]0;%s\007" $(printf "%s@%s:%s" "${USER}" "${HOSTNAME%%.*}" "${PWD/#$HOME/\~}" | xxd -p -c0)')

But this is not a real solution:
1. It only fixes the display of the bash working directory.
2. It makes bash spawn extra processes on every prompt.

I'd like to get the backward-compatible behavior back, as it was in xterm-394. The manual page says:

Each bit (bit “0” is 1, bit “1” is 2, etc.) corresponds to one of the parameters set by the title modes control sequence:

0
    Set window/icon labels using hexadecimal
1
    Query window/icon labels using hexadecimal
2
    Set window/icon labels using UTF-8 (gives the same effect as the utf8Title resource).
3
    Query window/icon labels using UTF-8

In the value "12", bits 2 and 3 are set, so why doesn't "xterm*titleModes: 12" work? Is this a bug in xterm, or am I misunderstanding the documentation?
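
To double-check the bit arithmetic, here is a small sketch that decodes a titleModes value into the four bits quoted from the manual page. The function name and the short labels (set-hex, query-hex, set-utf8, query-utf8) are shorthand invented here, not xterm identifiers.

```shell
# Hypothetical helper: list which of the four documented titleModes bits are set.
decode_title_modes() {
    value=$1
    bit=0
    # Labels follow the order of bits 0..3 in the manual page excerpt.
    for name in set-hex query-hex set-utf8 query-utf8; do
        if [ "$(( ($value >> $bit) & 1 ))" -eq 1 ]; then
            printf 'bit %d (%s)\n' "$bit" "$name"
        fi
        bit=$(( bit + 1 ))
    done
}

decode_title_modes 12
# bit 2 (set-utf8)
# bit 3 (query-utf8)

decode_title_modes 3
# bit 0 (set-hex)
# bit 1 (query-hex)
```

So by the documented numbering, 12 should select the two UTF-8 bits and 3 the two hexadecimal bits.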

#2 2024-11-18 07:56:36

seth
Member
Registered: 2012-09-03
Posts: 58,698

Re: xterm-395-1 and utf-8 in window title

This might be

charsets.dat wrote:

+# VT520/VT525 manual p 7-2 explains "Russian" as KOI-7, though the dialect
+# is unknown.  Choose the one Kermit used.

Cross check

printf "\033]2;Ärchlinüx\007"
ctlseqs.txt wrote:

     Section 9 of ECMA-48, like DEC STD 070, chapter 3, goes into detail
-    to explain that when processing 7-bit controls, the eighth bit of
+    to explain that when processing 8-bit controls, the eighth bit of
     each byte is ignored.  This applies to the content of APC, DCS, OSC,
     and PM strings, as well as to the terminating bytes such as the two-
     byte string terminator.  Quoting from the latter, 3.5.4.5 GR Graphic
@@ -112,6 +112,23 @@
         7-bit equivalent.  (Note that this is the same way 8-bit graphic
         characters are handled within control sequences.)

+    The reason for that is because ECMA-48 presents 7-bit controls as an
+    alternative to 8-bit controls.  It says this:
+
+        The control functions defined in this Standard can be coded in a
+        7-bit code as well as in an 8-bit code; both forms of coded
+        representation are equivalent and in accordance with Standard
+        ECMA-35.
+
+    and in turn, ECMA-35 9.1 says
+
+        A 7-bit code shall have a structure which is based on a 7-bit
+        code table arranged in separate areas as follows (see figure 7):
+
+    In short, a standard-compliant implementation of ECMA-48 ignores the
+    eighth bit of bytes in control strings other than the C1 controls.
+    XTerm does this.

#3 2024-11-18 14:07:54

dimich
Member
From: Kharkiv, Ukraine
Registered: 2009-11-03
Posts: 263

Re: xterm-395-1 and utf-8 in window title

seth wrote:

Cross check

printf "\033]2;Ärchlinüx\007"

With PROMPT_COMMAND unset and neither xterm*titleModes nor xterm*utf8Title set in the resources:
Xd_l.png
The result is the same with xterm*titleModes=12, as well as with xterm*utf8Title=always.

With xterm*titleModes=3, the same command obviously doesn't work:
Xd_D.png

but it works as expected when the string is in hex (just to prove that it's not an issue with the window manager, etc.):

printf "\033]2;$(printf "Ärchlinüx" | xxd -p -c0)\007"

Xd_k.png
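
For anyone who wants to repeat this test without xxd installed, the same hex string can be produced with od and tr. This is only a sketch: the function names hex_encode and set_title_hex are made up here, and it assumes xterm*titleModes=3 is in effect, as in the test above.

```shell
# Hypothetical helper: hex-encode stdin as one unbroken lowercase string.
hex_encode() {
    od -An -v -tx1 | tr -d ' \n'
}

# Hypothetical helper: send the title as hex in an OSC 2 sequence.
# Assumes xterm is running with titleModes=3, so it expects hexadecimal.
set_title_hex() {
    printf '\033]2;%s\007' "$(printf '%s' "$1" | hex_encode)"
}

set_title_hex 'Ärchlinüx'
```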

LANG is "en_US.UTF-8".

seth wrote:

This might be

charsets.dat wrote:

+# VT520/VT525 manual p 7-2 explains "Russian" as KOI-7, though the dialect
+# is unknown.  Choose the one Kermit used.

There are no issues with xterm itself displaying UTF-8 in the terminal.

#4 2024-11-18 15:37:40

seth
Member
Registered: 2012-09-03
Posts: 58,698

Re: xterm-395-1 and utf-8 in window title

The ctlseqs.txt diff doesn't look like the cause, but it's not the Cyrillic table either.
I'm pretty sure dickey is the relevant one, and I just updated xterm and can confirm the regression.
