If you view binary data in the terminal, it'll typically behave strangely.
Example:
In the above example, two things act "strange" after viewing the binary data:
-the word "lode" in red became garbage
-any input I type becomes garbage
Typing reset fixes it.
Since this appears to be normal behavior in Linux (nobody appears to consider this a bug), but the terminal still allows you to view binary data, I wonder: is this specified? Is it specified which binary data symbol makes the terminal start displaying your input as weird characters? If nothing is specified about it, what's the reason why the makers of bash chose to let their terminal behave like this and not do "normal" after displaying binary data?
Offline
Usually a C program manipulating strings goes haywire if it processes a sequence with NUL in it (because NUL is the string terminator in C).
Bash is written in C.
Feel free to end the syllogism
If nothing is specified about it, what's the reason why the makers of bash chose to let their terminal behave like this and not do "normal" after displaying binary data?
It's due to the way C behaves with strings.
Last edited by carlocci (2008-09-21 21:51:57)
Offline
Uh... no. No no no. C quirks have nothing to do with this - I don't think so, at least.
Within terminal emulators, there are several character table sets (G0 and G1, plus G2 and G3 on VT220-compatible terminals). Each can contain a set of character glyphs, and one of these is loaded into GL so it's the one that's actively used. G0 is the set you typically have loaded. Certain escape sequences can switch the character table into and out of different modes; the mode you're describing above is called the "box drawing" or "line drawing" or "special characters" mode. This mode remaps certain characters to different glyphs so that operations such as drawing of box characters can be done by simply echoing ASCII characters to the screen.
Different environments require different behavior to select these drawing characters and put them into play. In X11, one needs to switch the mode desired into G0. At the console, G1 is typically already loaded with the box drawing character set, so all that's required is to switch GL into G1.
Where ^[ is meant to mean the escape character, ASCII code 27...
The sequence ^[(0 switches G0 into line drawing mode, as applicable for X11, and can be sent to the terminal via
echo -e '\e(0'
The sequence ^[(B switches G0 back into standard or normal mode, and can similarly be sent via
echo -e '\e(B'
In console mode, the key combination CTRL+N will send the ^N character to the terminal (defined in ANSI as SO, defined in POSIX to mean LS1), switching GL to the G1 character set, and CTRL+O will send ^O (defined in ANSI as SI, defined in POSIX to mean LS0), switching GL back to G0. However, these key combinations are interpreted to mean other things when pressed in a terminal environment, and only take effect when their character codes are echoed to the terminal: SO or LS1 is 14 decimal, 0xE hexadecimal, or 016 octal, and SI or LS0 is 15 / 0xF / 017. To achieve this, try the commands below (where words contained in < and > are intended to be pressed as key combos). To echo SO, send
echo <CTRL+V><CTRL+N>
And to echo SI, send:
echo <CTRL+V><CTRL+O>
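If the CTRL+V key combos are awkward (inside a script, for example), printf can emit the same two bytes from their octal codes. This is only a minimal sketch, and the function names shift_out and shift_in are made up for this illustration:

```shell
# Hypothetical helper names; each one writes a single control byte to stdout.
shift_out() { printf '\016'; }   # SO / LS1 (decimal 14): make G1 the active set
shift_in()  { printf '\017'; }   # SI / LS0 (decimal 15): make G0 the active set
```

At the console, something like `shift_out; echo lqqqk; shift_in` should print a short line-drawing fragment. In an X terminal emulator you would probably see plain letters unless G1 has first been loaded with the line drawing set, for instance with `printf '\033)0'`.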
Therefore, when you cat a binary stream of data, there is a good chance that one of the sequences 27, 40, 48 (^[(0, load the line drawing set into G0), 27, 40, 66 (^[(B, load the normal set back into G0), 14 (SO) or 15 (SI) will occur, in which case your terminal obeys these perfectly valid sequences, switching into or out of the box drawing character set and displaying whatever data follows using that character set.
The chances of this occurring are much higher in console mode, since this mode by default needs only a single byte to switch to the box drawing character set, a third of the bytes required by X11 terminal emulators, so console mode is more susceptible to this issue.
It is for this reason that terminals may also clear or exhibit other odd side effects: these are the result of different escape sequences being sent to the terminal. (The UNIX 'clear' command typically sends a longer sequence, but the sequence 27 99, that is ^[c, the full reset sequence, appears to erase the display, and I have personally seen this sequence sent to my terminal at least once.)
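To put rough numbers on the one-byte versus three-byte difference, here is a small sketch. It assumes the data is uniformly random, which real binaries are not, so treat the results as an order-of-magnitude estimate; the function name expected_hits is invented for this example.

```shell
# expected_hits BYTES SEQLEN
# Expected number of times one specific SEQLEN-byte sequence appears in
# BYTES bytes of uniformly random data (an assumption, see above).
expected_hits() {
    awk -v n="$1" -v k="$2" 'BEGIN { printf "%.4f\n", (n - k + 1) / (256 ^ k) }'
}

expected_hits 1048576 1   # a single byte such as SO, in 1 MiB
expected_hits 1048576 3   # a three-byte sequence such as ESC ( 0, in 1 MiB
```

Under that assumption, a single-byte trigger is expected thousands of times per megabyte, while a given three-byte sequence is expected far less than once.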
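To check whether a particular file actually contains any of the sequences mentioned above, something along these lines might help. It is a sketch, not a complete escape sequence scanner: count_switches and report are hypothetical names, and it assumes a grep that supports the -a and -o options (GNU and BSD grep do).

```shell
# count_switches FILE
# Count occurrences of the charset-switching and reset sequences in FILE.
count_switches() (
    file=$1
    esc=$(printf '\033')   # ESC, decimal 27
    so=$(printf '\016')    # SO,  decimal 14
    si=$(printf '\017')    # SI,  decimal 15

    report() {
        # $1 = label to print, $2 = literal byte sequence to look for
        n=$(LC_ALL=C grep -a -o -F -e "$2" -- "$file" | wc -l | tr -d ' ')
        printf '%s: %s\n' "$1" "$n"
    }

    report 'ESC ( 0' "${esc}(0"
    report 'ESC ( B' "${esc}(B"
    report 'ESC c'   "${esc}c"
    report 'SO'      "$so"
    report 'SI'      "$si"
)
```

Running `count_switches /bin/ls` (or any other binary) before cat'ing it gives an idea of how likely the terminal is to end up in line drawing mode.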
A fun example:
When in box drawing mode, the following keys are mapped to the following alternate box drawing glyphs:
j bottom right
m bottom left
k top right
l top left
q horiz line
x vert line
With that knowledge, run the following code and observe how a square box is drawn on the screen.
echo -e '\e(0lqqqqqqqqqqk\nx          x\nx          x\nx          x\nmqqqqqqqqqqj\e(B'
Or, a little more spaced out (split across several lines purely for readability):
echo -e '\e(0'
echo lqqqqqqqqqqk
echo 'x          x'
echo 'x          x'
echo 'x          x'
echo mqqqqqqqqqqj
echo -e '\e(B'
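The same idea can be wrapped in a small function so the box size becomes a parameter. This is only a sketch using the glyph mapping listed above; draw_box is a made-up name, and it assumes a printf that accepts a `*` field width (bash and dash both do).

```shell
# draw_box WIDTH HEIGHT
# Draw a box whose interior is WIDTH columns by HEIGHT rows.
draw_box() (
    width=$1
    height=$2
    edge=$(printf '%*s' "$width" '' | tr ' ' 'q')   # WIDTH horizontal-line glyphs
    blank=$(printf '%*s' "$width" '')               # WIDTH spaces

    printf '\033(0'                # load the line drawing set into G0
    printf 'l%sk\n' "$edge"        # top-left corner, top edge, top-right corner
    i=0
    while [ "$i" -lt "$height" ]; do
        printf 'x%sx\n' "$blank"   # vertical lines on both sides
        i=$((i + 1))
    done
    printf 'm%sj\n' "$edge"        # bottom-left corner, bottom edge, bottom-right corner
    printf '\033(B'                # restore the normal set in G0
)
```

For example, `draw_box 10 3` should reproduce the box from the one-liner above.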
Yes, that's how ncurses works too, in case you wondered.
Also, "reset" can be replaced by "tput reset" - this sends the same escape sequences (more escape sequences!) as "reset", but doesn't delay at all.
References:
- the "console_codes" manpage
- man dtterm(5), not available in Arch but mirrored at this ancient HP web url: http://h30097.www3.hp.com/docs/base_doc … 00____.HTM
For curious minds:
- Piping the output of something that uses escape sequences through a script like this PHP code (where " \e" is what you want to replace the escape character with; the backslash is doubled in the code because newer PHP versions treat "\e" inside double quotes as the escape character itself):
php -r 'echo str_replace(chr(27), " \\e", `tput reset`);'; echo
is an easy way to investigate what sequences are being sent to the terminal. Simple commands like "clear" and "tput reset" are good starting points. ("reset" sends some signals directly to the terminal emulator process itself, I think, NOT escape sequences, so it is beyond the scope of this post.)
- Additionally, instead of editing an application's output stream via PHP or another language, you can simply redirect it. Redirecting the output of a program to a file is a good way to poke about in what it sends, although beware that the files created with this method should only be looked at with simple editors like nano or e3, which don't try to have a go at figuring out the file content and/or "cleaning" it, like vi(m) or emacs might. NOTE that even when you're redirecting the output of a program to a file, it still accepts input! If your application, for example, accepts the key F10 to quit, press F10 after you think your app is done loading, and the command you ran to start the app (for example "mc > mc-output") should quit.
- Ncurses applications are a good place to learn about escape sequences without diving into source code, because the ncurses library uses a lot of undocumented escape sequences which do interesting things. You'll almost certainly want to redirect the output of the command to a file as per the method above, and additionally be prepared to do a LOT of digging around inside the output, as the escape sequences will be interspersed with a LOT of other characters, those being the program's perfectly normal output.
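For anyone without PHP at hand, the replacement trick from the first item can probably be done with sed alone. This is a rough equivalent rather than the exact same thing; show_escapes is a name invented for this sketch.

```shell
# show_escapes: filter that rewrites every ESC byte (decimal 27) on stdin
# as the visible text " \e" so the sequences can be read safely.
show_escapes() {
    esc=$(printf '\033')
    sed "s/${esc}/ \\\\e/g"
}
```

Typical use would be `tput reset | show_escapes; echo`, or `show_escapes < mc-output` on a file captured by redirection.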
Whew, that was quite a post. I'm glad that I had a good memory when I was somewhere between 8 and 11, when I learnt about DOS escape sequences. That was 6 to 9 years ago, and the knowledge has helped me handle UNIX's infinitely more complex escape sequences and find them actually learnable.
EDIT: Updated a bit of the text, fixed a typo, added more info
-dav7
Last edited by dav7 (2008-09-23 23:47:25)
Windows was made for looking at success from a distance through a wall of oversimplicity. Linux removes the wall, so you can just walk up to success and make it your own.
--
Reinventing the wheel is fun. You get to redefine pi.
Offline
Thanks dav7, I've used Unix based systems for the better part of 13 years, and I never knew that. Learn something every day.
Offline
If nothing is specified about it, what's the reason why the makers of bash chose to let their terminal behave like this and not do "normal" after displaying binary data?
It's because there's no way for the terminal to know you're displaying "binary data". It's just letters and characters, and how is the terminal supposed to know whether you're looking at ASCII art, text, or gibberish?
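Since the terminal can't tell, any filtering has to happen before the bytes reach it. A minimal sketch of that idea, assuming you only care about printable ASCII (printable_only is a made-up name):

```shell
# printable_only: filter that replaces every byte which is not printable
# ASCII, a newline, or a tab with a dot, so nothing reaches the terminal
# as a control character.
printable_only() {
    LC_ALL=C tr -c '[:print:]\n\t' '.'
}
```

Something like `printable_only < /bin/ls | head` then shows the readable parts of a binary without any chance of switching character sets.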
Offline
Thanks a lot dav7, that's the best explanation I could have imagined
Offline
Thank you dav7 for correcting my wrong explanation: that was amusing to read
Offline
Good information, that explains some behaviour on the serial console
Offline
I've wondered this myself. Thanks for all the info -- it makes an excellent bookmark.
Offline
Thanks for this explanation, I didn't know about this. (I, too, made a bookmark.)
Offline
Wow, cool
I just came back to fish the box drawing chars out of my own post and found all these unexpected responses
-dav7
Last edited by dav7 (2008-09-23 12:45:34)
Offline
Wow, it's on reddit/linux's main page !!
Last edited by Onwards (2008-09-23 14:18:51)
Offline
So why doesn't this happen with xterm, urxvt or gnome-terminal ?
The day Microsoft makes a product that doesn't suck, is the day they make a vacuum cleaner.
--------------------------------------------------------------------------------------------------------------
But if they tell you that I've lost my mind, maybe it's not gone just a little hard to find...
Offline
Onwards: Wow
moljac024: It does. All terminal emulators support escape sequences, and only very few (read: early video display-based systems from around the 1950s to the 1970s, maybe a few rare others) didn't support the box drawing character set. But all modern X terminal emulators do support it.
Note: I'm not entirely sure if LS0 and LS1 are defined in POSIX. I just guessumed that they were.
-dav7
Last edited by dav7 (2008-09-23 15:20:34)
Offline
The box drawing characters do work on urxvt, but after 'cat /dev/random', the terminal comes back to normal. I thought it was the shell I use (zsh), but on vc it doesn't come back to normal.
(lambda ())
Offline
True, I tried cat'ing binary files in urxvt and didn't see that behavior.
But I did get intermittent clears, and a few subtle alterations to my keymap, like I could only type slashes with numlock on.
Last edited by Arkane (2008-09-23 15:27:57)
What does not kill you will hurt a lot.
Offline
It behaves differently on a different computer of mine! New screenshot - different looking character set, and the red prompt always looks the same here
Offline
This time, you are using UTF-8 encoding. With recent installations, I have never had trouble with displaying binary data in terminal.
The difference makes the difference.
Offline