#1 2017-09-05 13:05:58

alma ata
Member
Registered: 2017-08-27
Posts: 26

[solved] file command not working

When I run the file command, it fails to display the encoding of text files.

Example: I have this folder:

Booklet-1.jpg:                               image/jpeg; charset=binary
Booklet-2.jpg:                               image/jpeg; charset=binary
Booklet-3.jpg:                               image/jpeg; charset=binary
Booklet-4.jpg:                               image/jpeg; charset=binary
Booklet-5.jpg:                               image/jpeg; charset=binary
Roland Kayn - Tektra - cd 1.jpg:             image/jpeg; charset=binary
Roland Kayn - Tektra - cd 2.jpg:             image/jpeg; charset=binary
Roland Kayn - Tektra - cd 3.jpg:             image/jpeg; charset=binary
Roland Kayn - Tektra - cd 4.jpg:             image/jpeg; charset=binary
Roland Kayn - Tektra - front and inside.jpg: image/jpeg; charset=binary
Roland Kayn - Tektra - front.jpg:            image/jpeg; charset=binary
Roland Kayn - Tektra - inside 2.jpg:         image/jpeg; charset=binary
Roland Kayn - Tektra - inside 3.jpg:         image/jpeg; charset=binary
Roland Kayn - Tektra - inside 4.jpg:         image/jpeg; charset=binary
Roland Kayn - Tektra - inside 5.jpg:         image/jpeg; charset=binary
Roland Kayn - Tektra - inside 6.jpg:         image/jpeg; charset=binary
Roland Kayn-Tektra.txt:                      text/plain; charset=us-ascii
Tektra-cover.jpg:                            image/jpeg; charset=binary
nuovo.txt:                                   text/plain; charset=us-ascii

It says that the encoding of the text files is us-ascii, but it should display UTF-8, at least for nuovo.txt, which I saved as UTF-8 to check whether the file command works properly.

Last edited by alma ata (2017-09-06 08:45:38)

#2 2017-09-05 13:14:00

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

Does nuovo.txt contain any non-ASCII characters? If it only contains characters from the original ASCII set (UTF-8 is identical to ASCII for the first 128 code points), `file` will simply detect it as ASCII.
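
A rough way to check this yourself is to count the bytes above 0x7f. This is only a sketch: `count_non_ascii` is a made-up helper name and the sample strings are invented for illustration.

```shell
# Count bytes outside the 7-bit ASCII range (0x80-0xff) on stdin.
# A result of 0 means the data is byte-for-byte the same in ASCII and UTF-8.
count_non_ascii() {
    # The outer $(( )) strips any padding that wc may add around the number.
    echo $(( $(LC_ALL=C tr -d '\000-\177' | wc -c) ))
}

ascii_only=$(printf 'Roland Kayn - Tektra\n' | count_non_ascii)
# "cafe" with an accented e: the last letter is the two bytes c3 a9 in UTF-8.
with_accent=$(printf 'caf\303\251\n' | count_non_ascii)
echo "$ascii_only $with_accent"   # prints "0 2"
```

For a real file you would redirect it into the function, e.g. `count_non_ascii < nuovo.txt`.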


#3 2017-09-05 13:42:57

alma ata
Member
Registered: 2017-08-27
Posts: 26

Re: [solved] file command not working

ayekat wrote:

Does nuovo.txt contain any non-ASCII characters? If it only contains characters from the original ASCII set (UTF-8 is identical to ASCII for the first 128 code points), `file` will simply detect it as ASCII.

It contains this text:

Roland Kayn - Tektra (1980-82), Cybernetic Music
4-CD-Box, Label: Barooni
I scanned all the booklet pages for your pleasure in very good quality,
so you can read/print them out (in b/w to keep size small). theres lots of information about his
works starting in 1950, and what his cybernetic music is about...
CD 1:
1. Tanar 1
2. Tanar 2
3. Etoral
CD 2:
1. Khyra 1
2. Khyra 2
3. Khyra 3
CD 3:
1. Tarego 1
2. Tarego 2
3. Tarego 3
4. Rhenit
CD 4:
1. Amarun 1
2. Amarun 2-I
3. Amarun 2-II
uploaded in 10
file under 20c-electroaccustic
from vogel
enjoy it...


But I have some doubts, because gvim doesn't display it properly: instead of starting a new line where it should, it shows "^M" characters.

Last edited by alma ata (2017-09-05 13:43:22)

#4 2017-09-05 13:50:02

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

That file looks like ASCII-only.

About the ^M, please provide the output of this command:

xxd nuovo.txt | head

#5 2017-09-05 14:34:39

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 73,194

Re: [solved] file command not working

UTF-8 and ASCII are identical in the lower 7 bits, so there's nothing to worry about here. Your newline issue is unrelated to that.
(If you want it to be detected as UTF-8, you need to add a Unicode BOM, but that will knock out some editors.)
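
For reference, the UTF-8 BOM is the three bytes ef bb bf at the very start of the file. A minimal sketch that writes one into a throwaway temp file (the file and its content are made up for illustration) and prints the first bytes in hex:

```shell
# Write a UTF-8 BOM (octal 357 273 277 = hex ef bb bf) followed by plain ASCII text.
bom_file=$(mktemp)
printf '\357\273\277hello\n' > "$bom_file"

# Dump only the first three bytes as hex and squeeze out od's spacing.
first_bytes=$(od -An -tx1 -N3 "$bom_file" | tr -d ' \t\n')
echo "$first_bytes"   # prints "efbbbf"

rm -f "$bom_file"
```

Whether a given tool or editor copes with those extra bytes is something you would have to try out.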

"^M" means CR/LF (and no NL) - very likely the result of a windows editor.
Try "pacman -S dos2unix".

#6 2017-09-05 15:13:24

alma ata
Member
Registered: 2017-08-27
Posts: 26

Re: [solved] file command not working

ayekat wrote:

That file looks like ASCII-only.

About the ^M, please provide the output of this command:

xxd nuovo.txt | head
00000000: 526f 6c61 6e64 204b 6179 6e20 2d20 5465  Roland Kayn - Te
00000010: 6b74 7261 2028 3139 3830 2d38 3229 2c20  ktra (1980-82),
00000020: 4379 6265 726e 6574 6963 204d 7573 6963  Cybernetic Music
00000030: 0d34 2d43 442d 426f 782c 204c 6162 656c  .4-CD-Box, Label
00000040: 3a20 4261 726f 6f6e 690d 4920 7363 616e  : Barooni.I scan
00000050: 6e65 6420 616c 6c20 7468 6520 626f 6f6b  ned all the book
00000060: 6c65 7420 7061 6765 7320 666f 7220 796f  let pages for yo
00000070: 7572 2070 6c65 6173 7572 6520 696e 2076  ur pleasure in v
00000080: 6572 7920 676f 6f64 2071 7561 6c69 7479  ery good quality
00000090: 2c0d 736f 2079 6f75 2063 616e 2072 6561  ,.so you can rea

seth wrote:

Your newline issue is unrelated to that.

It is related to gvim;
other text editors read the file properly.

"^M" means CR/LF (and no NL) - very likely the result of a windows editor.

Yes, I think so.

#7 2017-09-05 15:20:05

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 73,194

Re: [solved] file command not working

Nope, MacOS - there's only a CR, no LF
You can use sed or tr to convert \r to \n to fix this.
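
A sketch of the tr variant, run on a small made-up sample instead of the real file (the temp files and their content are only for illustration):

```shell
# Stand-in for nuovo.txt: two lines terminated by CR only.
src=$(mktemp)
dst=$(mktemp)
printf 'CD 1:\r1. Tanar 1\r' > "$src"

# Translate every CR (0d) into LF (0a); write to a new file, not in place.
tr '\r' '\n' < "$src" > "$dst"

lines=$(( $(wc -l < "$dst") ))                               # wc -l counts LF bytes
remaining_cr=$(( $(LC_ALL=C tr -cd '\r' < "$dst" | wc -c) )) # CRs left over
echo "$lines $remaining_cr"   # prints "2 0"

rm -f "$src" "$dst"
# For the real file, something like: tr '\r' '\n' < nuovo.txt > nuovo-unix.txt
# (nuovo-unix.txt is just an example name)
```

Note that this only makes sense for CR-only files: on a CRLF file it would turn every line break into two LFs.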

I'm pretty sure gvim can handle CR-only - at least vim can.
http://vim.wikia.com/wiki/File_format

#8 2017-09-05 15:20:12

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

Actually, ^M means just CR. You can see ^M at the end of a line if it's DOS-formatted (CRLF), but in your case (no linebreak), it's just CR (as can be seen in the xxd output).

That's pretty odd - if I'm not mistaken, old Mac OS versions (<=9) used to have that, but AFAIK they switched to UNIX-style linebreaks (LF) from OS X on. How was that file created, exactly?

The dos2unix package seth mentioned earlier comes with a `mac2unix` command; you could try to fix it with that.

--edit: Before you fix the file, what's the output of this?

xxd nuovo.txt | tail

Perhaps if the file is only CR or only CRLF throughout, (g)vim handles it correctly, otherwise it will start doing weird stuff as can be seen here.
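
If you want to check whether a file really uses one style throughout, counting the CR (0d) and LF (0a) bytes is a quick heuristic. This is a sketch only: `line_endings` is a made-up helper name, and equal CR and LF counts are simply assumed to mean CRLF.

```shell
# Classify the line endings of the data on stdin by counting CR and LF bytes.
# Simplification: as many CRs as LFs is taken to mean CRLF throughout.
line_endings() {
    tmp=$(mktemp)
    cat > "$tmp"
    cr=$(( $(LC_ALL=C tr -cd '\r' < "$tmp" | wc -c) ))
    lf=$(( $(LC_ALL=C tr -cd '\n' < "$tmp" | wc -c) ))
    rm -f "$tmp"
    if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then echo none
    elif [ "$cr" -eq 0 ]; then echo LF
    elif [ "$lf" -eq 0 ]; then echo CR
    elif [ "$cr" -eq "$lf" ]; then echo CRLF
    else echo mixed
    fi
}

printf 'Hello\rWorld\r'     | line_endings   # prints "CR"
printf 'Hello\r\nWorld\r\n' | line_endings   # prints "CRLF"
printf 'Hello\nWorld\n'     | line_endings   # prints "LF"
```

On the real file that would be `line_endings < nuovo.txt`.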

Last edited by ayekat (2017-09-05 15:22:43)


#9 2017-09-05 15:34:59

alma ata
Member
Registered: 2017-08-27
Posts: 26

Re: [solved] file command not working

ayekat wrote:

Actually, ^M means just CR. You can see ^M at the end of a line if it's DOS-formatted (CRLF), but in your case (no linebreak), it's just CR (as can be seen in the xxd output).

That's pretty odd - if I'm not mistaken, old Mac OS versions (<=9) used to have that, but AFAIK they switched to UNIX-style linebreaks (LF) from OS X on. How was that file created, exactly?

The dos2unix package seth mentioned earlier comes with a `mac2unix` command; you could try to fix it with that.

--edit: Before you fix the file, what's the output of this?

xxd nuovo.txt | tail

Perhaps if the file is only CR or only CRLF throughout, (g)vim handles it correctly, otherwise it will start doing weird stuff as can be seen here.

00000190: 6f20 310d 322e 2054 6172 6567 6f20 320d  o 1.2. Tarego 2.
000001a0: 332e 2054 6172 6567 6f20 330d 342e 2052  3. Tarego 3.4. R
000001b0: 6865 6e69 740d 4344 2034 3a0d 312e 2041  henit.CD 4:.1. A
000001c0: 6d61 7275 6e20 310d 322e 2041 6d61 7275  marun 1.2. Amaru
000001d0: 6e20 322d 490d 332e 2041 6d61 7275 6e20  n 2-I.3. Amarun
000001e0: 322d 4949 0d75 706c 6f61 6465 6420 696e  2-II.uploaded in
000001f0: 2031 300d 6669 6c65 2075 6e64 6572 2032   10.file under 2
00000200: 3063 2d65 6c65 6374 726f 6163 6375 7374  0c-electroaccust
00000210: 6963 0d66 726f 6d20 766f 6765 6c0d 656e  ic.from vogel.en
00000220: 6a6f 7920 6974 2e2e 2e                   joy it...

Yes, vi and vim don't show this file correctly.

I don't know how this file was created; I found it when I downloaded a music album.

As I said, other text editors like mousepad have no problem with this file; gvim does.

#10 2017-09-05 15:58:19

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

Alright, it kind of makes sense now: you used the `--mime` option, so you couldn't see the comment of `file`:

$ printf 'Hello\rWorld' > test.txt
$ xxd test.txt
00000000: 4865 6c6c 6f0d 576f 726c 64              Hello.World
$ file test.txt
test.txt: ASCII text, with CR line terminators
$ file --mime test.txt
test.txt: text/plain; charset=us-ascii

Opened in vim, it does indeed show the carriage returns as ^M.
However, as explained in the article linked by seth, if the `ffs` option contains `mac` (which it does not by default), the line endings should be displayed correctly:

  1. Run vim

  2. :set ffs=unix,dos,mac

  3. :e nuovo.txt

  4. tadaaa!
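
To avoid typing step 2 every time, the setting could go into a vimrc. A small sketch: `enable_mac_ff` is a made-up helper name, and the default path `~/.vimrc` is an assumption, so pass your own path if yours lives elsewhere.

```shell
# Append the setting to a vimrc unless the exact line is already there.
# The default path is an assumption; pass another one as the first argument.
enable_mac_ff() {
    vimrc=${1:-"$HOME/.vimrc"}
    line='set fileformats=unix,dos,mac'   # "ffs" is the short name of this option
    grep -qxF "$line" "$vimrc" 2>/dev/null || printf '%s\n' "$line" >> "$vimrc"
}

# Example call (commented out so that pasting this block changes nothing):
# enable_mac_ff ~/.vimrc
```

Running it twice should leave only one copy of the line in the file.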


#11 2017-09-05 19:49:28

alma ata
Member
Registered: 2017-08-27
Posts: 26

Re: [solved] file command not working

ayekat wrote:

but in your case (no linebreak), it's just CR (as can be seen in the xxd output).

how did you see it? i'm not into xxd

ayekat wrote:

Alright, it kind of makes sense now: you used the `--mime` option, so you couldn't see the comment of `file`:

i didn't use the -mime option
where did you see that i used that option?


  1. Run vim

  2. :set ffs=unix,dos,mac

  3. :e nuovo.txt

  4. tadaaa!

This method works.
Other text editors don't need anything extra;
vim needs this workaround.

Tomorrow I will read the article posted above.

Last edited by alma ata (2017-09-05 20:52:52)

#12 2017-09-05 20:00:54

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 73,194

Re: [solved] file command not working

setting ffs is required because it's set through the "nocompatible" call in /usr/share/vim/vimfiles/archlinux.vim
CR is 0d, LF is 0a

#13 2017-09-06 06:48:29

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

alma ata wrote:

how did you see it? i'm not into xxd

xxd prints a hexadecimal representation of the data with a "human-readable" representation next to it. Looking at the "newlines", we can see the following:

00000210: 6963 0d66 726f 6d20 766f 6765 6c0d 656e  ic.from vogel.en
               ^^                         ^^         ^          ^

There is also the `hexdump` command, which performs a similar task.
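
If neither is at hand, `od` can give a similar view. A sketch using the same 16 bytes as in the line above, typed in by hand rather than read from the file:

```shell
# Dump the bytes as two-digit hex values and squeeze out od's spacing.
hex=$(printf 'ic\rfrom vogel\ren' | od -An -tx1 | tr -d ' \t\n')
echo "$hex"   # prints "69630d66726f6d20766f67656c0d656e"

# Count how often the CR byte (0d) occurs in the same data.
crs=$(( $(printf 'ic\rfrom vogel\ren' | LC_ALL=C tr -cd '\r' | wc -c) ))
echo "$crs"   # prints "2"
```

The two 0d bytes are the same ones marked with ^^ above, and there is no 0a anywhere.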

i didn't use the -mime option

See my invocation of `file` again: I need to pass --mime in order to get an output like

nuovo.txt: text/plain; charset=us-ascii

Check the output of

which file
pacman -Qo $(which file)

I suspect you have either defined an alias somewhere, or you are using a different version of `file`.

seth wrote:

setting ffs is required because it's set through the "nocompatible" call in /usr/share/vim/vimfiles/archlinux.vim

When using nocompatible, ffs is set to `unix,dos` on my machine by default (and the file is not properly displayed either).

ayekat goes and fixes this in his vimrc. --edit done

Last edited by ayekat (2017-09-06 07:20:47)


#14 2017-09-06 06:55:32

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 73,194

#15 2017-09-06 08:05:24

alma ata
Member
Registered: 2017-08-27
Posts: 26

Re: [solved] file command not working

ayekat wrote:

See my invocation of `file` again: I need to pass --mime in order to get an output like

nuovo.txt: text/plain; charset=us-ascii

Check the output of

which file
pacman -Qo $(which file)

I suspect you have either defined an alias somewhere, or you are using a different version of `file`.

Yes, now I realize that I did use the --mime option. I needed it to see the encoding.
I expected to find something like UTF-8 or ISO-etc., but I only found us-ascii.

Last edited by alma ata (2017-09-06 08:05:37)

#16 2017-09-06 08:25:55

ayekat
Member
Registered: 2011-01-17
Posts: 1,631

Re: [solved] file command not working

Please don't forget to mark your thread as solved by editing your first post and prepending [SOLVED] to its title.

