#1 2016-06-08 09:56:57

gladixy
Member
Registered: 2014-03-12
Posts: 15

[Solved] Copying file over ssh -> Encoding mixed up

Hey,

I have this text file on a server.

file -i histories.txt

gives "histories.txt: text/plain; charset=utf-8".

When I run

scp remoteServer:histories.txt .

and run 

file -i histories.txt

on my local machine I'm getting "histories.txt: application/octet-stream; charset=binary".

Output of

locale

on the remote machine:

LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Output of 

locale

on the local machine:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Can somebody help me or give me a pointer?

Last edited by gladixy (2016-06-08 22:09:11)

#2 2016-06-08 10:14:03

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: [Solved] Copying file over ssh -> Encoding mixed up

Have you checked if the md5sum or any other hash matches between the file on the server and the one you just copied?
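
A minimal sketch of that comparison, assuming both copies are available as local paths (for the remote copy, the digest could instead be obtained with something like `ssh remoteServer md5sum histories.txt`). The helper name `same_md5` is hypothetical and not from the thread:

```shell
# Hypothetical helper: succeeds only if both files exist and share one MD5 digest.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2    # 2 = a file is missing or unreadable
    a=$(md5sum < "$1" | cut -d' ' -f1)        # keep only the digest column
    b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]                           # 0 = same digest, 1 = different
}
```

Matching digests mean the bytes are the same on both sides, so any difference in `file` output would have to come from the tool rather than from the transfer.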


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

#3 2016-06-08 10:56:20

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523

Re: [Solved] Copying file over ssh -> Encoding mixed up

If the checksum above is the same, then check the magic numbers on each machine:

xxd -l2 histories.txt

It may be that the files are the same but different versions of `file` on different distros are interpreting the filetype differently: one recognizing the magic numbers, the other not and falling back on the filename.
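
As a rough illustration of the first step: where `xxd` is not installed (it is commonly packaged with Vim), `od` can print the same leading bytes. The function name `first_bytes_hex` is made up for this sketch:

```shell
# Hypothetical helper: print the first N bytes of a file as hex (N defaults to 2).
first_bytes_hex() {
    # -An: no offset column, -v: never collapse repeated lines,
    # -tx1: one hex byte per unit, -N: stop after N bytes
    od -An -v -tx1 -N "${2:-2}" "$1" | tr -d ' \n'
    echo
}
```

If both machines print the same bytes but `file -i` still disagrees, comparing `file --version` on each side is a cheap next check.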


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman

#4 2016-06-08 11:15:00

gladixy
Member
Registered: 2014-03-12
Posts: 15

Re: [Solved] Copying file over ssh -> Encoding mixed up

Thank you for the quick replies.

Both md5sum and xxd give identical results. I didn't know xxd before. It gives

00000000: 2873                                     (s

The "(s" looks a bit weird at first sight.

I noticed the following: When I create a text file by hand (e.g. echo some > myFile) then this file is copied with the charset being preserved. So I am assuming it must have something to do with the content of histories.txt.

Last edited by gladixy (2016-06-08 11:19:12)

#5 2016-06-08 11:18:27

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523

Re: [Solved] Copying file over ssh -> Encoding mixed up

gladixy wrote:

So I am assuming it must have something to do with the content of histories.txt.

Ok, then what is the content?  Presumably there are some wide-chars and/or non-ascii chars.

Why does the "(s" look weird?  What's in the file?  It looks like everything is fine.  Both files are identical.  You just have different versions of `file` running on (I suspect) different distros with different configurations.

Is there an actual problem here or were you just curious about the different `file` output?


#6 2016-06-08 11:20:35

gladixy
Member
Registered: 2014-03-12
Posts: 15

Re: [Solved] Copying file over ssh -> Encoding mixed up

Regarding "(s": it just looked a bit random to me.

Yes, this issue represents an actual problem for me.

Just found some non-ascii chars in histories.txt. Strongly assuming that these are the culprit.
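
One way to locate such bytes, sketched under the assumption that "suspect" means anything other than printable ASCII and ASCII whitespace. Both function names are hypothetical:

```shell
# Hypothetical helper: count bytes that are neither printable ASCII nor whitespace.
count_suspect_bytes() {
    # In the C locale every byte is one character, so deleting the printable
    # and whitespace classes leaves only control bytes and bytes >= 0x80.
    LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c | tr -d ' '
}

# Hypothetical helper: list the lines that contain such bytes, with line numbers.
show_suspect_lines() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

A count of 0 means the file is plain ASCII text. Control characters are counted as well, not just accented letters. If the file contains NUL bytes, GNU grep may only report a binary match unless `-a` is added.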

Last edited by gladixy (2016-06-08 11:21:12)

#7 2016-06-08 12:54:29

gladixy
Member
Registered: 2014-03-12
Posts: 15

Re: [Solved] Copying file over ssh -> Encoding mixed up

Removed the non-ASCII chars. Still no luck :(

In case somebody is interested, I uploaded the file: http://s000.tinyupload.com/index.php?fi … 6721316111

#8 2016-06-08 12:55:44

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523

Re: [Solved] Copying file over ssh -> Encoding mixed up

That is probably not a good filesharing site to use.  It "converts" your file between the upload and download.  And in this case it just completely failed and I just get an error trying to download anyways.  (edit: got it now)

I'm not sure why the "(s" seemed odd to you; those are the first two characters in the file.  Still, is there a problem here, or what is your actual question?


#9 2016-06-08 14:30:12

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: [Solved] Copying file over ssh -> Encoding mixed up

gladixy wrote:

Both md5sum and xxd give identical results.

Then your problem is not the ssh transfer; it is a difference between the versions of the programs you use on the server and on your machine, just like Trilby mentioned before.

You might be able to force your text editor to open the file as a text file, although it might be inconvenient if you have to open it often. On the other hand, if we are talking about cli utils then they don't (or shouldn't) care about the detected type of file.


#10 2016-06-08 22:08:29

gladixy
Member
Registered: 2014-03-12
Posts: 15

Re: [Solved] Copying file over ssh -> Encoding mixed up

Actually you were right. My problem was solely connected to the non-ASCII chars. It's working now despite

file -i histories.txt

reporting a binary file.
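
For anyone who wants a check that does not depend on the `file` version, here is a rough sketch that asks `iconv` whether the bytes decode as UTF-8. The function name `is_valid_utf8` is an assumption for illustration, and this only tests decodability, not whether the text is meaningful:

```shell
# Hypothetical helper: succeeds if every byte sequence in the file decodes as UTF-8.
is_valid_utf8() {
    # Converting UTF-8 to UTF-8 changes nothing, but iconv exits non-zero
    # when it meets a sequence that is invalid in the source encoding.
    iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1
}
```

Usage would look like `is_valid_utf8 histories.txt && echo "decodes as UTF-8"`. A file may pass this check and still be labelled binary by `file`, for instance if it contains control characters.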
