#1 2011-03-03 22:39:52

knob
Member
Registered: 2009-01-30
Posts: 48

When it comes to binary files ...

I can't say I have a good grasp of the real differences between files written in text mode and files written in binary mode.

For (C) example:

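#include <stdio.h>

struct data
{
    int a, b;
};

int main(void)
{
    struct data d;
    char s[14] = "hello world!\n";   /* 13 characters plus the terminating '\0' */
    FILE *fd;

    d.a = 432;
    d.b = 581924;

    /* text mode: "w" is the standard spelling, "wt" is a compiler extension */
    fd = fopen("test.txt", "w");
    if (fd == NULL)
        return 1;
    fwrite(&d, sizeof(struct data), 1, fd);   /* in C the type is named "struct data" */
    fwrite(s, sizeof(char), 14, fd);
    fclose(fd);

    /* binary mode: bytes are written exactly as they are in memory */
    fd = fopen("test.bin", "wb");
    if (fd == NULL)
        return 1;
    fwrite(&d, sizeof(struct data), 1, fd);
    fwrite(s, sizeof(char), 14, fd);
    fclose(fd);

    return 0;
}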

test.txt hexdump:

B0 01 00 00 24 E1 08 00 68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D 0A 00

test.bin hexdump:

B0 01 00 00 24 E1 08 00 68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0A 00

What I know (and see):

  • strings containing '\n', when written to a file in text mode, have the '\n' replaced by CR+LF, LF or CR depending on the platform; in binary mode it is left unchanged

  • binary files are used to store an image of the memory used by a program; so if a chain of structures is written to a binary file in a known order, that chain can be restored on the next startup of the program by reading it back in the same order, with no parsing or type conversion needed

Am I right? Other than these things what else should I know?

The reason I want to clear these things up is that I have a file which contains the data collected and interpreted by a counter device from some sensors.
I believe this file is the binary footprint of the memory used to hold the recordings because I can't decipher the hexdump.

Now, I have a program to read this type of file and see the recordings and export them to a file in a human readable format (and no, it doesn't have a command line argument for this so that I can make a script). The thing is, I want to put all the recordings in a database and manipulate them further for all kinds of statistics and I don't want to waste time exporting whenever there is new data to be inserted.

I have no clue whether what I'm saying has any truth to it, so I would be grateful if someone could correct or enlighten me on how I should approach this ... is there a way to "parse" the binary file directly without knowing the types of the placeholders for the recordings?

Last edited by knob (2011-03-03 22:41:04)

#2 2011-03-03 22:46:26

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,358

Re: When it comes to binary files ...

knob wrote:

Now, I have a program to read this type of file and see the recordings and export them to a file in a human readable format (and no, it doesn't have a command line argument for this so that I can make a script). The thing is, I want to put all the recordings in a database and manipulate them further for all kinds of statistics and I don't want to waste time exporting whenever there is new data to be inserted.

There are literally a million ways to produce the binary format, you'd need a translator for the specific format you're using. Interpreting binary yourself isn't really possible without either knowing the format or testing over and over with known inputs.


Allan-Volunteer on the (topic being discussed) mailing lists. You never get the attention of the people who matter on the forums.
jasonwryan-Installing Arch is a measure of your literacy. Maintaining Arch is a measure of your diligence. Contributing to Arch is a measure of your competence.
Griemak-Bleeding edge, not bleeding flat. Edge denotes falls will occur from time to time. Bring your own parachute.

#3 2011-03-03 23:05:06

knob
Member
Registered: 2009-01-30
Posts: 48

Re: When it comes to binary files ...

Yeah, inspecting the program's GUI and looking over the binary code to start making input assumptions/testing is like looking for a needle in a haystack.
I was thinking that there might be a standard, easier way of doing this which I'm not aware of.

As for the differences between the two modes, do you have anything to add or correct?

I appreciate your input.

#4 2011-03-04 01:51:33

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,358

Re: When it comes to binary files ...

knob wrote:

Yeah, inspecting the program's GUI and looking over the binary code to start making input assumptions/testing is like looking for a needle in a haystack.
I was thinking that there might be a standard, easier way of doing this which I'm not aware of.

As for the differences between the two modes, do you have anything to add or correct?

I appreciate your input.

You simply do not understand, and you do not understand because you have not searched around for information. The forums aren't meant to be a place for asking open-ended questions which require your own study.

Besides, no one could possibly give you an answer because your question is all wrong. You do not know the format (binary just means it's a collection of bits) nor the compression.


#5 2011-03-04 17:30:44

stqn
Member
Registered: 2010-03-19
Posts: 1,191

Re: When it comes to binary files ...

You're right about what happens when a "text" file is written to disk. However, files ("text" or "binary") can contain anything, not necessarily a "memory footprint". (Edit: well, technically, anything you write to disk comes from memory... but it doesn't have to come from a struct or an array. It doesn't matter anyway...)

That said, as far as I understand it, your problem is that your "counter device" uses an undocumented file format and closed-source software. First step would be to ask the developer for documentation and source code.

If the developer doesn't want to give the required information or update the software as needed, you'll have to reverse-engineer the file format, either as previously suggested by "guessing" the contents, or by decompiling the executables that access the files.

Last edited by stqn (2011-03-04 17:36:59)
