
#1 2011-03-01 10:32:20

toad
Member
From: if only I knew
Registered: 2008-12-22
Posts: 1,775

[SOLVED] - get number of keystrokes of multiple documents on CLI

Hi,

Situation:
I have a directory with about 140 soft links to docs from a variety of sources.

Goal:
Enter one bash command and get the combined number of keystrokes (characters) across all docs in this directory.

What I have done:
Went to the directory, ran "wc -m *" and got some strange readings. Here is an example:


22079 001_Bacalhau_Gomes_RM_inglese.doc - but if I open the document itself, it reports 2370 characters, not 22079!?

Does wc not work on docs? If not, how can I batch convert the linked docs to txt, if that is possible at all?

EDIT:
I converted one doc into a txt file manually and got a more realistic reading:
doc - 2370
txt  - 2404

Last edited by toad (2011-03-01 11:17:00)


never trust a toad...
::Grateful ArchDonor::
::Grateful Wikipedia Donor::


#2 2011-03-01 11:07:30

Mr.Elendig
#archlinux@freenode channel op
From: The intertubes
Registered: 2004-11-07
Posts: 4,092

Re: [SOLVED] - get number of keystrokes of multiple documents on CLI

.doc is quite a "heavy" format with loads of metadata, and wc counts everything in the file, not just the text you typed. In fact, if you just open and close a .doc in MS Word without saving, the file size can increase, because Word records the access in the metadata (this depends on the version of Word).
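
To illustrate the point, here is a rough sketch with a stand-in file rather than a real .doc. The NUL padding is only a hypothetical substitute for the metadata and formatting bytes, and raw_count and visible_count are helper names made up for this example. It uses wc -c (bytes) to keep the numbers unambiguous; wc -m reads the same raw file, so it is inflated in a similar way.

```shell
# Count every byte in the file, which is roughly what wc sees in a .doc container.
raw_count() { wc -c < "$1" | tr -d ' '; }

# Count only the bytes left after dropping the NUL padding (our stand-in for metadata).
visible_count() { tr -d '\000' < "$1" | wc -c | tr -d ' '; }

demo=$(mktemp)
printf 'hello toad' > "$demo"          # 10 bytes of "real" text
head -c 100 /dev/zero >> "$demo"       # 100 bytes of filler, hypothetical stand-in
echo "raw: $(raw_count "$demo")  visible: $(visible_count "$demo")"
rm -f "$demo"
```

The raw count comes out far larger than the visible one, which is the same kind of gap as 22079 versus 2370 above.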


Evil #archlinux@libera.chat channel op and general support dude.


#3 2011-03-01 11:16:10

toad
Member
From: if only I knew
Registered: 2008-12-22
Posts: 1,775

Re: [SOLVED] - get number of keystrokes of multiple documents on CLI

Thanks, Mr. Elendig.

Solved this :)

1. install abiword
2. enter directory
3. abiword --to=text *
4. all done, I can now run wc on the txt files
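
For anyone who wants the steps above as a script, a minimal sketch follows. It assumes AbiWord is installed and that it writes each converted file next to its input with a .txt extension, which is worth checking when the inputs are soft links. It uses --to=txt, the format name listed in AbiWord's man page, rather than the --to=text spelling from step 3. convert_docs and total_chars are names made up for this example.

```shell
#!/usr/bin/env bash
# Convert every .doc in a directory to plain text, then total the characters.

convert_docs() {
    local dir=$1 f
    for f in "$dir"/*.doc; do
        [ -e "$f" ] || continue        # skip when the glob matches nothing
        abiword --to=txt "$f"          # assumed to write a .txt beside the input
    done
}

total_chars() {
    local dir=$1
    # Concatenate all .txt files and count the characters once.
    cat "$dir"/*.txt 2>/dev/null | wc -m | tr -d ' '
}
```

Usage would be something like: convert_docs . && total_chars .
Note that wc -m counts characters according to the current locale, and that newlines are included in the total, so the figure may still differ a little from what a word processor reports.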


