#1 2021-10-03 11:47:17

Togop
Member
Registered: 2018-10-01
Posts: 8

[Solved] Lesspipe not working

I'm trying to open pdf, docx, xlsx and other files on the command line. I have lesspipe installed, but it fails with errors like these:

lesspipe.sh file.pdf
==> append : to filename to view the PDF source
usage: html2text [-h] [--default-image-alt DEFAULT_IMAGE_ALT] [--pad-tables] [--no-wrap-links] [--wrap-list-items] [--ignore-emphasis]
                 [--reference-links] [--ignore-links] [--protect-links] [--ignore-images] [--images-as-html] [--images-to-alt]
                 [--images-with-size] [-g] [-d] [-e] [-b BODY_WIDTH] [-i LIST_INDENT] [-s] [--escape-all] [--bypass-tables] [--ignore-tables]
                 [--single-line-break] [--unicode-snob] [--no-automatic-links] [--no-skip-internal-links] [--links-after-para] [--mark-code]
                 [--decode-errors DECODE_ERRORS] [--open-quote OPEN_QUOTE] [--close-quote CLOSE_QUOTE] [--version]
                 [filename] [encoding]
html2text: error: unrecognized arguments: -from_encoding

lesspipe.sh file.xlsx
==> append : to filename to view the raw word document
usage: html2text [-h] [--default-image-alt DEFAULT_IMAGE_ALT] [--pad-tables] [--no-wrap-links] [--wrap-list-items] [--ignore-emphasis]
                 [--reference-links] [--ignore-links] [--protect-links] [--ignore-images] [--images-as-html] [--images-to-alt]
                 [--images-with-size] [-g] [-d] [-e] [-b BODY_WIDTH] [-i LIST_INDENT] [-s] [--escape-all] [--bypass-tables] [--ignore-tables]
                 [--single-line-break] [--unicode-snob] [--no-automatic-links] [--no-skip-internal-links] [--links-after-para] [--mark-code]
                 [--decode-errors DECODE_ERRORS] [--open-quote OPEN_QUOTE] [--close-quote CLOSE_QUOTE] [--version]
                 [filename] [encoding]
html2text: error: unrecognized arguments: -from_encoding

The above is the most frequent case; for some files it simply prints nothing. So far I haven't successfully opened a single non-text file. I do have libreoffice and pandoc installed, and I don't know what the problem could be.

Edit: This seems to be a bug in lesspipe; I was able to fix it by editing the script.

Last edited by Togop (2021-10-03 21:08:08)

#2 2021-10-03 12:42:15

lmn
Member
Registered: 2021-05-09
Posts: 88

Re: [Solved] Lesspipe not working

I have not used lesspipe myself, but the README and man page suggest that lesspipe enables less to view other file formats, i.e. try `less file.pdf`.
Otherwise, could you share the output of

echo "$LESSOPEN"
which lesspipe.sh

Have you sourced your shellrc / restarted your shell after changing the environment?

Also, always describe what you have done; otherwise the only replies you get are guesswork at best.

Edit: fixed link

Last edited by lmn (2021-10-03 12:52:09)

#3 2021-10-03 20:41:07

Togop
Member
Registered: 2018-10-01
Posts: 8

Re: [Solved] Lesspipe not working

Well, less simply calls lesspipe.sh to decode the file, so I get the exact same error. It just gets printed in the less screen.

What I've done is install the lesspipe package, make sure I have the listed dependencies for the file formats I'm interested in (libreoffice-fresh, pandoc, pdftotext, pdftohtml), and try to read some files. I got the error instead of the file contents.

#4 2021-10-04 04:19:05

amaro
Member
From: xfce
Registered: 2014-05-09
Posts: 367

Re: [Solved] Lesspipe not working

Togop wrote:

Edit: This seems to be a bug in lesspipe; I was able to fix it by editing the script.

Same problem here. Would you share what exactly (which file and which lines) needs editing?

#5 2021-10-04 13:37:07

lmn
Member
Registered: 2021-05-09
Posts: 88

Re: [Solved] Lesspipe not working

I cannot speak for Togop, but I have tested lesspipe myself and can confirm this behavior.

The problematic lines seem to be in the `parsehtml` function in `/usr/bin/lesspipe.sh`. One example:

html2text -utf8 2>/dev/null ||  html2text -from_encoding utf-8

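so the problem starts with the `-utf8` option, which the installed `html2text` does not accept. That first error is hidden by `2>/dev/null`, and the fallback then fails on `-from_encoding`, which is the error that gets printed.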

There are several differing implementations for html2text.
[1] https://github.com/Alir3z4/html2text
[2] http://www.mbayer.de/html2text
There are some more.

[1] is the implementation packaged in the repos that provides `html2text`, and [2] is another one explicitly mentioned in the README.
[2] does provide the `-from_encoding` option.
All of these are incompatible, as they have different flags and syntax, and lesspipe.sh is trying to accommodate at least two different ones (I can't say which).
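
A rough way to tell which implementation is on the PATH is to look at its usage text. The sketch below is only an illustration: the function name `classify_html2text_usage` is made up, and the markers it looks for are assumptions taken from this thread (the Python implementation prints long options such as `--ignore-links`, as in the usage output quoted in the first post, while the other one is said to provide `-from_encoding`).

```shell
#!/bin/sh
# Classify an html2text usage/help text read from stdin.
# Assumption: long options like --ignore-links indicate the Python
# implementation; a mention of -from_encoding indicates one that accepts it.
classify_html2text_usage() {
    usage=$(cat)
    case $usage in
        # Checked first, because the Python error message also echoes
        # "-from_encoding" when that argument is rejected.
        *--ignore-links*) echo "python-html2text" ;;
        *-from_encoding*) echo "from_encoding-capable" ;;
        *) echo "unknown" ;;
    esac
}
```

Something like `html2text -h 2>&1 | classify_html2text_usage` should work for the Python implementation, which lists `-h` in its usage; whether other implementations print their usage for `-h` is not confirmed here.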

When lesspipe does not detect `html2text` on the system, it falls back on another way of parsing files. I managed to get PDFs working by uninstalling `python-html2text`.
We should clarify whether this is a packaging error, in the sense that this optional dependency does not provide the expected functionality, or an upstream bug.

Also, there is already a GitHub issue for this.

PS:
I could fix it by essentially separating the dash from utf8, making it adhere to the syntax of the packaged `html2text`:

html2text - utf8 2>/dev/null ||  html2text -from_encoding utf-8

This was just for testing; please avoid meddling with packaged files.
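
One way to try the same change without touching the packaged file is to patch a private copy and point less at it. This is a minimal sketch, not something tested against every lesspipe version: the helper names `fix_html2text_call` and `make_patched_copy` and the destination path in the usage note are hypothetical, and the sed expression only covers the exact `html2text -utf8` spelling shown above.

```shell
#!/bin/sh
# Rewrite "html2text -utf8" to "html2text - utf8" in text read from stdin.
# Other html2text calls (e.g. the -from_encoding fallback) are left alone.
fix_html2text_call() {
    sed 's/html2text -utf8/html2text - utf8/g'
}

# Hypothetical helper: write a patched, executable copy of a script.
# $1 = original script, $2 = destination owned by the user
make_patched_copy() {
    fix_html2text_call < "$1" > "$2" && chmod +x "$2"
}
```

Possible usage, assuming `~/.local/bin` exists: `make_patched_copy /usr/bin/lesspipe.sh "$HOME/.local/bin/lesspipe.sh"`, followed by `export LESSOPEN="|$HOME/.local/bin/lesspipe.sh %s"` in the shell rc file so that less calls the copy.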

Edit: added warning

Last edited by lmn (2021-10-04 13:44:11)

#6 2021-10-05 14:33:50

amaro
Member
From: xfce
Registered: 2014-05-09
Posts: 367

Re: [Solved] Lesspipe not working

Thank you, lmn!

I could not uninstall 'python-html2text' because it's needed by calibre. I don't actually use calibre itself, but I keep it because I sometimes use its 'ebook-convert' tool.

Anyway, I edited the '/usr/bin/lesspipe.sh' file with your suggestion and got it to work.
