Is there any way to convert a bunch of html pages into a PDF? Do any of them work with pages spread out into multiple subdirectories? And is there a way to specify which page is set as the first page of the PDF?
Check out the CUPS article in the wiki. You can install a virtual printer which will print whatever you like to a PDF document.
thayer williams ~ cinderwick.ca
Are there any other solutions? I get the impression that with a virtual printer I would have to specify each file I wanted to include.
If you use KDE, you could open the file in Konqueror, and then print to PDF directly via the KDE print system
EDIT
I now see that you want to bundle multiple pages into one PDF. I'm not sure you can append to an existing PDF; I'm afraid not. Maybe PDFedit would allow that.
Last edited by Xinix (2008-04-24 08:20:35)
I think there is no easy solution here. HTML pages tend not to fit standard paper sizes. Maybe html2latex helps, but don't expect an out-of-the-box solution.
htmldoc from community
However, it only accepts plain old HTML (as far as I remember, HTML 3 or a subset of HTML 4).
So if you have JS, CSS, and all the modern stuff, the result will be far from what you're used to seeing in any modern browser.
Yeah, I tried htmldoc with both --webpage and --book, and it gave me a corrupt pdf file. On the command line, it gave me some message about a "bad header" or something like that. I'm going to look into html2latex, and see if that offers a solution. If all else fails, I might just write a script to do the conversion. That, and ask Zim's developer to add a PDF output feature to the program.
I tried htmldoc again, and had a little luck. I've got it creating a PDF from the html files in the top directory, and the html files in the directories immediately below (via wildcards), but I can't get it to use the html files from any of the lower directories. Is there any way around this, other than spelling out each and every subdirectory?
I don't know, but I have two ideas which might work. Firstly, can you use find/locate and xargs to find all the html files and feed them into htmldoc? This probably only works if htmldoc can take multiple files in one go.
Otherwise a simple bash script along the lines of:
#!/bin/bash
# Convert every HTML file below the current directory to its own PDF.
for i in $(find . -name '*.html'); do
    htmldoc --webpage -f "${i%.html}.pdf" "$i"
done
might work?
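The first idea above (find all the HTML files and feed them to htmldoc in one go) could be sketched like this. It assumes htmldoc accepts multiple input files in a single run, and book.pdf is just an illustrative output name:

```shell
#!/bin/bash
# Gather every .html file in this directory and all subdirectories.
# Sorting makes the page order (and therefore the first page of the
# resulting PDF) predictable instead of filesystem-dependent.
files=$(find . -name '*.html' | sort)
echo "$files"
# Then hand the whole list to htmldoc in one run, e.g.:
#   htmldoc --webpage -f book.pdf $files
```

If any filenames contain spaces, a `find . -name '*.html' -print0 | xargs -0` pipeline is safer than the plain variable expansion shown in the comment.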
Just ideas
Jack
Well, I wrote an overly complicated Ruby script to do something similar, and it worked. Unfortunately, htmldoc wasn't able to handle the special characters I was using in the documents.
EDIT: Zim's alternate output format is txt2tags. Would that be any easier to convert to PDF?
Last edited by Falcata (2008-04-24 20:46:01)
It may be overly complicated, but how about FOP:
http://xmlgraphics.apache.org/fop/
Duettaeánn aef cirrán Cáerme Gláeddyv. Yn á esseáth.
If you use KDE, you could open the file in Konqueror, and then print to PDF directly via the KDE print system
If you print from Acrobat, Firefox, and some other programs, you might not be able to use the KDE print system; the output goes directly to the CUPS server.
F
"do it good first, it will be faster than do it twice" -- the saint