#1 2011-08-30 04:11:39

B-80
Member
Registered: 2010-05-05
Posts: 47

Script to "print" ebook from website

I posted this on Stack Overflow but got no love and no help, so I figured I'd try here as well.


I bought an ebook for a class I am in. However, I really hate how the site is laid out, and it makes the book a pain to read. So I'd like to put the pages into some sort of text document or PDF.

Basically, I don't know the best way to go about pulling pages from the site. You have to be logged in to access the book, and I really have no idea how they authenticate (and I don't know whether they'd appreciate me messing around with that). So I need to write either a macro for my browser or a script that can control my browser after I am logged in to the site. I'd also like to point out that what I am trying to do is perfectly within the terms of service: I am allowed to print any pages I want, and I am just trying to automate that.

So does anyone have any idea how I can go about this? I am essentially trying to replicate something like wget -m (or something that simply pulls all text/images out of a frame whose name I supply), but it needs to be done through my browser so that I stay authenticated throughout the process. Then I can go in and parse the HTML in Python or C or whatever I choose.

#2 2011-08-30 04:28:53

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 20,324

Re: Script to "print" ebook from website

Moderator note:

I believe this thread runs afoul of the legality (copyright) provisions detailed in the forum etiquette.

I am closing this thread. If I am not correct, please contact me with information related to your rights to use their material in this manner and I will reopen this thread.

ewaller


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way
