
#1 2005-04-27 05:45:37

aikidoist72
Member
From: Australia
Registered: 2005-04-15
Posts: 63

Wiki html page

I would like to download the entire Wiki index and its associated pages to a depth of 3, so I can put them on a CD and install Arch on a remote computer.  I have the current and extra caches synced and ready to go.  I tried this using wget with all the correct parameters, but 3 1/2 hours later I stopped it, at well over 200 MB!  Is there a Wiki package somewhere that I can download instead?


Sitting quietly
Doing nothing
The grass grows
And the flowers bloom
All by themselves


#2 2005-04-27 06:46:12

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,615

Re: Wiki html page

I have a wiki extractor that *kinduv* works. The HTML it generates is a bit wonky, as I haven't fully snagged the phpwiki parser out of a more recent version of phpwiki. The one I snagged it out of was 1.2x or something.

It is a python/php hybrid script, and it takes a while to run. I have a generated html dir if you would like some pages snapshotted from a few months ago, when Judd sent me a snapshot of the wiki gdbm.

Like I said, the pages are a bit ugly, but it is better than a poke in the eye..

In total, I think it is about 2.8 MB worth.


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍


#3 2005-04-27 10:33:10

aikidoist72
Member
From: Australia
Registered: 2005-04-15
Posts: 63

Re: Wiki html page

G'day Cactus,

Thanks mate, I will take up your offer if the command below fails miserably.

wget -r -l 2 -k -p http://wiki2.archlinux.org/index.php/WikiIndex

I have downloaded sections of websites before, but I am struggling to understand why this is different.  Does the .php have something to do with it?  Just out of interest, how do we organise the transfer without posting my email address?  Sorry - still a noob!

Cheers



#4 2005-04-27 18:37:49

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,615

Re: Wiki html page

I would just PM you the location of the file on my web server.

Anyway, I assume that the problem is twofold. The first fold is that many wiki pages link offsite. Not too bad, you just tell wget not to go offsite.
The real problem is that a wikiword that is not defined is still a link. It links to a page saying (please define me). On that page are several self links, all going to "please define me" pages.

I bet it would snag quite a few "please define me" pages.
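
The two problems above can be sketched as a link filter. This is only an illustration of the idea, not what wget does internally: the host name is taken from the wget command earlier in the thread, while the marker text "please define me" and the function names are assumptions made up for the example (the real placeholder page may be worded differently).

```python
from urllib.parse import urlparse

# Host taken from the wget command earlier in the thread.
WIKI_HOST = "wiki2.archlinux.org"

# Assumption: hypothetical text that identifies an undefined-page placeholder.
UNDEFINED_MARKER = "please define me"


def is_onsite(url, host=WIKI_HOST):
    """Return True if an absolute URL points at the wiki host."""
    # Relative URLs have no hostname and are treated as not onsite here,
    # so this sketch expects links to be made absolute first.
    return urlparse(url).hostname == host


def is_placeholder(page_text, marker=UNDEFINED_MARKER):
    """Return True if the page looks like an undefined-wikiword placeholder."""
    return marker in page_text.lower()


def links_to_follow(page_text, links, host=WIKI_HOST):
    """Pick the links worth crawling from one downloaded page."""
    # Placeholder pages only lead to more placeholder pages, so stop there.
    if is_placeholder(page_text):
        return []
    # Otherwise keep only links that stay on the wiki host.
    return [url for url in links if is_onsite(url, host)]
```

A crawler built around a filter like this would stop at the first placeholder page instead of following its self links.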



#5 2005-04-27 23:01:21

aikidoist72
Member
From: Australia
Registered: 2005-04-15
Posts: 63

Re: Wiki html page

Ok, the command above has worked to an extent.  I have a wiki folder that is 19.6 MB and 1976 files.  I tried doing a search on "cactus" and was sent to the Archlinux website.  All searches send me to the website.

This is a bummer, but on the whole I am quite happy with the outcome.  I can keep clicking through linked pages within the wiki, and have all the info I will require.
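
One possible explanation for the search behaviour (an assumption, not something confirmed in this thread) is that search is answered by a script on the server, so a static copy has nothing to run the query and the form still points at the live site. A minimal sketch for checking a saved page: it lists link and form targets that are still absolute http(s) URLs. The class name, function name, and sample URLs are hypothetical.

```python
from html.parser import HTMLParser


class RemoteTargetFinder(HTMLParser):
    """Collect <a href> and <form action> values that are absolute URLs."""

    # Which attribute holds the target for each tag we care about.
    TARGET_ATTRS = {"a": "href", "form": "action"}

    def __init__(self):
        super().__init__()
        self.remote_targets = []

    def handle_starttag(self, tag, attrs):
        wanted = self.TARGET_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            # value can be None for attributes written without a value.
            if name == wanted and value and value.startswith(("http://", "https://")):
                self.remote_targets.append(value)


def find_remote_targets(html_text):
    """Return the targets in html_text that still point at a remote site."""
    finder = RemoteTargetFinder()
    finder.feed(html_text)
    return finder.remote_targets
```

Running this over each file in the downloaded folder would show which pages still depend on the live site.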

Thanks for your help cactus.  I am not a whizz with web pages, but if I can refine this lot to a reasonable size, I will zip it and send it to you for your site if you wish.


Cheers



#6 2005-04-27 23:14:27

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,615

Re: Wiki html page

Sounds good. Thanks.

