
#26 2010-07-22 13:30:41

skodabenz
Banned
From: Tamilnadu, India
Registered: 2010-04-11
Posts: 382

Re: Downloadable Arch Wiki

Is there any way to modify the PKGBUILD so that it fetches only English (or any one particular language)? I live in India, and it is useless for me to download all the European and other-language pages, which make up a considerable share of the download time, using the PKGBUILD.


My new forum user/nick name is "the.ridikulus.rat" .

Offline

#27 2010-07-22 13:34:22

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

Are we talking about the http://www.archlinux.org/packages/commu … wiki-docs/ package?
usr/share/doc/arch-wiki/html/00002302.html -> what language is that?

You need to modify some other files
http://repos.archlinux.org/wsvn/communi … ocs/trunk/

Last edited by karol (2010-07-22 13:36:35)

Offline

#28 2010-07-22 13:54:54

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

Does anybody know why those pages aren't wget'ed with '--convert-links' for offline viewing?

Offline

#29 2010-07-22 14:17:45

skodabenz
Banned
From: Tamilnadu, India
Registered: 2010-04-11
Posts: 382

Re: Downloadable Arch Wiki

karol wrote:

Are we talking about the http://www.archlinux.org/packages/commu … wiki-docs/ package?
usr/share/doc/arch-wiki/html/00002302.html -> what language is that?

You need to modify some other files
http://repos.archlinux.org/wsvn/communi … ocs/trunk/

I am talking about pages like these

/usr/share/doc/arch-wiki/html/00009596.html
/usr/share/doc/arch-wiki/html/00007888.html
/usr/share/doc/arch-wiki/html/00004796.html

etc.

These pages are from the community/arch-wiki-docs 20100621-1 package.


My new forum user/nick name is "the.ridikulus.rat" .

Offline

#30 2010-07-22 14:20:32

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

Try looking inside the <title> tags for parentheses

<title>ACL (Русский) - ArchWiki</title>

and remove those pages that have one (well, two actually :P).

Edit: Actually at least some pages have '(en)' so you may want to keep them.
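
A minimal sketch of that title check in Python (not from the thread). It assumes every page title ends in " - ArchWiki" as in the example above, and that '(en)' is the only tag worth keeping; the function names and the KEEP_TAGS set are hypothetical.

```python
import re
from html.parser import HTMLParser
from pathlib import Path

# Assumption: only the '(en)' tag marks a page that should be kept.
KEEP_TAGS = {"en"}

# Matches a trailing "(Tag) - ArchWiki" and captures the tag.
_TAG_RE = re.compile(r"\(([^()]+)\)\s*-\s*ArchWiki\s*$")


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html_text):
    """Return the stripped <title> text of an HTML document."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


def language_tag(title):
    """Return the parenthesised tag before ' - ArchWiki', or None."""
    match = _TAG_RE.search(title)
    return match.group(1) if match else None


def is_english(title):
    """Treat untagged pages and pages tagged '(en)' as English."""
    tag = language_tag(title)
    return tag is None or tag.lower() in KEEP_TAGS


def foreign_pages(directory):
    """Yield the paths of pages whose title carries a non-English tag."""
    for path in sorted(Path(directory).glob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not is_english(page_title(text)):
            yield path
```

Running foreign_pages("/usr/share/doc/arch-wiki/html") would only list candidates; it is safer to review that list before deleting anything, since a few English pages have parentheses in their titles for other reasons.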

Last edited by karol (2010-07-22 14:25:48)

Offline

#31 2010-07-22 14:33:28

skodabenz
Banned
From: Tamilnadu, India
Registered: 2010-04-11
Posts: 382

Re: Downloadable Arch Wiki

How about adding commands in the PKGBUILD (or other relevant files) to prevent makepkg from downloading these pages in the first place? Something like language detection of a page based on its title before downloading.

Or maybe ArchWiki should have a list of pages (along with the links) arranged by language, so that arch-wiki-docs first downloads the list and then downloads the actual pages.


My new forum user/nick name is "the.ridikulus.rat" .

Offline

#32 2010-07-22 14:37:34

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

skodabenz wrote:

How about adding commands in the PKGBUILD (or other relevant files) to prevent makepkg from downloading these pages in the first place? Something like language detection of a page based on its title before downloading.

I'm not sure whether grabbing the pages yourself would hurt the server, so be sure to get permission before you do it.

skodabenz wrote:

Or maybe ArchWiki should have a list of pages (along with the links) arranged by language, so that arch-wiki-docs first downloads the list and then downloads the actual pages.

Those things are in the works, and there's also http://bugs.archlinux.org/task/17580

Last edited by karol (2010-07-22 14:38:42)

Offline

#33 2010-07-22 14:41:57

skodabenz
Banned
From: Tamilnadu, India
Registered: 2010-04-11
Posts: 382

Re: Downloadable Arch Wiki

Maybe I have to wait. Thanks for the link.


My new forum user/nick name is "the.ridikulus.rat" .

Offline

#34 2010-07-22 14:53:54

fturco
Member
Registered: 2010-07-12
Posts: 40

Re: Downloadable Arch Wiki

karol wrote:

@ fturco
If you're serious about it, you may want to post your proposal on the ML - that's where all the cool Arch High Priests hang out.

I sent an e-mail to arch-general@archlinux.org. Thank you for the hint.

Offline

#35 2010-08-06 07:47:46

canolucas
Member
Registered: 2010-05-23
Posts: 52

Re: Downloadable Arch Wiki

nixpunk wrote:
milomouse wrote:

I wonder how often the official package is updated to reflect wiki changes.

http://www.archlinux.org/packages/commu … wiki-docs/
Last Updated:      2010-04-14

A script to update it every month sounds like a good idea here. I think that won't hurt bandwidth, and it would keep the wiki docs at most one month old.

Last edited by canolucas (2010-08-06 07:49:03)

Offline

#36 2010-08-06 08:00:13

Pierre
Developer
From: Bonn
Registered: 2004-07-05
Posts: 1,964
Website

Re: Downloadable Arch Wiki

I'll just quote the mail I sent to fturco. Maybe this will show you guys some possible options.

So, after our conversation on irc I had a quick look at that dumpBackup
script that comes with MediaWiki. Using the --current switch it's quite
fast and the result is more or less small. So I think I could create
those snapshots (without history) regularly. I made a sample snapshot at
https://users.archlinux.de/~pierre/tmp/ … ent.xml.xz (5.8MB)

The challenge is now if those dumps are of any use. That's where you or
others have to start their research. E.g. have a look at
https://launchpad.net/wikipediadumpreader etc.

Greetings,

Pierre

Offline

#37 2010-08-06 11:10:20

canolucas
Member
Registered: 2010-05-23
Posts: 52

Re: Downloadable Arch Wiki

If it is in XML, then it is easy to parse it into whatever you want using Perl's XML parser (or maybe another tool). You can easily generate HTML or plain text.
Another option is to have ArchWiki's XML plus something like wikipediadumpreader to read the XML live, but somehow that sounds like bloat to me. Maybe something ncurses-based to read from the console!

PS: very interesting challenge! ;)
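
As a rough illustration of the "parse the XML and generate HTML" idea, here is a sketch in Python rather than Perl (a swapped-in language, not what the post proposes). It assumes the usual MediaWiki export layout of <page> elements containing a <title> and a <revision>/<text>; the namespace URI in the sample is an assumption and the code ignores it anyway. The function names are hypothetical, and the wikitext is only wrapped in <pre>, not rendered.

```python
import html
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) for every <page> in a MediaWiki XML dump.

    `source` is a filename or a binary file object. Pages are streamed,
    so the whole dump never has to be held in memory at once.
    """
    title, text = None, None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # free the children of the finished page


def page_to_html(title, wikitext):
    """Wrap the raw wikitext in a bare HTML page (no wiki markup rendering)."""
    t = html.escape(title)
    return (
        f"<html><head><title>{t}</title></head>"
        f"<body><h1>{t}</h1><pre>{html.escape(wikitext)}</pre></body></html>"
    )


if __name__ == "__main__":
    # Tiny made-up dump, only to show the call pattern.
    sample = io.BytesIO(
        b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">'
        b"<page><title>ACL</title>"
        b"<revision><text>Access control lists.</text></revision></page>"
        b"</mediawiki>"
    )
    for page_title_, page_text in iter_pages(sample):
        print(page_to_html(page_title_, page_text))
```

For an .xz snapshot, passing lzma.open("dump.xml.xz") (a placeholder filename) to iter_pages should work the same way, since it returns a binary file object.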

Last edited by canolucas (2010-08-06 11:52:02)

Offline

#38 2010-08-06 11:59:21

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

@ Pierre
I'm unable to download the wikidump w/ wget, but it works if I use Firefox's built-in downloader.
Just curious what went wrong: do you block wget, or do I need to set the user agent?

Offline

#39 2010-08-06 13:21:54

Pierre
Developer
From: Bonn
Registered: 2004-07-05
Posts: 1,964
Website

Re: Downloadable Arch Wiki

@karol: No, wget is broken. It cannot handle wildcard certs. Disable that check or use something like curl.

On-topic: Also have a look at http://meta.wikimedia.org/wiki/Data_dumps. From my side I could create such a dump regularly. I could even maintain a simple package so we get it distributed over the mirrors.

Now it's up to you to work on a working concept. :-)

Offline

#40 2010-08-22 15:14:59

Spip
Member
From: USA
Registered: 2010-07-29
Posts: 28

Re: Downloadable Arch Wiki

Hi,

I read some comments here from people who want snapshots in a specific language.

It is possible now :) thanks to ArchDocumentalist.

I wrote a perl script (based on the idea of arch-wiki-docs package) for that.
The git repository is here: http://github.com/sciunto/archdocumentalist/
The aur package there: http://aur.archlinux.org/packages.php?ID=40125

The AUR package installs the script in /usr/bin, so you can generate documentation like this:
archdocumentalist.pl EN /tmp
or
archdocumentalist.pl FR /tmp #French

/tmp/arch-wiki-EN
/tmp/arch-wiki-FR
are created with an index.html and pages in English/French.

Of course, if you want another language, just change the command.


I will soon contact Aaron Griffin and Sergej Pupykin (they both contribute to arch-wiki-docs: http://www.archlinux.org/packages/commu … iki-docs/) to suggest this new approach.

Advantages:
fewer requests and less bandwidth
less disk usage
it is faster to generate snapshots (for users who select few languages...)

One small issue remains. Some pages have parentheses in their titles on the wiki, and these are interpreted as a language. But I think only about ten pages are affected, and they could be corrected easily. Another way is to do some acrobatic checks in the script, but I'm not convinced that would be useful.
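
One possible form of such a check, sketched in Python (the script itself is Perl, and this is not taken from it): accept a trailing parenthetical as a language only if it appears in a whitelist. The KNOWN_LANGUAGES set and the split_language name are hypothetical; the real list would have to come from the tags the wiki actually uses.

```python
import re

# Hypothetical whitelist; replace with the language tags used on the wiki.
KNOWN_LANGUAGES = {"en", "English", "Français", "Русский", "Español", "Deutsch"}

# A title followed by one trailing "(...)" group without nested parentheses.
_SUFFIX_RE = re.compile(r"^(?P<name>.*\S)\s*\((?P<tag>[^()]+)\)$")


def split_language(title):
    """Return (base_title, language).

    language is None when the title has no trailing parenthetical or when
    the parenthetical is not a known language, in which case the title is
    returned unchanged.
    """
    match = _SUFFIX_RE.match(title)
    if match and match.group("tag") in KNOWN_LANGUAGES:
        return match.group("name"), match.group("tag")
    return title, None
```

With this whitelist, "ACL (Русский)" splits into the base title and its language, while a made-up title such as "Fonts (Type 1)" is left alone and treated as an ordinary page.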

It works fine at home, but if you encounter a bug or have an idea, feel free to open a ticket. Patches are accepted :P

Last edited by Spip (2010-08-23 09:09:45)

Offline

#41 2010-08-23 09:18:48

Spip
Member
From: USA
Registered: 2010-07-29
Posts: 28

Re: Downloadable Arch Wiki

The usage has been slightly changed. The previous post is up to date. ;)

Offline

#42 2010-08-23 09:48:21

Pierre
Developer
From: Bonn
Registered: 2004-07-05
Posts: 1,964
Website

Re: Downloadable Arch Wiki

I would prefer using the dumps directly instead of crawling the wiki. In addition, translated wiki pages are meant to live in their own local wikis, connected by interwiki links.

Offline

#43 2010-08-23 11:00:32

Spip
Member
From: USA
Registered: 2010-07-29
Posts: 28

Re: Downloadable Arch Wiki

Pierre wrote:

I would prefer using the dumps directly instead of crawling the wiki.

Yep, but something similar to http://www.archlinux.org/packages/commu … wiki-docs/ could be done (i.e. replace the previous scripts with archdocumentalist) to make one package per language (arch-wiki-docs-en...). If needed, I can add an option to generate a complete snapshot again (in case you want to keep arch-wiki-docs), so that only one code base has to be maintained.

Last edited by Spip (2010-08-23 11:01:29)

Offline

#44 2010-09-28 18:32:04

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

Can someone please check if this script is safe for the wiki server (gudrun?)? I would suggest running it periodically for each language and uploading the result somewhere in order to minimize the load on the server. This approach would be as fast and painless as the original downloadable wikis, which take seconds to land on my disk, not ... half an hour.

If the pages are fetched uncompressed (I don't know, I haven't looked at the code), the bandwidth needed for the new approach is over 30 MB for the English language alone vs. only 7.5 MB for the whole wiki if we do it the old way (because the arch-wiki-docs package you download is compressed).

Offline

#45 2010-09-28 18:38:29

Pierre
Developer
From: Bonn
Registered: 2004-07-05
Posts: 1,964
Website

Re: Downloadable Arch Wiki

I can only repeat my post from above. So please don't use this script to download the whole wiki; it's also broken by design, as the wiki should be English only. I'd be happy to provide regular snapshots.

Offline

#46 2010-09-28 18:47:16

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

Pierre wrote:

I can only repeat my post from above. So please don't use this script to download the whole wiki; it's also broken by design, as the wiki should be English only. I'd be happy to provide regular snapshots.

I only posted because a month passed and there was no official reaction.
I can see a difference in stylesheets (the new approach keeps the Arch wiki look) but I don't really care.

Should the script be removed from AUR?

Offline

#47 2010-09-28 20:04:33

Spip
Member
From: USA
Registered: 2010-07-29
Posts: 28

Re: Downloadable Arch Wiki

Hi,

Why should the wiki be English only? Should we delete all translated pages?

I wrote this script because some users suggested it. I have not yet contacted the original script maintainers since I have received no feedback. The objective of archdocumentalist is to replace the old script, that is to say, to provide snapshots for users.

Moreover, I wrote some lines (not yet pushed to my git repository) to download a raw (MediaWiki) version of the wiki. Someone asked for it, and I think it is a reasonable idea, since a contributor could want a copy of their work...

I don't know why you are so reluctant about my work, Pierre. :( If you think it can be improved, I am open to your patches.

Offline

#48 2010-09-28 20:13:40

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

I think non-English speaking users are encouraged to start their own wiki and to migrate the contents from the main wiki, just as is the case with the forums. We really have to clean this up one way or another, as getting half of the search results in the wiki in a foreign language is not helpful.

No one can really tell what you have written there if no one else speaks your language. Outdated articles in, e.g., Spanish or Chinese aren't helpful either, and yet we're somehow responsible for the confusion they may create.

Offline

#49 2010-09-28 23:58:21

CPUnltd
Member
From: Milwaukee, WI
Registered: 2009-12-05
Posts: 483
Website

Re: Downloadable Arch Wiki

karol does pose a good point... I vote for whatever method is easiest and most efficient/accurate/convenient to the end user (as I have a vested interest, being an end user and all)... this may require compromise on all parts, but I think a discussion should be had on this specifically because this is a VERY valuable tool and I'd hate to see it disappear due to disagreements that couldn't be settled...


Help grow the dev population... have your tech trained and certified!

Offline

#50 2010-09-29 00:19:57

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: Downloadable Arch Wiki

We need some way to conveniently access the information.
http://aur.archlinux.org/packages.php?ID=29614 may be overkill, and so is putting all the info on one big page.

Offline
