#76 2007-03-29 01:03:06

marxav
Member
From: Gatineau, PQ, Canada
Registered: 2006-09-24
Posts: 386

Re: Throttling ftp.archlinux.org

I only have 6 computers, but my rsync script uses an exclude file to skip the stuff I know I don't need, and I fine-tune it over time.
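An exclude file like that can be handed to rsync with --exclude-from. A minimal sketch, assuming a hypothetical exclude file path, mirror URL, target directory and patterns (none of them come from the post):

```sh
#!/bin/sh
# Sketch only: the mirror URL, target directory and patterns are placeholders.

# Write an example exclude file: one pattern per line, '#' starts a comment.
write_exclude_file() {
    cat > "$1" <<'EOF'
# architectures and images this site does not need
os/x86_64/
iso/
*.iso
EOF
}

# Mirror one repo, skipping everything matched by the exclude file.
# $1 = source URL, $2 = local directory, $3 = exclude file
sync_with_excludes() {
    rsync -rtv --delete --delete-excluded --exclude-from="$3" "$1" "$2"
}

# Example (not run here):
#   write_exclude_file /etc/mirror-exclude.txt
#   sync_with_excludes rsync://mirror.example.org/archlinux/current/ \
#       /srv/mirror/current/ /etc/mirror-exclude.txt
```

Keeping the patterns in a file rather than on the command line makes it easy to tune them over time without touching the script.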

Just my two cents.

#77 2007-03-29 01:16:59

crouse
Arch Linux f@h Team Member
From: Iowa - USA
Registered: 2006-08-19
Posts: 907
Website

Re: Throttling ftp.archlinux.org

OK, I started with the script here: http://wiki.archlinux.org/index.php/Local_Mirror

It uses archlinux.org, so I assume it needs to be updated. ;)
I tried to mirror ftp-linux.cc.gatech.edu/pub/linux/distributions/archlinux/extra/ and current, but I keep getting errors. I'm assuming it's a PEBKAC error on my part. ;)

My original line looked like this:
rsync -avz --delete rsync://rsync.archlinux.org/current/ /var/www/archlinux/mirror/current/ >> /tmp/rsync.log

So I've tried several other ways with ftp-linux.cc.gatech.edu/pub/linux/distributions/archlinux/extra/ and current, with no luck. Any ideas?

#78 2007-03-29 06:19:01

crouse
Arch Linux f@h Team Member
From: Iowa - USA
Registered: 2006-08-19
Posts: 907
Website

Re: Throttling ftp.archlinux.org

I've tried:

mirror.cs.vt.edu
gtlib.gatech.edu
ibiblio.org
ftp.nluug.nl

Either I'm doing something wrong (possible) or rsyncing from these isn't possible. Has ANYONE successfully mirrored a mirror? I suppose I could take my mirror down, but then all 50-100 computers would just hammer some OTHER mirror, which defeats the purpose of running my own mirror: taking the load off the main server and the mirrors. (That, and my private mirror is usually faster when updating.) Ideas?

EDIT: The wiki shows 2 rsyncable mirrors (I missed those on my first look at the wiki; I was looking for USA mirrors to rsync from):
http://wiki.archlinux.org/index.php/Mirrors
ftp://distrib-coffee.ipsl.jussieu.fr/pu … archlinux/ http [rsync://distrib-coffee.ipsl.jussieu.fr/pub/linux/archlinux/ rsync]
Tested with: rsync -vaunt --progress --stats --delete rsync://distrib-coffee.ipsl.jussieu.fr/pub/linux/archlinux/current/* Worked
ftp://mir1.archlinuxfr.org/archlinux http [rsync://mir1.archlinuxfr.org/archlinux rsync]

Are those the only mirrors that non-official mirrors can rsync from?

Cerebral wrote:

A better structure, which I think we're slowly working towards, is the master server feeds out packages to the official mirrors through rsync, and blocks anyone who's not recognized as an official mirror.  Then other people rsync from the geographically closest 'secondary' mirror that supports rsync'ing.

This is a GREAT idea. Now, if we could find more than 2 mirrors that support rsyncing, I'd be a happy camper. As it stands, those 2 mirrors are on the other side of the planet from me. ;) If 500+ rsyncs were just shut off, I wonder how long it will be before those 2 mirrors are overrun by people switching as well.

Last edited by crouse (2007-03-29 07:03:30)

#79 2007-03-29 10:55:37

vanel86
Member
From: Trieste, Italy
Registered: 2007-03-29
Posts: 4

Re: Throttling ftp.archlinux.org

Up till now, all the solutions proposed have been un-KISS and require heavy code knowledge. I think a long-term solution to this issue is to split the pacman configuration into two files:
A. pacman.conf

It contains the enabled standard repositories in numerical format:
1 for current, 2 for extra, 3 for community, 4 for testing, 5 for unstable.
If I want current and extra, I simply type 1,2.

It contains the geographical location of the machine in CONTINENT.NATION format.
If I am in Italy, it will look like EUROPE.ITALY.

It also contains custom repositories in the standard format.

B. mirrors.conf
The list of mirrors (a start can be taken from the wiki), grouped by CONTINENT.NATION, stored locally, also present in each mirror's root, and synced on connection.
For instance, the EUROPE.ITALY entry will be:

EUROPE.ITALY
http://mi.mirror.garr.it/mirrors/archlinux/

On the first pacman start, when it says "have you ever done a sync" or so, it will check the application locale and set the location in pacman.conf after asking "Your language is X, are you in Y?" Y/N. If NO is chosen, the user types in his location in the standard CONTINENT.NATION format, and everybody's happy. I think this is a simpler solution to the problem: we leave ftp.archlinux.org out of the mirror list, and traffic no longer goes there by default for everybody.
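To make the mirrors.conf idea concrete, here is a minimal sketch of the lookup step. It assumes a layout the proposal only hints at: a CONTINENT.NATION header line followed by one mirror URL per line. The function name and the exact file format are hypothetical, and nothing like this exists in pacman.

```sh
#!/bin/sh
# Sketch only: the file layout and the function name are assumptions.

# Print the mirror URLs listed under one CONTINENT.NATION header.
# $1 = path to mirrors.conf, $2 = region such as EUROPE.ITALY
mirrors_for_region() {
    awk -v region="$2" '
        # a header line looks like EUROPE.ITALY; remember whether it is ours
        /^[A-Z_]+\.[A-Z_]+$/ { in_region = ($0 == region); next }
        # inside the wanted region, print every non-empty line (the URLs)
        in_region && NF      { print }
    ' "$1"
}

# Example (not run here):
#   mirrors_for_region /etc/pacman.d/mirrors.conf EUROPE.ITALY
```

With the EUROPE.ITALY entry shown above, the call would print the garr.it URL; an unknown region prints nothing, which a caller could treat as "ask the user again".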

Last edited by vanel86 (2007-03-29 11:36:53)

#80 2007-03-30 02:25:54

print
Member
Registered: 2007-02-27
Posts: 174

Re: Throttling ftp.archlinux.org

especially, a *working* sortmirror's script?  Last I heard, the one distributed with pacman had a tendency to empty some people's pacman.d/ files

Yeah, this happened to me. Shucks.


% whereis whatis whence which whoami whois who

#81 2007-03-30 02:28:21

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108
Website

Re: Throttling ftp.archlinux.org

print wrote:

especially, a *working* sortmirror's script?  Last I heard, the one distributed with pacman had a tendency to empty some people's pacman.d/ files

Yeah, this happened to me. Shucks.

pacman -S pacman should get them back, and bundled with pacman3 is the 'rankmirrors' script which works much better.

#82 2007-04-16 05:22:37

crouse
Arch Linux f@h Team Member
From: Iowa - USA
Registered: 2006-08-19
Posts: 907
Website

Re: Throttling ftp.archlinux.org

OK, now this is just getting annoying. :(

We need a stable place for non-official repos to rsync from, too. I just checked the logs for my rsync job, and my cron job wasn't working as it had been before, so I ran it manually to see if I could find the issue:

sh archrsync.cron-backup3
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)

I wonder if it's just me, or if they shut off rsync on that mirror.

Suggestions? (A more stable mirror?)
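When one mirror starts refusing connections like this, a cron job can fall through to the next candidate instead of failing outright. A minimal sketch, assuming a hypothetical list of source URLs passed in by the caller:

```sh
#!/bin/sh
# Sketch only: the mirror URLs used with this function are placeholders.

# Try each rsync source in turn and stop at the first one that succeeds.
# $1 = local directory, remaining arguments = candidate source URLs
sync_first_available() {
    dest=$1
    shift
    for src in "$@"; do
        # --timeout aborts the transfer if no data moves for 60 seconds
        if rsync -rtv --delete-after --timeout=60 "$src" "$dest"; then
            echo "synced from $src"
            return 0
        fi
        echo "failed: $src, trying next mirror" >&2
    done
    echo "no mirror could be reached" >&2
    return 1
}

# Example (not run here):
#   sync_first_available /var/www/archlinux/mirror/current/ \
#       rsync://mirror-a.example.org/archlinux/current/ \
#       rsync://mirror-b.example.org/archlinux/current/
```

The non-zero return when every mirror fails lets cron mail the error instead of silently leaving the local copy stale.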

#83 2007-04-16 05:30:47

toofishes
Developer
From: Chicago, IL
Registered: 2006-06-06
Posts: 602
Website

Re: Throttling ftp.archlinux.org

crouse wrote:

OK, now this is just getting annoying. :(

We need a stable place for non-official repos to rsync from, too. I just checked the logs for my rsync job, and my cron job wasn't working as it had been before, so I ran it manually to see if I could find the issue:

sh archrsync.cron-backup3
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)
rsync: failed to connect to mirror.cs.vt.edu: Connection refused
rsync error: error in socket IO (code 10) at clientserver.c(83)

I wonder if it's just me, or if they shut off rsync on that mirror.

Suggestions? (A more stable mirror?)

This should be up soon, and looks to be a good mirror.
http://bbs.archlinux.org/viewtopic.php?id=31950

But then again, why are you rsyncing? There are VERY few people that actually should be doing this. Try a network shared pacman cache instead in most cases, look to the wiki for details.

Last edited by toofishes (2007-04-16 05:31:38)

#84 2007-04-16 05:34:49

crouse
Arch Linux f@h Team Member
From: Iowa - USA
Registered: 2006-08-19
Posts: 907
Website

Re: Throttling ftp.archlinux.org

Because I run a private mirror that I use as a repo for my 20+ machines and for some others; I'm guessing between 50 and 100 machines in total. It was my way of taking stress off the other mirrors and the main archlinux server. The archlinux server doesn't allow non-official mirrors, so I had to find another one, and now the vt.edu mirror doesn't appear to allow rsyncing either.

#85 2007-04-16 14:58:48

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622
Website

Re: Throttling ftp.archlinux.org

crouse..

rsync -rtv --progress --delete-after mirrors.easynews.com::mirrors/linux/archlinux/current/os/i686/ /my/local/directory

I use the above to mirror current only. Adjust as needed..


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍

#86 2007-04-16 15:10:52

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879
Website

Re: Throttling ftp.archlinux.org

crouse wrote:

Because I run a private mirror that I use as a repo for my 20+ machines and for some others; I'm guessing between 50 and 100 machines in total. It was my way of taking stress off the other mirrors and the main archlinux server. The archlinux server doesn't allow non-official mirrors, so I had to find another one, and now the vt.edu mirror doesn't appear to allow rsyncing either.

I think the point toofishes was trying to make is this:

If you take your mirror machine and NFS-share its pacman pkg cache, then mount that on all machines, you might have to download packages every so often, but the net result will be less downloading, less disk space used, and easier downgrades if you ever want one.

For anything other than hundreds of machines that run the entire set of Arch packages, it's probably better to use a shared package cache and download on demand than to download everything. Just to point out one thing: how many machines use kde, amarok, or vlc? Because you're downloading all of these every time, regardless of usage.
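As a rough illustration of the shared cache, the two helper functions below print the configuration lines involved. This is a sketch under assumptions: the host name, the client network and the mount options are placeholders, and /var/cache/pacman/pkg is taken to be pacman's cache directory.

```sh
#!/bin/sh
# Sketch only: host name, network and mount options are placeholders.

CACHE_DIR=/var/cache/pacman/pkg   # pacman's package cache

# Line for /etc/exports on the machine that holds the cache.
# $1 = client network, e.g. 192.168.1.0/24
exports_line() {
    # no_root_squash lets pacman, running as root on a client, write
    # newly downloaded packages; weigh that against your security needs
    printf '%s %s(rw,sync,no_root_squash)\n' "$CACHE_DIR" "$1"
}

# Line for /etc/fstab on each client.
# $1 = host name of the machine that exports the cache
fstab_line() {
    printf '%s:%s %s nfs defaults 0 0\n' "$1" "$CACHE_DIR" "$CACHE_DIR"
}

# Example (not run here):
#   exports_line 192.168.1.0/24
#   fstab_line cachehost
```

Each client then downloads only what it actually installs, and a package fetched by one machine is already in the cache for the next.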

#87 2007-04-16 17:05:10

crouse
Arch Linux f@h Team Member
From: Iowa - USA
Registered: 2006-08-19
Posts: 907
Website

Re: Throttling ftp.archlinux.org

I understood perfectly what he meant. I'm not sure everyone gets where I'm coming from.

I run several other websites on a dedicated server (RHEL, with no easy way to change that). I have approx. 4,000 members on the different sites. Our server has a 2,500 GB per month limit. This server now mirrors the repos current, extra and community. I allow some of the members of the sites access to this mirror (faster download speeds); these people are all over the world, not just on my local networks.

I have no idea, phrakture, which machine uses what. Again, my mirror serves people all over the world, not just my own local network. I have not checked any logs to see what people are using; I just mirror all of it.

rsync -avz --delete

only downloads the CHANGED files. Yes, I'm aware that some packages may never get used. I would assume that the 50-100 machines my mirror serves more than make up for the rsyncs (unless there are a bunch of changes in one day). Eventually it would be nice to get a few more servers and open them up to the public, but right now it's not feasible because of the price of bandwidth. I have a 2,500 GB limit per month, and if you recall the forum post from a month or two ago, someone told me that their mirror was sending out about 4,000-5,000 GB per month, so I don't have enough dedicated bandwidth... yet. ;)

I emailed Dale, he found me another mirror to use, it seems to work ok for now.

Thanks.

#88 2007-04-16 18:47:00

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622
Website

Re: Throttling ftp.archlinux.org

crouse wrote:
rsync -avz --delete

only downloads the CHANGED files using rsync,

Yes, and in the case of a package repository it isn't as efficient as it could be.

Let's say the repo contains a package called foo-2.1.1-1.pkg.tar.gz
You rsync that, and it mirrors. Yay.
A new version is pushed: foo-2.1.2-1.pkg.tar.gz

That is a new filename. With default options, rsync doesn't know that it relates to the earlier file, so it just downloads the entire file. Then it deletes the old version.

To get rsync to pull only the differences (like xdelta) and to guess which older file might be related (like checking for a difference of only one character in the name or something), you need to use --fuzzy... and something else. I can't remember what exactly. Anyway...

rsync ran into this problem with the Debian repos and put out some patches to help address it.
--fuzzy requires a newer version of rsync that many of the public rsync mirrors do not support (I think it requires something like protocol version 15 or 17 or something).
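The "something else" is most likely the delete timing: the rsync man page warns that --delete can remove the old file before --fuzzy gets to use it as a basis, and suggests --delete-after. A minimal sketch, assuming a hypothetical source URL and destination directory:

```sh
#!/bin/sh
# Sketch only: source URL and destination directory are placeholders.

# Mirror a package directory while letting rsync reuse an older, similarly
# named package (foo-2.1.1-1 for foo-2.1.2-1) as the basis for the delta.
# $1 = source URL, $2 = local directory
fuzzy_sync() {
    # --fuzzy        look in the destination directory for a similar file
    # --delete-after remove stale files only once the transfer is done,
    #                so the old package still exists when --fuzzy needs it
    rsync -rtv --fuzzy --delete-after "$1" "$2"
}

# Example (not run here):
#   fuzzy_sync rsync://mirror.example.org/archlinux/current/os/i686/ /my/local/directory
```

How much this saves on gzip-compressed packages is a separate question, since a small change in the contents can alter much of the compressed stream.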

Just an FYI...


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍

#89 2007-04-23 17:18:41

Tapi
Member
From: France
Registered: 2007-02-12
Posts: 5

Re: Throttling ftp.archlinux.org

Hi,
I am the admin of mir1.archlinuxfr.org (which is an official mirror).
I wanted to ask: can official mirrors increase the frequency of their rsyncs?
Since the mirror traffic is now more than twice what it was before the throttling, I think it would be good for users if official mirrors synchronized more often, to stay as up to date as possible.
This mirror currently synchronizes every 6 hours; how often do you think it could be rsynced?
Thanks in advance

Note for European users: this mirror (located in France, near Paris) has a 100mb connection and is currently used at about 3-4% of its capacity. It supports HTTP, FTP and RSYNC. People who use it report it to be fast, so if the mirror you're using is slow, give it a try... It's free.

Note for the archlinux.org webmaster: I think it would be useful for users if the mirror page (http://www.archlinux.org/download/) listed which protocols each mirror supports. Something like:
    ftp.archlinux.org      United States   FTP HTTP
    mir1.archlinuxfr.org   France          FTP HTTP RSYNC
or whatever... Of course, it's just an idea. What do you think?

Last edited by Tapi (2007-04-25 18:46:23)

#90 2007-04-23 18:42:33

toofishes
Developer
From: Chicago, IL
Registered: 2006-06-06
Posts: 602
Website

Re: Throttling ftp.archlinux.org

Tapi wrote:

Hi,
I am the admin of mir1.archlinuxfr.org (which is an official mirror).
I wanted to ask: can official mirrors increase the frequency of their rsyncs?
Since the mirror traffic is now more than twice what it was before the throttling, I think it would be good for users if official mirrors synchronized more often, to stay as up to date as possible.
This mirror currently synchronizes every 6 hours; how often do you think it could be rsynced?
Thanks in advance

Note for European users: this mirror (located in France, near Paris) has a 100mb connection and is currently used at about 3-4% of its capacity. It supports HTTP, FTP and RSYNC. People using it report it to be fast, so if the mirror you're using is slow, give it a try... It's free.

Note for the archlinux.org webmaster: I think it would be useful for users if the mirror page (http://www.archlinux.org/download/) listed which protocols each mirror supports. Something like:
    ftp.archlinux.org      United States   FTP HTTP
    mir1.archlinuxfr.org   France          FTP HTTP RSYNC
or whatever... Of course, it's just an idea. What do you think?

I would say 6 hours is plenty fast and will keep you quite in sync.

Can you post a feature request on Flyspray for your second thought (protocols on the web page)? I think that is a good idea too, although collecting the data will be the hard part.

#91 2007-04-25 10:20:34

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Re: Throttling ftp.archlinux.org

Tapi wrote:

Hi,
I am the admin of mir1.archlinuxfr.org (which is an official mirror).
I wanted to ask: can official mirrors increase the frequency of their rsyncs?
Since the mirror traffic is now more than twice what it was before the throttling, I think it would be good for users if official mirrors synchronized more often, to stay as up to date as possible.
This mirror currently synchronizes every 6 hours; how often do you think it could be rsynced?

Thanks for this mirror, it's great: very fast and up to date. :)
I'm not sure how often the other mirrors sync, and it also depends on how busy the main server is, but I would say that a sync every 6 hours is already a lot, so I don't think more is necessary.
The French Debian mirrors synced only once per day, and that's plenty, since I also generally upgrade once per day.


pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))

#92 2007-04-25 11:35:51

hussam
Member
Registered: 2006-03-26
Posts: 572
Website

Re: Throttling ftp.archlinux.org

I use this method to rsync current and extra.

/etc/cron.hourly/sync

#!/bin/sh

SYNCLOCKFILE="/var/lock/reposync.lock"

if [ -f "$SYNCLOCKFILE" ]; then
    # lock file already present, bail
    echo "Synchronize job is already running..."
    exit 1
fi
cd /var/cache/mirror || exit 1
touch "$SYNCLOCKFILE"
/usr/bin/reposync.sh
rm -f "$SYNCLOCKFILE"

/usr/bin/reposync.sh

#!/bin/sh
cd /var/cache/mirror/ || exit 1
# no trailing slash here, so the paths below don't end up with "//"
arch_mirror=rsync://mirror.pacific.net.au/archlinux
# -P already implies --progress, so it is not repeated
rsync -aPvz --delete --exclude=os/x86_64 --exclude=iso --delete-excluded "$arch_mirror/current/" /var/cache/mirror/current/
rsync -aPvz --delete --exclude=os/x86_64 --delete-excluded "$arch_mirror/extra/" /var/cache/mirror/extra/
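One possible refinement of the lock handling, offered as a sketch rather than as the method used above: checking for a lock file and then touching it leaves a small window in which two jobs can both pass the check. Using mkdir as the lock closes that window, because creating the directory and testing for it happen in one step. The lock path and the function name are hypothetical.

```sh
#!/bin/sh
# Sketch only: lock path and the wrapped command are placeholders.

# Run a command while holding a lock directory. mkdir either creates the
# directory or fails in a single step, so two jobs cannot both get the lock.
# $1 = lock directory, remaining arguments = command to run
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Synchronize job is already running..." >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Example (not run here):
#   run_locked /var/lock/reposync.lock.d /usr/bin/reposync.sh
```

As with the lock file, a lock left behind by a killed job still has to be removed by hand.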

#93 2007-04-25 18:51:17

Tapi
Member
From: France
Registered: 2007-02-12
Posts: 5

Re: Throttling ftp.archlinux.org

shining wrote:

Thanks for this mirror, it's great: very fast and up to date. :)

I'm glad to see that you find it useful... ;)

toofishes wrote:

Can you post a feature request on Flyspray for your second thought (protocols on the web page)? I think that is a good idea too, although collecting the data will be the hard part.

I just filed a feature request on Flyspray; let's hope it will be taken into account soon:
http://bugs.archlinux.org/task/7006
