It's in python, run it using:
repository_test.py /etc/pacman.d/unstable
(unstable is just an example)
It'll examine all the servers in the specified file and print the 5 fastest, sorted by access time. The time measured is how long it takes to log in and get a file listing.
Any suggestions/improvements are appreciated.
Enjoy:
#! /usr/bin/python
from ftplib import FTP
import sys
import urllib
import time

# Run cmd() and return the elapsed wall-clock time in seconds.
# A failing server gets a huge time so that it sorts last.
def timeCmd(cmd):
    before = time.time()
    try:
        cmd()
    except KeyboardInterrupt, ki:
        raise ki
    except Exception, e:
        print 'ERROR: ', e
        return 99999999
    return time.time() - before

# Anonymous login, change to the repository directory, fetch the listing.
def talkToServer(server, dir):
    ftp = FTP(server)
    ftp.login()
    ftp.cwd(dir)
    ftp.nlst()

def getFuncToTime(server, dir):
    return lambda : talkToServer(server, dir)

# Split 'ftp://host/path' into (host, path).
def splitUrl(url):
    server = urllib.splittype(url.strip())[1]
    return urllib.splithost(server)

def cmpPairBySecond(p1, p2):
    if p1[1] == p2[1]: return 0
    if p1[1] < p2[1]: return -1
    return 1

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print 'Usage: ', sys.argv[0], ' <pacman-servers-list-file>'
        sys.exit(0)
    fl = open(sys.argv[1], 'r')
    serverToTime = {}
    for ln in fl.readlines():
        # only 'Server = <url>' lines are of interest
        splitted = ln.split('=')
        if splitted[0].strip() != 'Server': continue
        serverUrl = splitted[1]
        # drop the trailing newline
        if serverUrl[-1] == '\n': serverUrl = serverUrl[0:-1]
        splittedUrl = splitUrl(serverUrl)
        print 'Querying: ', splittedUrl[0], '...'
        serverToTime[serverUrl] = timeCmd(getFuncToTime(splittedUrl[0], splittedUrl[1]))
        #print '\t',serverToTime[serverUrl]
    items = serverToTime.items()
    items.sort(cmpPairBySecond)
    print '======================================'
    print 'Servers sorted by time'
    print '======================================'
    for i in items[0:5]:
        print i[0], ': ', i[1]
Is there any advantage to using your script over sortmirrors.pl?
I was not aware of it...
Oops...
Well, it's not very clearly advertised...
I was just wondering if there were any improvements in your script.
I can think of only one improvement - it requires nothing besides Python.
also, sortmirrors.pl has been buggy for a while because of an upstream bug in <I've forgotten the name of the package>.
Dusty
Netselect is the package that is used by sortmirrors (and has some flaws).
yeah yeah, that one. Wonder why my brain misfiled it.
Dusty
drakosha's script connects to the FTP server, while sortmirrors "only" pings them (afaik).
Maybe the Python script can be modified to fetch a file from the FTP server and thereby sort the servers by their actual download speed, which I'd prefer to sorting by ping response times.
It's more than just a ping, it's also doing "ls" on the server, which is supposed to be a long list.
The problem with downloading a file is to know what file to download:
1. it must be there
2. it must not be too big
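One way to soften the second constraint is to stop reading after a fixed number of bytes, so the size of the file matters less. A minimal sketch in Python 3 (the scripts in this thread are Python 2); `measure_throughput`, `max_bytes` and `chunk_size` are made-up names, and the question of which file is guaranteed to exist on every mirror is left open:

```python
import io
import time


def measure_throughput(stream, max_bytes=256 * 1024, chunk_size=8192,
                       clock=time.perf_counter):
    """Read at most max_bytes from a file-like object.

    Returns (bytes_read, bytes_per_second). The stream can be anything
    with a read(n) method, for example the object returned by
    urllib.request.urlopen(url).
    """
    start = clock()
    total = 0
    while total < max_bytes:
        chunk = stream.read(min(chunk_size, max_bytes - total))
        if not chunk:  # end of file reached before the cap
            break
        total += len(chunk)
    elapsed = clock() - start
    # Guard against a zero interval on very fast reads.
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate


# Offline demonstration: an in-memory stream stands in for a download.
fake_download = io.BytesIO(b"x" * 1000)
print(measure_throughput(fake_download, max_bytes=300))
```

Against a real mirror the stream would come from the network, so the rate would also include connection setup unless the clock is started after the connection is open.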
"Wonder why my brain misfiled it."
Because you knew it would be easy to find next time it was needed - I do it all the time, I think it's a subconscious thing.
Sorry - back on topic - I like it. Apart from netselect's bugginess, I never liked the fact that sortmirrors.pl actually sorted the mirrors. I wanted it to do what you're doing, i.e. tell me the best mirrors so I can edit the pacman.d files myself. Your script correctly selected heanet as my fastest mirror for current/extra/community, with varying results after that, but generally within the same group of 6-7.
Cool,
It seems that drakosha's script is a good improvement over the official script. I'm gonna try it at home, let's see if it brings better dl speeds.
I also find that it's a better method than netselect. When I launch sortmirrors, the first server I get is always a strange one:
ERROR: 550 /pub/linux/distributions/archlinux/community/os/i686: No such file or directory.
ERROR: 530 Login incorrect.
Or Error 404...
P.S. I'm really starting to like Python.
I added a comment to the bug report, maybe this will go official.
http://bugs.archlinux.org/task/2952
Nice introduction to the forums, drakosha, a nice meaningful contribution. :-)
Dusty
Hi,
it would have another major advantage if it could connect through a proxy, since people behind a proxy can't use netselect.
I would really appreciate proxy support.
It already has an advantage on some routed systems with NAT, which also don't allow what netselect does.
Regards,
Ford Prefect
First of all, thanks for all the positive feedback - it feels really good to read it.
Second, feel free to add suggestions here; I'll try to implement them.
About the proxy suggestion: is there a standard way to define a proxy? Some environment variables? Some other way?
I was looking over the mirror lists in /etc/pacman.d/* today and couldn't help noticing that most of them (probably all but 1 or 2) are FTP sites. I think establishing a connection over FTP is somewhat slower than over HTTP. Anyhow, I started a discussion on this matter on the mailing list.
At the same time, I discovered that some mirrors might not be up to date. You could replace or supplement the current speed test with something like downloading reponame.db.tar.gz. I don't know if that file holds the last update time, but you could use the file's timestamp to show the age of the repo. A mirror might be the fastest but outdated at the same time. The db file might sometimes be too small to measure speed, so maybe an ls is more appropriate for that.
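The trade-off described here (a mirror can be fast but stale) could be handled by filtering on age first and sorting on time second. A rough sketch in Python 3, assuming the access time and the repo age have already been measured for each mirror; the `MirrorResult` record, its field names, the default threshold and the example.org URLs are all illustrative and not part of the script in this thread:

```python
from collections import namedtuple

# One measurement per mirror: access time in seconds, repo age in days.
MirrorResult = namedtuple("MirrorResult", "url seconds age_days")


def rank_mirrors(results, max_age_days=3.0, limit=5):
    """Drop mirrors older than max_age_days, then return the fastest ones."""
    fresh = [r for r in results if r.age_days <= max_age_days]
    return sorted(fresh, key=lambda r: r.seconds)[:limit]


sample = [
    MirrorResult("ftp://fast-but-old.example.org/current", 0.2, 10.0),
    MirrorResult("ftp://slow-but-fresh.example.org/current", 0.9, 0.5),
    MirrorResult("ftp://middle.example.org/current", 0.4, 1.0),
]
for result in rank_mirrors(sample):
    print(result.url, result.seconds)
```

Filtering before sorting keeps a stale mirror from ever reaching the top of the list, at the cost of picking a somewhat arbitrary age threshold.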
Hi,
the standard way to define proxies on the command line is indeed environment variables. These are ftp_proxy and http_proxy, and they are widely recognized (e.g. by wget, mplayer, ...).
These variables are to be set in URL form. Here are examples from my setup:
http_proxy=http://proxy:8080/
ftp_proxy=http://proxy:8080/
As you can see, it is possible to define an HTTP proxy for FTP access. I guess it would be best to use an FTP library that comes with built-in proxy support.
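For illustration, here is roughly how these variables can be picked up from Python. This is a sketch in Python 3, where the relevant functions live in `urllib.request` (the scripts in this thread are Python 2); `build_proxy_opener` is a made-up helper name, and the environment is set inside the snippet only so that it can be tried without touching the shell:

```python
import os
import urllib.request


def build_proxy_opener(proxies=None):
    """Return an opener that routes requests through the given proxies.

    proxies maps a scheme ("http", "ftp") to a proxy URL. When omitted,
    urllib.request.getproxies() is used, which reads http_proxy,
    ftp_proxy and similar variables from the environment.
    """
    if proxies is None:
        proxies = urllib.request.getproxies()
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


# Same values as in the setup quoted above, set here for demonstration only.
os.environ["http_proxy"] = "http://proxy:8080/"
os.environ["ftp_proxy"] = "http://proxy:8080/"
detected = urllib.request.getproxies()
print(detected["http"], detected["ftp"])
opener = build_proxy_opener(detected)
```

No request is sent here; `opener.open(url)` would be the call that actually goes through the proxy.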
cu
Ford Prefect
I'd just like to point out the obvious contradiction that occurs from a script like this. As more people use it and switch to the faster mirrors, these mirrors will become slower. Just playing devil's advocate.
Very nice script.
Not strictly true -- the download speed from the mirror depends on your location and other factors.
Dusty
After digging through the Python docs, it looks like proxy support is easily doable with urllib2. New version soon.
I also sometimes have problems with mirrors that aren't up to date. Sometimes I get a kernel update but the modules aren't updated yet, or a dependency that won't resolve because a package doesn't exist yet.
Is there a way to find the last update date? Maybe with the package db?
EDIT: I just mean checking that the repository isn't too old (less than 3 days?). Maybe you can just check the last modification date with "ls -l current.db.tar.gz". But can we assume that packages are updated every day?
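As a possible alternative to parsing "ls -l" output, many FTP servers answer the MDTM command with the file's modification time in the form "213 YYYYMMDDHHMMSS" (UTC). The sketch below, in Python 3, only parses such a reply and applies the 3-day limit suggested above. `parse_mdtm_reply` and `is_outdated` are made-up names, and it is an assumption that every mirror supports MDTM and that the db file's timestamp reflects the last sync:

```python
from datetime import datetime, timedelta, timezone


def parse_mdtm_reply(reply):
    """Parse an FTP MDTM reply such as '213 20050614123000' (UTC)."""
    code, _, stamp = reply.strip().partition(" ")
    if code != "213":
        raise ValueError("unexpected MDTM reply: %r" % reply)
    # Drop optional fractional seconds ('.123') before parsing.
    stamp = stamp.split(".")[0]
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_outdated(modified, now=None, max_age=timedelta(days=3)):
    """True if 'modified' lies more than max_age before 'now'."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - modified > max_age


modified = parse_mdtm_reply("213 20050614123000")
print(modified.isoformat(), is_outdated(modified))
```

With ftplib, the reply string could come from `ftp.sendcmd("MDTM current.db.tar.gz")` after changing into the repository directory; servers without MDTM support would raise an error there instead.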
I don't know how to do it. Pacman experts to the rescue?
Changelog:
* added command line arguments
* should honor http/ftp proxies defined via env variables
Make sure it's 1.2 that you use - i updated this post at least twice!
enjoy
#! /usr/bin/python
# ver 1.2
import urllib2
import sys
import time
from optparse import OptionParser

def createOptParser():
    parser = OptionParser()
    parser.add_option("-s", "--server-number", default=5,
                      dest="server_number",
                      help="amount of servers to print, 0 for all")
    parser.add_option("-v", "--verbose",
                      action="store_true", dest="verbose", default=False,
                      help="be verbose")
    return parser

# Run cmd() and return the elapsed wall-clock time in seconds.
# A failing server gets a huge time so that it sorts last.
def timeCmd(cmd):
    before = time.time()
    try:
        cmd()
    except KeyboardInterrupt, ki:
        raise ki
    except Exception, e:
        print '\tERROR: ', e
        return 99999999
    return time.time() - before

# Fetch the repository URL. The default urllib2 opener picks up
# http_proxy / ftp_proxy from the environment.
def talkToServer(serverUrl):
    opener = urllib2.build_opener()
    opener.open(serverUrl).read()

def getFuncToTime(serverUrl):
    return lambda : talkToServer(serverUrl)

def cmpPairBySecond(p1, p2):
    if p1[1] == p2[1]: return 0
    if p1[1] < p2[1]: return -1
    return 1

if __name__ == "__main__":
    parser = createOptParser()
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.print_help()
        sys.exit(0)
    fl = open(args[0], 'r')
    serverToTime = {}
    print 'Querying servers, it might take some time '
    for ln in fl.readlines():
        # only 'Server = <url>' lines are of interest
        splitted = ln.split('=')
        if splitted[0].strip() != 'Server': continue
        serverUrl = splitted[1]
        # drop the trailing newline
        if serverUrl[-1] == '\n': serverUrl = serverUrl[0:-1]
        if not options.verbose: print '*',
        else: print serverUrl, '...',
        #sys.stdout.flush()
        serverToTime[serverUrl] = timeCmd(getFuncToTime(serverUrl))
        if options.verbose: print '\t',serverToTime[serverUrl]
    items = serverToTime.items()
    items.sort(cmpPairBySecond)
    numberOfItemsToShow = int(options.server_number)
    if numberOfItemsToShow == 0: numberOfItemsToShow = len(items)
    if len(items) > 0:
        if not options.verbose: print
        print '======================================'
        print 'Servers sorted by time'
        print '======================================'
        for i in items[0:numberOfItemsToShow]:
            print i[0], ': ', i[1]
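For anyone reading this with only Python 3 at hand, the core steps of the script (pick out the Server lines, time a call, sort by time) might look roughly like this. It is a sketch rather than a port of version 1.2: the names `time_call`, `parse_servers` and `fastest` are made up, the ftp.example.org URL is a placeholder, and the network probe is left out so the snippet can be tried offline:

```python
import time

FAILED = float("inf")  # sorts after every real measurement


def time_call(func, clock=time.perf_counter):
    """Return how long func() takes in seconds, or FAILED on error."""
    start = clock()
    try:
        func()
    except Exception as exc:  # Ctrl-C is not an Exception, so it propagates
        print("\tERROR:", exc)
        return FAILED
    return clock() - start


def parse_servers(lines):
    """Extract URLs from 'Server = <url>' lines of a pacman mirror list."""
    urls = []
    for line in lines:
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Server":
            urls.append(value.strip())
    return urls


def fastest(timings, count=5):
    """timings maps URL -> seconds; count=0 means 'show all'."""
    ranked = sorted(timings.items(), key=lambda pair: pair[1])
    return ranked if count == 0 else ranked[:count]


urls = parse_servers(["[current]\n",
                      "Server = ftp://ftp.example.org/current/os/i686\n"])
# A real probe would fetch the URL; a no-op stands in for it here.
timings = {url: time_call(lambda: None) for url in urls}
for url, seconds in fastest(timings):
    print(url, ":", seconds)
```

A real probe could be passed in as `lambda: urllib.request.urlopen(url).read()`, which also honors the proxy environment variables by default.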