I'm writing a PKGBUILD, but I cannot download this file with curl or wget, although it works with any web browser:
http://www.soronline.net/downloads/SorR50a.zip
curl gives me:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://soronline.net">here</a>.</p>
<hr>
<address>Apache Server at www.soronline.net Port 80</address>
</body></html>
If I follow the redirect, I end up downloading the main page, as you can see.
I've also tried setting the same user agent as my web browser; it did not help.
The trace:
== Info: Hostname was NOT found in DNS cache
== Info: Trying 192.254.233.89...
== Info: Connected to www.soronline.net (192.254.233.89) port 80 (#0)
=> Send header, 246 bytes (0xf6)
0000: GET /downloads/SORRv5.zip HTTP/1.1
0024: User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.6+ (
0064: KHTML, like Gecko) Chromium/23.0.1271.95 Chrome/23.0.1271.95 Saf
00a4: ari/537.6+ dwb/commit 2014-01-03 51681f6
00ce: Host: www.soronline.net
00e7: Accept: */*
00f4:
<= Recv header, 20 bytes (0x14)
0000: HTTP/1.1 302 Found
== Info: Server nginx/1.4.4 is not blacklisted
<= Recv header, 21 bytes (0x15)
0000: Server: nginx/1.4.4
<= Recv header, 37 bytes (0x25)
0000: Date: Thu, 06 Feb 2014 10:18:17 GMT
<= Recv header, 45 bytes (0x2d)
0000: Content-Type: text/html; charset=iso-8859-1
<= Recv header, 21 bytes (0x15)
0000: Content-Length: 271
<= Recv header, 24 bytes (0x18)
0000: Connection: keep-alive
<= Recv header, 32 bytes (0x20)
0000: Location: http://soronline.net
<= Recv header, 2 bytes (0x2)
0000:
<= Recv data, 271 bytes (0x10f)
0000: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">.<html><head>.
0040: <title>302 Found</title>.</head><body>.<h1>Found</h1>.<p>The doc
0080: ument has moved <a href="http://soronline.net">here</a>.</p>.<hr
00c0: >.<address>Apache Server at www.soronline.net Port 80</address>.
0100: </body></html>.
== Info: Connection #0 to host www.soronline.net left intact
On this website, only .zip files fail this way; everything works for other extensions.
Any clue?
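For anyone reproducing this, here is a small sketch for boiling a header dump like the trace above down to the two fields that matter: the status code and the Location target. The function name summarize_redirect is made up for illustration, and it assumes the headers arrive on stdin as raw text.

```sh
#!/bin/sh
# Print the status code and the Location target found in a raw HTTP
# header dump read from stdin. summarize_redirect is a hypothetical helper.
summarize_redirect() {
    # HTTP header lines end in CRLF, so drop the carriage returns first.
    tr -d '\r' | awk '
        /^HTTP\// { status = $2 }                   # e.g. "HTTP/1.1 302 Found"
        tolower($1) == "location:" { target = $2 }  # header names are case-insensitive
        END {
            if (target == "") target = "-"          # no redirect target seen
            printf "%s %s\n", status, target
        }
    '
}
```

It could be fed from something like `curl -sS -D - -o /dev/null http://www.soronline.net/downloads/SorR50a.zip | summarize_redirect`. If several responses are present in the dump, only the last status and the last Location are reported.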
Last edited by Ambrevar (2014-02-07 19:22:38)
Offline
Deleted
Last edited by skunktrader (2014-02-06 10:44:12)
Offline
Are you sure it works with a web browser? I tried to open http://www.soronline.net/downloads/SorR50a.zip in Firefox, but it just redirected me to the home page.
Offline
Wow, that's weird: I've tried pasting the link into a web browser and indeed it does not work!
However, if you go to the download page and click the second link, "Download Update Patch v5.0 -> v5.0a (51.8Mb)", then it works.
I've tried without cookies and it still works. I guess my HTTP skills are not good enough to understand what is going on.
Offline
302 indicates that the file moved temporarily. The 302 message should include a redirect to where the file has been moved temporarily. This might be done to keep the file available while performing maintenance on the primary site. Web browsers will follow the redirect, but will not cache the file from the temporary location (I think). Future attempts to retrieve the file will go back to the original location.
My guess is that curl and wget do not automatically follow the redirect, but return the headers and let you figure out what to do about it. I would look at the documentation for those programs. I have not, and I am off to a meeting, so I shan't do it now.
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way
Offline
Maybe I missed something, but as I said in my first post, I've tried following the redirect as well. Wget follows redirects by default, while curl needs to be called with the -L option.
Oddly enough, the redirection is toward the main page, not the file.
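Since following the redirect saves the home page under the zip's name without any error, a quick check of the downloaded file can tell the two cases apart. A minimal sketch, assuming it is enough to look at the first two bytes (zip archives start with "PK"); looks_like_zip is a name invented here:

```sh
#!/bin/sh
# Succeed if the file starts with the "PK" signature used by zip archives,
# fail otherwise (for example for a saved HTML page or a missing file).
# looks_like_zip is a hypothetical helper, not part of curl or wget.
looks_like_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}
```

After a download, `looks_like_zip SorR50a.zip || echo 'not a zip archive' >&2` would flag the HTML page. In a PKGBUILD the checksums should catch the same problem.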
Last edited by Ambrevar (2014-02-06 16:53:56)
Offline
Just goes to show that I should not post in a hurry before I race off to a meeting.
Are you behind a corporate firewall? Many organizations tend to interfere with zip transfers; they like to run them through scumware checks.
Offline
Nope, no firewall here. I tried from different locations as well. Besides, it works with any web browser without a proxy, just not with direct links!
Last edited by Ambrevar (2014-02-06 17:48:15)
Offline
@ewaller,
I have tested it as well (no firewall either), and it does seem to work only when manually clicking on a download link. No download agent works, even with user-agent spoofing (I even tried aria2c).
This page hosts all the downloads: http://www.soronline.net/downloads/
Clicking on any of those links in a browser starts the download. But if you copy and paste any of those links into the browser, you are redirected to the home page. I can see that it does not depend on cookies.
So my guess is that it depends on some sort of click handler, though I cannot see any JavaScript on the above page.
Last edited by x33a (2014-02-06 17:50:50)
Offline
Alright, playing around a bit more, I found that apparently they are checking the referer.
So, either of these works:
$ wget http://soronline.net/downloads/SorR50a.zip --referer=http://soronline.net
or
$ curl http://soronline.net/downloads/SorR50a.zip --referer http://soronline.net/ -O
I don't know why they do that, since they encourage the use of a download manager: http://soronline.net/essential_files.htm
Maybe they don't want to provide programmatic access.
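For the PKGBUILD that started this thread, one possible way to apply the referer trick is to override the download agent. This is only a sketch: it assumes makepkg honors a DLAGENTS array set inside the PKGBUILD and replaces %u with the source URL and %o with the output file. The curl flags other than --referer are modeled on a typical makepkg.conf entry and may differ on your system.

```bash
# PKGBUILD fragment (a sketch, not a complete package definition).
# Assumption: makepkg uses a DLAGENTS array defined in the PKGBUILD and
# substitutes %u (source URL) and %o (output file name).
DLAGENTS=('http::/usr/bin/curl -fLC - --retry 3 --retry-delay 3 --referer http://soronline.net/ -o %o %u')
source=('http://soronline.net/downloads/SorR50a.zip')
```

Only the http protocol is overridden here; sources using other protocols would need their own entries.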
Offline
Damn, reading the man page, that was pretty obvious! There are simply too many options for wget and curl... I guess we should blame HTTP for that!
Thank you very much for this one, that was very instructive!
Offline