
#1 2012-01-27 22:34:36

firecat53
Member
From: Lake Stevens, WA, USA
Registered: 2007-05-14
Posts: 1,542

[SOLVED] FTP download stuck at EOF (wget, python, lftp)

I'm having an odd problem with a largish FTP download. Using wget, python ftplib and lftp, when I download this particular file (~241 MB of text/CSV information), the entire file will download but then it stops and has to be manually killed. This same file downloads perfectly from another Arch machine in a different location. I ran pdb on the python download and it hung here: (if that means anything to someone)

ipdb> 
> /usr/lib/python2.7/socket.py(447)readline()
    446                 try:
--> 447                     data = self._sock.recv(self._rbufsize)
    448                 except error, e:

The offending machine is on Comcast for an ISP, and is hooked up to the internet via a consumer router and a switch. Even more confusing, a smaller file (~ 6 MB) from the same location (unfortunately I can't give out the url for anyone to test, as it's work-related) downloads just fine.

Anyone have any ideas about where to begin with troubleshooting? I'm about ready to just stick that download in a separate thread and kill the thread when it looks like the download is complete!

Thanks!
Scott

edit: I was slightly mistaken...the file doesn't get completely downloaded. It's missing the last line completely plus about the last 1032 characters of the previous line.

Last edited by firecat53 (2012-01-30 03:34:28)

#2 2012-01-28 01:01:36

falconindy
Developer
From: New York, USA
Registered: 2009-10-22
Posts: 4,111

Re: [SOLVED] FTP download stuck at EOF (wget, python, lftp)

Heh. This sounds extremely familiar:

http://projects.archlinux.org/pacman.gi … 4f146f232b

Try using curl, which enables keepalives by default.

#3 2012-01-28 04:03:17

firecat53

Thanks falconindy! Well...it sort of works:

curl -O ftp://user:pw@url/itemx3.out
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  240M  100  240M    0     0   314k      0  0:13:04  0:13:04 --:--:--     0^[
curl: (28) FTP response timeout

After the timeout, it appears that the file is all there...which is a good thing!! But it's still timing out and not finishing normally.
I also tried (per a stackoverflow question):

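import socket
from ftplib import FTP

# URL, USER, PW, FTP_PATH, FILENAME and SAVE_PATH are placeholders defined elsewhere
ftps = FTP(URL, USER, PW)
# Turn on TCP keepalive for the control connection
ftps.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
with open(SAVE_PATH, 'wb') as out:
    ftps.retrbinary("RETR {}{}".format(FTP_PATH, FILENAME), out.write)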

But this didn't even time out to give me the entire file...it just hung and cut off the last line and a half or so, like before.

I guess for now I'll try the curl method with subprocess.Popen and ignore the timeout.
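
For what it's worth, here's a rough sketch of that workaround in Python 3. The names build_curl_command, is_acceptable and fetch are made up for illustration. Treating exit code 28 (the timeout curl reported above) as success is an assumption that only holds if the file really is complete at that point, so it's worth checking the file size afterwards.

```python
import subprocess

CURL_OK = 0
CURL_TIMEOUT = 28  # the "(28)" in "curl: (28) FTP response timeout"


def build_curl_command(url):
    """Return the argument list for downloading url into the current directory."""
    return ["curl", "-O", url]


def is_acceptable(returncode):
    """Accept a clean exit, and (as a stopgap) the timeout seen after the transfer."""
    return returncode in (CURL_OK, CURL_TIMEOUT)


def fetch(url):
    """Run curl and report whether the exit code is one we are willing to accept."""
    returncode = subprocess.call(build_curl_command(url))
    return is_acceptable(returncode)
```

Any other exit code (a failed login, a refused connection and so on) still counts as a failure here.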

Any other ideas to try and lose the timeout/hang?

Thanks!
Scott

#4 2012-01-28 04:20:08

falconindy

Well, you'll want to set the two additional tuning knobs that Linux provides -- TCP_KEEPINTVL and TCP_KEEPIDLE. curl lets you do this with the --keepalive-time N flag (the value 'N' is applied to both KEEPINTVL and KEEPIDLE), which overrides the way-too-long default values. KEEPIDLE defaults to 7200 seconds (tcp_keepalive_time in tcp(7)), meaning that the first keepalive isn't sent until the connection has been idle for 2 hours. Try something like 60 and see if that makes curl happier.
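
To see what your own kernel uses, a small sketch (Python 3, Linux only) can read the three sysctl values that tcp(7) documents. The helper name read_keepalive_defaults is just for illustration, and the /proc/sys/net/ipv4 location assumes a standard Linux procfs.

```python
from pathlib import Path

# Assumed location of the keepalive sysctls on a standard Linux system
PROC_DIR = Path("/proc/sys/net/ipv4")

SYSCTL_NAMES = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_defaults(proc_dir=PROC_DIR):
    """Return the keepalive sysctls found under proc_dir as a dict of ints."""
    defaults = {}
    for name in SYSCTL_NAMES:
        path = Path(proc_dir) / name
        if path.exists():
            defaults[name] = int(path.read_text().strip())
    return defaults
```

Calling read_keepalive_defaults() with no arguments reads the live values; files that are missing are simply left out of the result.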

Last edited by falconindy (2012-01-28 04:20:45)

#5 2012-01-28 05:41:02

firecat53

That killed the curl timeout, but why would I be getting this warning from inside a fully updated Arch system (tested on 2 different systems...x86_64 and i686)??

$ curl -O --keepalive-time 60 ftp://user:pw@url/itemx3.out
Warning: Keep-alive functionality somewhat crippled due to missing support in 
Warning: your operating system!
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Keep-alive functionality somewhat crippled due to missing support in 
Warning: your operating system!
100  240M  100  240M    0     0   423k      0  0:09:41  0:09:41 --:--:--  374k
$ 

Is that normal?

Thanks!
Scott

Hmmm...now I've gotta see if I can set that same keepalive from within python somehow.

Edit: I see that the SO_KEEPALIVE socket setting can be set using 'setsockopt', but I don't see anything about adjusting the keepalive interval. Am I missing it in the setsockopt manpage?

Last edited by firecat53 (2012-01-28 05:47:29)

#6 2012-01-28 12:55:11

falconindy

Huh. That warning isn't right at all. Looks like a regression from when the code for the CLI tool was split up into a bunch of different files. The netinet/tcp.h header isn't included and those constants aren't defined. Incidentally, my patch to pacman sparked me to file a pair of patches with curl to move control of tcp keepalives to the library side (it's currently done via a socket callback in the front end tool). Those will be merged, which means this regression will be silently overlooked. I'll backport the fix for us.

The idle and intvl options aren't in setsockopt(3P) because they're not POSIX options; see tcp(7) instead. Don't know offhand if python will have these options available.
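
A quick way to check, sketched in Python 3: the socket module only defines the constants that the platform provides, so probing with hasattr shows what's usable. The helper name available_keepalive_options is hypothetical, and TCP_KEEPCNT (the probe count from tcp(7)) is included alongside the two options discussed here.

```python
import socket

KEEPALIVE_OPTIONS = ("TCP_KEEPIDLE", "TCP_KEEPINTVL", "TCP_KEEPCNT")


def available_keepalive_options():
    """Map each keepalive option name to its numeric value, if this platform has it."""
    return {
        name: getattr(socket, name)
        for name in KEEPALIVE_OPTIONS
        if hasattr(socket, name)
    }
```

An empty dict would mean the interpreter was built on a platform without these Linux-style knobs.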

late update: curl 7.24.0-2 in testing properly sets the keepalive knobs.

Last edited by falconindy (2012-01-28 15:36:09)

#7 2012-01-30 01:07:18

firecat53

I tested your fix in 7.24.0-2 and it works fine...no error messages. I've updated my code to use curl for now.

I also tried this in my python code with ftplib:

import ftplib
import socket

ftps = ftplib.FTP(url, user, password)
ftps.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
ftps.sock.setsockopt(socket.SOL_SOCKET, socket.TCP_KEEPINTVL, 60)
ftps.retrbinary(......

but it still didn't work (hangs right before the download completes). Did I set that TCP_KEEPINTVL correctly? It seems like if that's the issue and the --keepalive-time fixes the problem using curl, that a similar fix should work with python. I'm running at the edge of my networking knowledge here...

Thanks!
Scott

Last edited by firecat53 (2012-01-30 01:08:22)

#8 2012-01-30 01:11:10

falconindy

You need to set TCP_KEEPIDLE as well. This is actually the more important of the two tuning knobs since it determines when keepalive probes start getting sent.
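
Roughly, the timing works like this (a sketch based on the description in tcp(7); the function names are made up, and the count of 9 probes is the Linux default for tcp_keepalive_probes, so treat the figures as approximate):

```python
def first_probe_after(idle):
    """Seconds of silence before the first keepalive probe is sent (TCP_KEEPIDLE)."""
    return idle


def dead_peer_detected_after(idle, interval, count=9):
    """Approximate seconds of silence before an unresponsive peer is given up on.

    The first probe goes out after `idle` seconds; unanswered probes are then
    repeated every `interval` seconds (TCP_KEEPINTVL) until `count` have failed.
    """
    return idle + interval * count


# Linux defaults: 7200 s idle, 75 s interval, 9 probes
print(first_probe_after(7200))             # 7200
print(dead_peer_detected_after(7200, 75))  # 7875

# With the idle time lowered to 60 s
print(first_probe_after(60))               # 60
print(dead_peer_detected_after(60, 75))    # 735
```

With the default idle time, a transfer that takes ten minutes or so never sees a single probe on the otherwise silent connection, which is why lowering only the interval changes nothing.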

Last edited by falconindy (2012-01-30 01:11:51)

#9 2012-01-30 03:33:55

firecat53

Awesome, I got it working with python...figured out I needed the levels set a little differently for the TCP variables + needed the TCP_KEEPIDLE variable. Like:

import socket
ftps.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# The TCP_* timers are set at the IPPROTO_TCP level, not SOL_SOCKET
ftps.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)
ftps.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
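
In case it helps anyone else, the same settings can be wrapped up as a self-contained sketch for Python 3 on Linux. The helper names enable_keepalive and download and their parameters are hypothetical; ftp.sock is the control connection socket that ftplib exposes.

```python
import socket
from ftplib import FTP


def enable_keepalive(sock, idle=60, interval=75):
    """Turn on TCP keepalive for sock and shorten the Linux timers (in seconds)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe, then the gap between later probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)


def download(host, user, password, remote_path, local_path):
    """Fetch remote_path over FTP with keepalive enabled on the control connection."""
    with FTP(host, user, password) as ftp, open(local_path, "wb") as out:
        enable_keepalive(ftp.sock)
        ftp.retrbinary("RETR " + remote_path, out.write)
```

The options are set right after login and before the transfer starts, since the control connection is the one that sits idle while the data connection does the work.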

Thanks so much for sticking with me, falconindy!

Scott
