
#1 2011-10-29 17:18:25

nullvoid
Member
Registered: 2009-01-18
Posts: 33

Broken TCP stack in latest kernel when under heavy load

I'm running an Arch box with a decent amount of HTTP traffic. After upgrading to the latest kernel I've seen packets being sent with the wrong source and destination addresses. This only happens under heavy load (100+ requests per second). tcpdump shows the following:

18:52:58.512573 IP 0.0.0.0.80 > 0.0.0.0.4316: Flags [FP.], seq 0, ack 1, win 14400, length 0
18:52:58.512600 IP 0.0.0.0.80 > 0.0.0.0.56546: Flags [FP.], seq 0, ack 1, win 14400, length 0
18:52:58.512621 IP 0.0.0.0.80 > 0.0.0.0.4535: Flags [FP.], seq 0, ack 1, win 14600, length 0
18:52:58.512641 IP 0.0.0.0.80 > 0.0.0.0.3528: Flags [FP.], seq 0, ack 1, win 14600, length 0
18:52:58.512662 IP 0.0.0.0.80 > 0.0.0.0.4509: Flags [FP.], seq 0, ack 1, win 14400, length 0
18:52:58.512682 IP 0.0.0.0.80 > 0.0.0.0.65040: Flags [FP.], seq 0, ack 1, win 14600, length 0
18:52:58.512702 IP 0.0.0.0.80 > 0.0.0.0.2455: Flags [FP.], seq 0, ack 1, win 10240, length 0
18:52:58.512722 IP 0.0.0.0.80 > 0.0.0.0.16545: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
18:52:58.519258 IP 0.0.0.0.80 > 0.0.0.0.29802: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745514 ecr 1317559555], length 268
18:52:58.565907 IP 0.0.0.0.80 > 0.0.0.0.32376: Flags [FP.], seq 0, ack 1, win 14400, length 0
18:52:58.619241 IP 0.0.0.0.80 > 0.0.0.0.50493: Flags [FP.], seq 0:268, ack 1, win 11256, options [nop,nop,TS val 745544 ecr 9539361], length 268
18:52:58.805927 IP 0.0.0.0.80 > 0.0.0.0.20852: Flags [FP.], seq 3025419976:3025420244, ack 3037671074, win 967, options [nop,nop,TS val 745600 ecr 6445640], length 268
18:52:58.805953 IP 0.0.0.0.80 > 0.0.0.0.65025: Flags [FP.], seq 1663827778:1663828046, ack 2127675352, win 707, options [nop,nop,TS val 745600 ecr 457812708], length 268
18:52:58.845918 IP 0.0.0.0.80 > 0.0.0.0.2217: Flags [FP.], seq 0:268, ack 1, win 707, options [nop,nop,TS val 745612 ecr 546643], length 268
18:52:59.099245 IP 0.0.0.0.80 > 0.0.0.0.5112: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
18:52:59.152582 IP 0.0.0.0.80 > 0.0.0.0.1175: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
18:52:59.232612 IP 0.0.0.0.80 > 0.0.0.0.47217: Flags [FP.], seq 684621876:684622144, ack 3544859356, win 11256, length 268
18:52:59.659258 IP 0.0.0.0.80 > 0.0.0.0.3098: Flags [FP.], seq 2105858244:2105858512, ack 3896053916, win 980, options [nop,nop,TS val 745856 ecr 52041], length 268
18:52:59.659290 IP 0.0.0.0.80 > 0.0.0.0.3099: Flags [FP.], seq 18772067:18772335, ack 2568646283, win 980, options [nop,nop,TS val 745856 ecr 52041], length 268
18:52:59.759244 IP 0.0.0.0.80 > 0.0.0.0.18780: Flags [FP.], seq 0:268, ack 1, win 707, options [nop,nop,TS val 745886 ecr 168876], length 268
18:52:59.845907 IP 0.0.0.0.80 > 0.0.0.0.58449: Flags [FP.], seq 0, ack 1, win 980, options [nop,nop,TS val 745912 ecr 528058426], length 0
18:52:59.925936 IP 0.0.0.0.80 > 0.0.0.0.65137: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
18:52:59.979497 IP 0.0.0.0.80 > 0.0.0.0.2920: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268
18:52:59.979527 IP 0.0.0.0.80 > 0.0.0.0.2922: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268
18:52:59.979553 IP 0.0.0.0.80 > 0.0.0.0.2940: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268

Source and destination ports are correctly set. Wireshark shows the correct HTML inside the packets that are returned to 0.0.0.0. The web server log also looks normal; the correct IP address is displayed and logged as a successful request.
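
For anyone who wants to check a saved capture for the same symptom, here is a minimal Python sketch that classifies lines of `tcpdump -n` text output by whether both, one, or neither endpoint is 0.0.0.0. The function names and the regular expression are illustrative assumptions, not part of any tool; it only handles IPv4 lines in the format shown above.

```python
import re
from collections import Counter

# Matches the "IP src.port > dst.port:" part of tcpdump -n text output.
# Assumption: IPv4 only, numeric addresses and ports (as with -n).
LINE_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)

UNSPECIFIED = "0.0.0.0"


def parse_endpoints(line):
    """Return (src, sport, dst, dport), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    src, sport, dst, dport = m.groups()
    return src, int(sport), dst, int(dport)


def count_unspecified(lines):
    """Count lines by how many of their endpoints are 0.0.0.0."""
    counts = Counter()
    for line in lines:
        endpoints = parse_endpoints(line)
        if endpoints is None:
            counts["unparsed"] += 1
        elif endpoints[0] == UNSPECIFIED and endpoints[2] == UNSPECIFIED:
            counts["both_zero"] += 1
        elif UNSPECIFIED in (endpoints[0], endpoints[2]):
            counts["one_zero"] += 1
        else:
            counts["ok"] += 1
    return counts
```

To use it, save the tcpdump text output to a file (say `dump.txt`, a hypothetical name) and call `count_unspecified(open("dump.txt"))`. A non-zero `both_zero` count would match the behaviour described here.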

When I drop incoming traffic on port 80 on eth0, everything works as expected: requests to the server on eth1, which otherwise fail, go through.

I'm running "Linux srv 3.0-ARCH #1 SMP PREEMPT Wed Oct 19 12:14:48 UTC 2011 i686", which is the latest kernel in the repos. When booting the fallback image the problem does not occur; all packets are correctly addressed no matter how much load I put on the server.

Does anyone else have this problem?

Edit:
Running lighttpd 1.4.29. No tweaked kernel/TCP parameters whatsoever.

Last edited by nullvoid (2011-10-29 17:19:57)


#2 2011-11-07 11:18:05

nullvoid
Member
Registered: 2009-01-18
Posts: 33

Re: Broken TCP stack in latest kernel when under heavy load

Did a full reinstall of Arch on another machine and the problem still persists. Tried with Apache and Nginx, same behaviour as with Lighttpd. Could anyone else running an Arch box under heavy load check whether there's activity from 0.0.0.0?

Hint:
# tcpdump -n host 0.0.0.0

I'll file a bug report upstream later today.


#3 2012-09-18 16:31:28

ara4n
Member
Registered: 2012-09-18
Posts: 1

Re: Broken TCP stack in latest kernel when under heavy load

nullvoid: did you ever find the cause of this?  I'm seeing the same misbehaviour on a stock 3.2.6 kernel...


#4 2012-09-18 23:01:00

nomorewindows
Member
Registered: 2010-04-03
Posts: 3,362

Re: Broken TCP stack in latest kernel when under heavy load

Which kernel are you using?


I may have to CONSOLE you about your usage of ridiculously easy graphical interfaces...
Look ma, no mouse.


#5 2012-09-18 23:48:56

cybertorture
Member
Registered: 2010-05-05
Posts: 339

Re: Broken TCP stack in latest kernel when under heavy load

No respect for old and long forgotten topics :D


O' rly ? Ya rly Oo

