Better TCP performance over a “high delay network”

by Gil   Last Updated August 07, 2019 20:00 PM

I’m trying to improve my TCP throughput over a “high delay network” between Linux machines.

I set tcp_mem, tcp_wmem and tcp_rmem to “8192 7061504 7061504”.
I set rmem_max, wmem_max, rmem_default and wmem_default to “7061504”.
I set netdev_max_backlog and txqueuelen to 10000.
I set tcp_congestion_control to “scalable”.

I’m using “nist” (cnistnet) to simulate a delay of 100 ms, and the bandwidth I reach is about 200 Mbps (without the delay I reach about 790 Mbps).

I’m using iperf to perform the tests and TCPTrace to analyze the results, and here is what I got:

On the receiver side:
max win adv: 5294720 bytes
avg win adv: 5273959 bytes
sack pkts sent: 0

On the sender side:
actual data bytes: 3085179704
rexmt data bytes: 9018144
max owin: 5294577 bytes
avg owin: 3317125 bytes
RTT min: 19.2 ms
RTT max: 218.2 ms
RTT avg: 98.0 ms

Why do I reach only 200 Mbps? I suspect the “owin” has something to do with it, but I’m not sure. (These results are from a 2-minute test; a 1-minute test had an “avg owin” of 1552900.)

Am I wrong to expect the throughput to be almost 790mbps even if the delay is 100ms?

(I tried using bigger numbers in the window configurations but it didn't seem to have an effect)



Answers 5


The site

http://www.psc.edu/networking/projects/tcptune/

mentions that as Linux nowadays autotunes TCP settings, messing with the values will likely not improve things.

That being said, 100 ms combined with a large bandwidth (at least 790 Mbps) gives an enormous BDP (bandwidth-delay product), so maybe the autotuning decides that something is wrong and doesn't go far enough.

janneb
August 11, 2011 12:30 PM

Try setting the iperf window size to actually match the bandwidth-delay product of that link. Avg. RTT * 1 Gbps should give you roughly 10 MB (100 ms * 1 Gbps is about 12.5 MB). See if that improves things.
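
A rough sketch of that arithmetic in Python, using the figures from the question (98 ms average RTT, 790 Mbps without delay, and the "avg owin" and "max owin" values from tcptrace). The function names are made up for illustration; the only relationship assumed is that a window-limited TCP flow can send at most one window per round trip.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bps * rtt_s / 8


def window_limited_mbps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s / 1e6


RTT_S = 0.098  # "RTT avg" from the sender-side tcptrace output

print("window needed for 1 Gbps:  ", round(bdp_bytes(1e9, RTT_S)), "bytes")
print("window needed for 790 Mbps:", round(bdp_bytes(790e6, RTT_S)), "bytes")
print("ceiling with avg owin:", round(window_limited_mbps(3317125, RTT_S), 1), "Mbps")
print("ceiling with max owin:", round(window_limited_mbps(5294577, RTT_S), 1), "Mbps")
```

With these inputs the average outstanding window caps throughput at roughly 270 Mbps, and even the maximum observed window caps it at roughly 430 Mbps, so a window of about 5 MB cannot carry 790 Mbps at this RTT.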

pfo
August 11, 2011 13:13 PM

The only way you can really start to understand what is going on is to get more data; otherwise you are just guessing or asking other people to guess. I recommend getting a system-level view (CPU, memory, interrupts, etc.) with sar from the sysstat package. You should also get a packet dump with Wireshark or tcpdump. You can then use Wireshark to analyze it, as it has a lot of tools for this: you can graph the window size over time, packet loss, etc.

Even a little packet loss on a high-latency link tends to hurt bandwidth quite a bit. Since the link is simulated, though, loss would be a bit strange. Lots of small packets might also cause a high interrupt rate (even though those might be simulated as well?).
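
To get a feel for how strongly loss interacts with latency, here is a sketch based on the well-known Mathis et al. approximation for Reno-style congestion avoidance, rate ≈ (MSS / RTT) * (C / sqrt(p)). The MSS of 1460 bytes and the constant C ≈ 1.22 are assumptions, and the "scalable" congestion control used in the question is designed to recover faster than this model, so treat the output as an illustration of the trend rather than a prediction for this test.

```python
import math


def mathis_mbps(mss_bytes: int, rtt_s: float, loss_rate: float, c: float = 1.22) -> float:
    """Approximate steady-state throughput of a Reno-style flow under random loss."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate)) / 1e6


# Assumed values: 1460-byte MSS, 100 ms RTT, a few illustrative loss rates.
for p in (1e-3, 1e-5, 1e-7):
    print(f"loss rate {p:g}: about {mathis_mbps(1460, 0.1, p):.1f} Mbps")
```

The point of the model is the 1/RTT and 1/sqrt(p) scaling: at a fixed loss rate, going from a LAN round trip to 100 ms cuts the achievable rate by the same factor as the RTT increase.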

So in short, use tcpdump and sar to see what is going on at the packet level and with your system resources.

Kyle Brandt
August 11, 2011 13:44 PM

This is a common TCP issue called "Long Fat Pipe". If you Google that phrase and TCP you'll find a lot of information on this problem and possible solutions.

This thread has a bunch of calculations and suggestions on tuning the Linux TCP stack for this sort of thing.

3dinfluence
August 11, 2011 14:11 PM

How much memory does this machine have? The tcp_mem setting looks insane: tcp_mem is counted in pages, not bytes, so it allows about 28 GB (7061504 * 4 KB) for TCP data globally. (This is not your performance problem, since you most likely never hit that limit in a test run with a few sockets. I mention it because copying the tcp_rmem/tcp_wmem values into tcp_mem is a very common misconception.)
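
A small sketch to check that figure, assuming the common 4 KiB page size (on the actual machine the value can be read with os.sysconf("SC_PAGE_SIZE")). The helper name is made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; verify on the target machine


def tcp_mem_pages_to_bytes(pages: int, page_size: int = PAGE_SIZE) -> int:
    """Convert a tcp_mem value (counted in pages) to bytes."""
    return pages * page_size


total = tcp_mem_pages_to_bytes(7061504)  # the max value from the question
print(total, "bytes")
print(round(total / 1e9, 1), "GB")
print(round(total / 2**30, 1), "GiB")
```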

The 7 MB you have configured as the default seems OK. The maximum, however, can go much higher on high-delay pipes. For testing I would use 64 MB as the maximum for tcp_wmem and tcp_rmem; then you can rule this out as your limiting factor. (This does bloat your buffers, so it only works if you have limited concurrency and the connection has low jitter and few drops.)

eckes
August 07, 2019 19:27 PM
