[1722] in linux-net channel archive
Re: Performance Enhancements to NFS client.
daemon@ATHENA.MIT.EDU (Dave Platt)
Sun Jan 28 19:10:55 1996
From: dplatt@jumble.3do.com (Dave Platt)
Date: Sun, 28 Jan 1996 12:00:34 PST
To: linux-net@vger.rutgers.edu
> Including, say, multiple reads or writes at once (e.g. for a 64k read
> sending up to 8 read requests out in a row rather than send one, wait
> for reply, send next, ...)? That's what I was going to focus on.
Best be VERY careful about this... at the least, make it an optional feature.
The reason I say this is that you may not like the result, in practice.
When NFS runs over UDP (as it usually does) it becomes exquisitely (I'll say
"excruciatingly") sensitive to the loss of any packets. If you lose even
one of the packets in an RPC reply, the entire RPC times out, the other
packets are discarded, and the whole RPC must be retried.
Since UDP is a datagram-oriented protocol, it provides no flow control of its own.
If you send 8 NFS read requests back-to-back, and the server happens to have
the data already in memory, the server will probably be able to queue up
several dozen reply packets for transmission at one shot. If the server has
a fast Ethernet interface (e.g. a Lance) it'll be capable of transmitting
these packets "back to back" over the Ethernet. If the client machine, or
any intermediate bridge or router, isn't capable of receiving such a
gigantic burst of data _very_ reliably, some packets will probably be lost,
and you'll see an RPC timeout and a need to repeat the RPC. Throughput will
suffer, probably quite badly.
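A quick back-of-the-envelope calculation makes the cost concrete: if every
fragment of a reply must arrive for the RPC to succeed, the success rate falls
off exponentially with the number of fragments. (The loss rates below are
illustrative assumptions, not measurements; the Python is just for the
arithmetic.)

```python
# Probability that a whole RPC reply arrives when it spans n UDP
# fragments and each fragment is independently lost with probability p:
# success = (1 - p) ** n.  The loss rates used below are assumptions
# for illustration, not measured figures.

def rpc_success(per_packet_loss: float, fragments: int) -> float:
    """Chance that all fragments of one RPC reply make it through."""
    return (1.0 - per_packet_loss) ** fragments

# An 8k read spans about 6 Ethernet frames; a 64k read about 45.
for loss in (0.01, 0.05):
    for frags in (6, 45):
        print(f"loss={loss:.0%}, {frags:2} fragments: "
              f"RPC succeeds {rpc_success(loss, frags):.1%} of the time")
```

At even 1% per-packet loss, a 45-fragment reply completes only about 64% of
the time, and every failure throws away all 45 packets and retries the whole
RPC from scratch.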
I rather suspect that the default 8k size of a typical NFS transaction was
selected with this concern in mind. 8k is little enough
data that it isn't likely to overflow the receive buffer on a typical
Ethernet card, even if the host isn't able to drain the first packet out of
the buffer before the last one arrives. Go up to 16k of data at a time, and
you'll probably start seeing packet loss at some client machines (slower ones,
ones with smaller on-card buffers). At 64k, it'll be "death and
destruction" time.
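To put numbers on those sizes: assuming a standard 1500-byte Ethernet MTU
(20-byte IP header, so 1480 bytes of payload per fragment once the UDP
datagram is fragmented), the per-read frame counts work out as follows.

```python
# Back-to-back Ethernet frames generated by one NFS read of a given
# size, assuming a 1500-byte MTU (20-byte IP header leaves 1480 bytes
# of payload per fragment) and an 8-byte UDP header on the datagram.
import math

MTU_PAYLOAD = 1480
UDP_HEADER = 8

def frames_for(read_size: int) -> int:
    return math.ceil((read_size + UDP_HEADER) / MTU_PAYLOAD)

for kb in (4, 8, 16, 64):
    print(f"{kb:3}k read -> {frames_for(kb * 1024)} back-to-back frames")
```

An 8k read is a burst of 6 frames; 16k doubles that to 12; a 64k read means
45 frames arriving as fast as the server's Lance can clock them out.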
There are a couple of alternative techniques which might help. Both involve
using a sliding-window technique to avoid excessive data bursts on the wire.
[1] Continue to use RPC-over-UDP, do send multiple requests, but keep the
number and size of these requests relatively small. For example, send
two 4k-byte requests instead of one 8k-byte request. When you get the
last packet back for the first request, you can request the next 4k.
Whether this approach will help all that much will probably depend on
whether you have a low-latency or high-latency connection between client
and server. For a low-latency connection (e.g. same Ethernet segment) I
suspect it won't help much, and will probably hurt (doubling the number
of RPCs). For long-latency connections, it might very well help.
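The windowing logic in [1] can be sketched as follows. The transport object
here is a hypothetical stand-in for the client's RPC layer (send_read and
await_reply are invented names, not a real kernel API); only the window
bookkeeping is the point.

```python
# Sketch of technique [1]: never more than `window` read requests in
# flight at once.  `transport` is a hypothetical RPC layer offering
# send_read(offset, length) and await_reply() -> (offset, bytes);
# both names are invented for illustration.

def windowed_read(transport, offset, length, chunk=4096, window=2):
    pending = []                 # offsets currently on the wire
    chunks = {}                  # completed replies, keyed by offset
    next_off, end = offset, offset + length
    while pending or next_off < end:
        # Fill the window with small requests.
        while next_off < end and len(pending) < window:
            transport.send_read(next_off, min(chunk, end - next_off))
            pending.append(next_off)
            next_off += chunk
        # Block for one reply; its slot frees up for the next chunk.
        off, payload = transport.await_reply()
        pending.remove(off)
        chunks[off] = payload
    return b"".join(chunks[o] for o in sorted(chunks))

# Exercise the logic with a fake in-memory transport (test scaffolding,
# not a real NFS client).
class FakeTransport:
    def __init__(self, data):
        self.data, self.queue, self.max_inflight = data, [], 0
    def send_read(self, off, n):
        self.queue.append((off, self.data[off:off + n]))
        self.max_inflight = max(self.max_inflight, len(self.queue))
    def await_reply(self):
        return self.queue.pop(0)

t = FakeTransport(bytes(range(256)) * 40)    # 10240 bytes
result = windowed_read(t, 0, len(t.data))
```

With window=2 and chunk=4096 the server can never queue more than two 4k
replies' worth of frames at once, in contrast to eight 8k requests fired
back-to-back.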
[2] Run RPC-over-TCP, thus gaining the advantage of TCP's sliding window,
slow start, and congestion avoidance. With this approach, you could
send multiple 8k read requests back-to-back over a single TCP connection.
The TCP transport layer would not allow the reply segments to swamp the
net, and if a segment _should_ be lost it would be re-transmitted by
TCP without having to have the entire RPC re-executed. This has the
disadvantage of single-threading your client/server mount, though.
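On the wire, option [2] needs a way to delimit RPC messages on the TCP byte
stream; Sun RPC does this with "record marking" (described in the RPC spec,
RFC 1057): each fragment is preceded by a 4-byte big-endian word whose top
bit flags the last fragment of a record and whose low 31 bits give the
fragment length. A minimal sketch:

```python
# ONC RPC record marking over TCP: a 4-byte big-endian header per
# fragment; high bit = last fragment of this record, low 31 bits =
# fragment length.  Single-fragment encoder plus a decoder that also
# handles multi-fragment records.
import io
import struct

LAST_FRAGMENT = 0x80000000

def mark_record(payload: bytes) -> bytes:
    """Encode one RPC message as a single, final fragment."""
    return struct.pack(">I", LAST_FRAGMENT | len(payload)) + payload

def read_record(stream) -> bytes:
    """Reassemble one RPC message from a file-like TCP stream."""
    parts = []
    while True:
        (header,) = struct.unpack(">I", stream.read(4))
        parts.append(stream.read(header & 0x7FFFFFFF))
        if header & LAST_FRAGMENT:
            return b"".join(parts)

# Two replies back-to-back on one "connection" stay cleanly separated.
wire = io.BytesIO(mark_record(b"reply-1") + mark_record(b"reply-2"))
first, second = read_record(wire), read_record(wire)
```

Because TCP delivers the stream reliably and in order, a lost segment costs
one TCP retransmission rather than a whole re-executed RPC.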
I have an old ACS 2100 Ethernet bridge/router sitting around which I no
longer use. It's fast enough to keep up with a Sparc-1, but it can't handle
long bursts of back-to-back packets at the speed of a Sparc-10 or a
Lance-based Linux card. As a result, it can't handle standard NFS (8k
requests) without causing massive numbers of "NFS timeout" messages... I
can't get more than about 10k bytes/second through it, if I try to NFS-copy
a file from Sparc-10 to Sparc-10. On the other hand, it handles TCP
connections just fine... FTP works like a champ.
Similarly, when we had some networking problems a few years ago (somebody
added cable to our thin Ethernet without telling me, and pushed it over the
length limit) NFS became almost unusable, but TCP-based protocols showed no
perceptible degradation in performance.
Large-request NFS is a _wonderful_ canary in the coal mine - it will show up
any congestion points in your IP network very quickly.