2-Dec-96  7:19:20-GMT,3291;000000000005
Received: (from jaltman@localhost) by watsun.cc.columbia.edu (8.8.3/8.8.3) id CAA18414 for fdc; Mon, 2 Dec 1996 02:19:19 -0500 (EST)
Date: Mon, 2 Dec 1996 02:19:19 -0500 (EST)
From: Jeffrey Altman <jaltman@watsun.cc.columbia.edu>
Message-Id: <199612020719.CAA18414@watsun.cc.columbia.edu>
To: fdc@watsun.cc.columbia.edu

Newsgroups: comp.protocols.tcp-ip
Path: news.columbia.edu!news.columbia.edu!panix!feed1.news.erols.com!howland.erols.net!news-peer.gsl.net!news.gsl.net!ix.netcom.com!netcom.com!nagle
From: nagle@netcom.com (John Nagle)
Subject: Re: Nagle - TCP_NODELAY - transaction oriented applications
Message-ID: <nagleE1r4tF.2y7@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: <32a27b97.86620079@philos.philosys.de> <57filf$mf2@noao.edu>
Date: Sun, 1 Dec 1996 20:27:15 GMT
Lines: 43
Sender: nagle@netcom6.netcom.com
Xref: news.columbia.edu comp.protocols.tcp-ip:46498

rstevens@noao.edu (W. Richard Stevens) writes:
>> Is an application required to turn off the Nagle algorithm by setting
>> TCP_NODELAY in order to get good transaction performance?

>No.  If the client writes all of its request to the server at once
>(e.g., one write() or one writev()) then Nagle shouldn't be a problem.

>Where Nagle interacts with a transaction client-server is when the client
>sends its request as two write()s, say a small header followed by some
>data.  In that case the data won't be sent when write() is called the
>second time, because of Nagle, and to make it worse, the server will
>delay the ACK of the first small packet because there is nothing to send
>back (until the server gets the second write).
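
To make the quoted point concrete, here is a minimal sketch of the two
client patterns.  The function names are hypothetical, and the header
and body are assumed to be small enough to join in memory.  The first
version hands the stack one buffer; the second is the header-then-data
pattern that can stall when Nagle meets a delayed ACK.

```python
import socket


def send_request(sock: socket.socket, header: bytes, body: bytes) -> None:
    """Send header and body as one buffer (hypothetical helper)."""
    # Joining the pieces first means the stack is handed a single write,
    # so there is no small leading segment waiting on an ACK.
    sock.sendall(header + body)


def send_request_two_writes(sock: socket.socket, header: bytes, body: bytes) -> None:
    """The pattern to avoid: a small header write, then a separate data write."""
    sock.sendall(header)
    # On a TCP socket with Nagle enabled, this second small write may be
    # held back until the peer ACKs the header.
    sock.sendall(body)
```

Both functions deliver the same bytes; only the number of writes the
stack sees differs.  A gather write such as writev() gets the same
effect without the copy.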

     Having invented the thing, I suppose I should say something.

     The basic problem is that delayed ACKs and the Nagle algorithm are each
OK on their own, but the combination of the two is bad.  Historically, the Nagle
algorithm came first, and it was put in to prevent tinygram floods, typically
caused by an application doing lots of 1-byte writes into a slow link.
Each of those writes got blown up to a full packet, and then some of them
would get dropped due to congestion, and then TCP would start retransmitting,
and on slow links, the TCP connection might even time out.  In any case,
you'd get a huge increase in traffic when this occurred, enough to cause
congestion collapse in low-bandwidth nets.  (This was back in 1983,
remember.)
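
A rough piece of arithmetic shows the scale of the blow-up.  This is
a sketch that assumes 40 bytes of TCP/IP header per packet (20 IP plus
20 TCP, no options); the names are made up for the illustration.

```python
HEADER_BYTES = 40  # assumed: 20-byte IP header + 20-byte TCP header, no options


def wire_bytes(payload_sizes):
    """Total bytes sent if each payload goes out as its own packet."""
    return sum(HEADER_BYTES + n for n in payload_sizes)


# One hundred 1-byte writes, each sent as its own packet.
tinygrams = wire_bytes([1] * 100)

# The same hundred bytes coalesced into a single packet.
coalesced = wire_bytes([100])

print(tinygrams, coalesced)
```

Under that assumption the tinygram case puts 4100 bytes on the wire
for 100 bytes of data, against 140 bytes when the writes are combined,
and that is before any retransmissions.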

     Delayed ACKs went into TCP after I was out of networking, and I'm not too
happy about how it was done.  A delayed ACK is a bet; it's a gamble
that something will be sent in the near future on which the ACK can
be piggybacked.  Sometimes you win, sometimes you lose.  Unfortunately,
the way it's implemented in TCP, TCP isn't keeping score.  Properly, the
ACK delay timer should be adaptive, and adjusted so that, say, 80-90%
of delayed ACKs do in fact get piggybacked, rather than timing out.
But in the current standard, it's a fixed timer, typically 200ms,
which reflects a human scale of response time.  This is totally
wrong for transaction protocols.
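
What "keeping score" might look like can be sketched in a few lines.
This is not how any real TCP behaves, and the class name, window size,
and halve/double policy are all assumptions made for illustration.
It takes one possible reading of the idea: track how many recent
delayed ACKs were piggybacked, shorten the delay when the bet is
being lost, and let it grow back toward the cap when the bet is
being won.

```python
from collections import deque


class AdaptiveAckDelay:
    """Hypothetical adaptive ACK delay timer; not a real TCP mechanism."""

    def __init__(self, initial_ms=200.0, min_ms=1.0, max_ms=200.0,
                 target=0.85, window=20):
        self.delay_ms = initial_ms
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.target = target                   # desired piggyback fraction
        self.outcomes = deque(maxlen=window)   # recent wins and losses

    def record(self, piggybacked: bool) -> float:
        """Record one delayed ACK's outcome and return the new delay."""
        self.outcomes.append(piggybacked)
        rate = sum(self.outcomes) / len(self.outcomes)
        if rate < self.target:
            # Losing the bet too often: stop waiting so long.
            self.delay_ms = max(self.min_ms, self.delay_ms / 2)
        else:
            # Winning often enough: allow a longer wait, up to the cap.
            self.delay_ms = min(self.max_ms, self.delay_ms * 2)
        return self.delay_ms
```

A receiver would call record(True) each time an ACK rode along on
outgoing data and record(False) each time the timer fired first.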

     Unfortunately, it's not one of those things you can fix at your end;
it's the ACK timer at the other end that limits local sending performance.
If you could fix both ends, though, it would work nicely.

     					John Nagle