2-Dec-96 7:19:20-GMT,3291;000000000005
Received: (from jaltman@localhost) by watsun.cc.columbia.edu (8.8.3/8.8.3) id CAA18414 for fdc; Mon, 2 Dec 1996 02:19:19 -0500 (EST)
Date: Mon, 2 Dec 1996 02:19:19 -0500 (EST)
From: Jeffrey Altman
Message-Id: <199612020719.CAA18414@watsun.cc.columbia.edu>
To: fdc@watsun.cc.columbia.edu

Newsgroups: comp.protocols.tcp-ip
Path: news.columbia.edu!news.columbia.edu!panix!feed1.news.erols.com!howland.erols.net!news-peer.gsl.net!news.gsl.net!ix.netcom.com!netcom.com!nagle
From: nagle@netcom.com (John Nagle)
Subject: Re: Nagle - TCP_NODELAY - transaction oriented applications
Message-ID:
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: <32a27b97.86620079@philos.philosys.de> <57filf$mf2@noao.edu>
Date: Sun, 1 Dec 1996 20:27:15 GMT
Lines: 43
Sender: nagle@netcom6.netcom.com
Xref: news.columbia.edu comp.protocols.tcp-ip:46498

rstevens@noao.edu (W. Richard Stevens) writes:

>> Is an application required to turn off the Nagle algorithm by setting
>> TCP_NODELAY in order to get good transaction performance?

>No. If the client writes all of its request to the server at once
>(e.g., one write() or one writev()) then Nagle shouldn't be a problem.

>Where Nagle interacts with a transaction client-server is when the client
>sends its request as two write()s, say a small header followed by some
>data. In that case the data won't be sent when write() is called the
>second time, because of Nagle, and to make it worse, the server will
>delay the ACK of the first small packet because there is nothing to send
>back (until the server gets the second write).

Having invented the thing, I suppose I should say something.

The basic problem is that either delayed ACKs or the Nagle algorithm
are OK, but the combination of the two is bad.

Historically, the Nagle algorithm came first, and it was put in to
prevent tinygram floods, typically caused by an application doing lots
of 1-byte writes into a slow link.
Each of those writes got blown up to a full packet, and then some of
them would get dropped due to congestion, and then TCP would start
retransmitting, and on slow links, the TCP connection might even time
out. In any case, you'd get a huge increase in traffic when this
occurred, enough to cause congestion collapse in low-bandwidth nets.
(This was back in 1983, remember.)

Delayed ACKs went into TCP after I was out of networking, and I'm not
too happy about how it was done. A delayed ACK is a bet; it's a gamble
that something will be sent in the near future on which the ACK can be
piggybacked. Sometimes you win, sometimes you lose. Unfortunately, the
way it's implemented in TCP, TCP isn't keeping score. Properly, the ACK
delay timer should be adaptive, and adjusted so that, say, 80-90% of
delayed ACKs do in fact get piggybacked, rather than timing out. But in
the current standard, it's a fixed timer, typically 200ms, which
reflects a human scale of response time. This is totally wrong for
transaction protocols.

Unfortunately, it's not one of those things you can fix at your end;
it's the ACK timer at the other end that limits local sending
performance. If you could fix both ends, though, it would work nicely.

                                        John Nagle
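
The write pattern discussed in the quoted text can be sketched in C as
follows. This is only an illustration under stated assumptions, not
code from the thread: the function names send_request_two_writes,
send_request and disable_nagle are hypothetical, and error and
short-write handling are left out. The first function shows the
header-then-data pattern that runs into the Nagle/delayed-ACK
interaction, the second hands both pieces to the kernel in one
writev(), and the third sets TCP_NODELAY for cases where the writes
cannot be merged.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/* Hypothetical helper showing the problematic pattern: two small
 * write()s. On a TCP socket the second write can be held back by
 * Nagle until the ACK for the first arrives. */
ssize_t send_request_two_writes(int fd, const void *hdr, size_t hdr_len,
                                const void *body, size_t body_len)
{
    ssize_t a = write(fd, hdr, hdr_len);
    if (a < 0)
        return a;
    ssize_t b = write(fd, body, body_len);
    if (b < 0)
        return b;
    return a + b;
}

/* Hypothetical helper: header and body go out in one writev() call,
 * so the request enters TCP as a single write. */
ssize_t send_request(int fd, const void *hdr, size_t hdr_len,
                     const void *body, size_t body_len)
{
    struct iovec iov[2];

    assert(hdr != NULL && body != NULL);

    iov[0].iov_base = (void *)hdr;   /* writev() does not modify it */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = body_len;

    /* May return a short count or -1; a real client must handle both. */
    return writev(fd, iov, 2);
}

/* Alternative when the writes cannot be merged: disable Nagle on the
 * sending socket. Returns 0 on success, -1 on error. */
int disable_nagle(int sock)
{
    int on = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```

Neither helper changes the delayed-ACK timer at the other end; they
only avoid leaving a second small segment waiting on it.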