Michael Gummelt and Graham Roth
TCP is a protocol that assumes cooperation from both the client and the server — there’s an inherent requirement that each party trusts the other to do its job. This trust enables effective congestion control from each endpoint as both sender and receiver work to limit traffic to level that is fair for each flow. As with most trust, however, this assumption that each end is performing correctly can be exploited by a malignant user for his own purposes. Congestion effects have been studied from modified servers aggressively sending traffic, but in most real-world situations these hosts are private servers that are tightly regulated. A more interesting case is to look at the receivers, which are usually home machines accessing the Internet under the use of a relatively naive user and are more easily compromised.
Savage et al. demonstrated three techniques by which a modified receiver could cause a sender to send more traffic than it should and much faster, quickly ramping up the congestion in the network. Sherwood et al. then took one of these attacks, the optimistic-acknowledgement (opt-ack) attack, and showed that it could be used as a denial of service attack and possible create widespread sustained congestion collapse in the Internet. Optimistically ACKing packets, i.e. sending an ACK for a packet before you actually receive it, can be used benignly to increase throughput by lowering the effective RTT of packets. However, this technique, when used improperly, can easily inflate a flow’s congestion window so high that it takes all of the bandwidth of the link.
This topic is really interesting because these are real attacks that can still have real effects. TCP bases its algorithms on the idea that everyone cooperates, so it can easily be taken advantage of. And while the attacks presented directly in these papers will probably be shut down pretty fast in a real world situation when a provider notices the way the congestion window is not following a normal pattern, opt-ack attacks could definitely be the basis of much more subtle and far more dangerous attacks.
Results from the paper
In , Savage et al. demonstrate the opt-ack attack functioning by showing the progression in sequence number space of the received packets for a normal file download and then a file downloaded with opt-acks (below). We wanted to create a plot similar in spirit, if not identical to this one to show that our attack does what it is supposed to.
Sherwood et al. show the behavior of their attack through one graph and a series of details from that graph (below). Our intent was to explore the opt-ack attack and its properties by recreating this graph and then tuning parameters or trying new strategies to see how we could manipulate the data. In particular, we wanted to look at overruns (where the attacker gets ahead of the victim’s sequence numbers) and recovering from them.
We think these are important steps in verifying first that the attack still works on a modern Linux kernel with a modern implementation of TCP, which isn’t necessarily true. Though Sherwood’s patch to the kernel was never accepted, that doesn’t mean that this vulnerability has never been addressed since then.
Beyond that, we wanted to better understand the way the attack works so we could figure out whether there was a more powerful, more subtle, or more interesting way of carrying out the attack. In order to do this, we needed to be able to duplicate the behavior found in the paper as that is the core of any opt-ack based attack. From there, we could try and tune parameters to see what produced interesting results.
We managed to obtain a copy of the code that Sherwood et al. used in  through Connor Gilbert, a student in CS244 last year, and obtained permission from Rob Sherwood to use the code in this experiment. The attack is written in C and uses raw TCP sockets to send and receive packets, bypassing the kernel implementation of TCP. Gilbert and his project partner, Kevin Smith, made some changes to the code so that it would function on modern versions of Linux and we found that some more slight modifications were necessary.
The code has implementations of nine different attacks, and though Sherwood wasn’t entirely sure which attack he had used to get the results in the paper, he offered a suggestion. This attack, attack #7, seems to be an implementation of what the paper calls a lazy opt-ack attack. Though this attack shows some of the basic functionality of an opt-ack attack, it lacks some of the more interesting and aggressive features, so we looked through the other implemented attacks to find one that would exemplify those behaviors. Not finding one, we decided to write a naive implementation of one from scratch, based on the pseudocode in the paper.
This attack sends ACKs continuously and each successive ACK has a sequence number 1 maximum-segment-size greater than the one before it. This assumes that the victim is sending full packets, but given that we are downloading a large file, this is a reasonable assumption to make. If at any point the attacker overruns the victim and is ACKing packets that haven’t been sent yet, the victim sends an empty packet with the correct sequence number. Our attack receives this, adjusts its ACK sequence number, and continues sending packets from there.
We tested both of these attacks on a simple topology of two hosts connected by an intermediate switch in Mininet. We construct the topology such that there is a bottleneck link between the attacker and the switch with a bandwidth of 2Mb/s and a maximum output queue size at the switch of 20 packets. The link between the switch and the victim has a bandwidth of 1Gb/s with no maximum queue size. Both links have 10ms latency.
Once we got the first attack working, we started up Mininet and compared using the attack and wget to download a 10MB file across our topology. Instead of showing the comparison of sequence numbers over time, we decided that an equally interesting result would be to look at the congestion window over time. With wget, and normal TCP congestion control, we would expect to see a classic sawtooth shape as the sender modulates the window (below, left). With opt-ack, however, the receiver (attacker) never sends back any indication that a packet has been lost or mistransmitted somehow, and only ever ACKs packets. This leads to an ever-increasing congestion window, and along with it, exponentially faster file transfer. (below right)
We then attempted to replicate Figure 2 from  by running the same opt-ack attack and recording the sequence numbers being sent by the victim and the ack numbers sent by the attacker. With the attack we were given (lazy opt-ack), we ended up with a tightly-coupled graph of sequence numbers (below), but because the lazy opt-ack doesn’t increment the ack number in the additional acks it sends fast enough, there’s no way for an overrun to ever occur. Therefore, this graph looks like the smooth parts of the graph we were trying to replicate, but doesn’t have any of the interesting features.
In an attempt to investigate overruns in the opt-ack attack, we wrote the naive implementation of the paper’s pseudocode described above. However, the behaviors we noticed don’t seem to make sense with our understanding of how TCP Reno works. The attacker quickly overruns the victim’s sequence space, but the victim continues transmitting packets well beyond the capacity of a slow-start window size. The two keep pace for a while, until at some point, the victim stops transmitting new packets. This is when recovery kicks in, and the attacker resets its sequence space to that of the victim. This works briefly, but it then soon overruns the victim again, which once more stops transmitting new packets.
We’ve successfully demonstrated that at the very least the lazy opt-ack attack works. It greatly speeds up the rate of data transfer by opening the congestion window continuously and never letting it close. However this is the slowest of the opt-ack attacks. One of the big goals of these types of attacks is ACKing packets that haven’t been seen yet, but have been sent. We tried implementing a faster version of the attack to see if we could coerce a faster transfer, but ran into unexpected behaviors on the part of the victim, which defied our attempts to understand them. As a result, we were unable to fully investigate the overrun behavior, as well as non-lazy opt-ack attacks. This means that overall, for our setup, we find that we cannot reproduce the results of the paper.
We faced a number of challenges in getting this experiment up and running. First, getting the attack code functioning was difficult. The code we received came with 9 different attacks and minimal documentation or guidance. In order to understand what each attack did, we had to read through the code line by line. Some of the attacks seemed to be implemented in ways completely unhelpful to the goals of the paper, and some, though we thought we understood what they did, behaved in ways we didn’t expect. When we finally decided to write our own implementation of the pseudocode given in the paper, we found that it was incomplete in a number of ways (e.g. no indication of where to sleep the process).
Once we got the first attack up and running, we ran into a few more issues. There were two bugs with the checksumming code — one where a primitive was being used in a way inconsistent with its actual size, and one where the IP of the sending interface was incorrect) which meant that the packets we were sending had incorrect checksums and so were being ignored. Once packets started being sent, we noticed that RST packets were immediately being sent. This, it turns out is because even though we’re using our own code to parse the TCP packets, they also get sent to the Linux kernel, which doesn’t recognize the flow and sends a RST. We worked around this by setting up a rule within iptables to have the kernel not send packets to the port we were using.
Lastly, in measuring the data, we intended to use the Linux module tcp_probe, so we could automate data collection and use. However, tcpprobe would only show one side of the connection and so it only had half of the relevant data. The reasons behind this are still unclear. As a result, we had to switch to using Wireshark, which breaks the automation and makes data analysis much more tedious.
We developed our experiment with Mininet simulating our topology, and the attacks running on top of that. Mininet is very flexible in the topologies it allows, and makes it easy to run arbitrary network processes between simulated hosts. We used PCAP, Wireshark, and tcp_probe to capture packets for inspection. Experiments were run on Ubuntu 12.10 running the Linux 3.5.0-24-generic kernel.
All parts of the experiment should be reproducible, and as long as you are using a modern Linux kernel, everything should work correctly. We did find some interesting and unexplainable behaviors though — for example, when run by itself, the lazy opt-ack file download shows a congestion-window running up to about 800, but when the same attack is run through a script, the cwnd only reaches 180 (still well above wget, but puzzling nonetheless). We believe these may affect reproducibility, but not in a way that would change the general results we observed.
 Scott Shenker. Making greed work in networks: A game-theoretic analysis of switch service disciplines. In SIGCOMM ’94, pages 47–57, August 1994.
 Sally Floyd and Kevin Fall. Promoting the use of end-to-end congestion control in the Internet. IEEE/ACM Transactions on Networking, August 1999
 Reza Rejaie, Mark Handley, and Deborah Estrin. RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the Internet. In INFOCOM ’99, March 1999.
 S. Savage, N. Cardwell, D. Wetherall, and T. Anderson. TCP Congestion Control with a Misbehaving Receiver. Computer Communication Review, 29(5), 1999.
 R. Sherwood, B. Bhattacharjee, and R. Braud. Misbehaving TCP Receivers can Cause Internet-wide Congestion Collapse. 12th ACM Conference on Computer and Communications Security (CCS), 2005