CS244 ’17: TCP Congestion Control with a Misbehaving Receiver


By Matthew Denton and Pavan Mehrotra

Goals

Savage et. al. “TCP Congestion Control with a Misbehaving Receiver” describes three vulnerabilities in TCP that can be exploited to force faster segment sending than would normally be permitted, and proposes solutions to the attacks. All of the attacks artificially inflate the window size during sending. The first vulnerability involves splitting an ACK for a single packet into multiple ACKs in order to inflate the congestion window, as each ACK is interpreted as pertaining to a separate packet. Specifically, the vulnerability stems from the discrepancy between byte-granularity ACKing and packet-granularity congestion control. The second vulnerability involves sending duplicate ACK segments in response to a single packet, causing the TCP congestion control algorithm to enter Fast Retransmit mode, in which each ACK is assumed to indicate that data is leaving the network, despite the duplicate ACK numbers. So, each ACK inflates the window and an attacker can inflate the window to an arbitrarily large size by sending a large stream of duplicate ACKs. The third attack described involves sending “optimistic” ACKs in advance of the data actually being sent. Since the ACKs arrive early, data is sent faster than the congestion control algorithm would have normally allowed. These attacks are represented in the figures below, taken from the paper [1]. The figure numbers indicate the attacks in the order presented.

Screen Shot 2017-06-01 at 8.34.24 PM

Screen Shot 2017-06-01 at 8.34.35 PM

The paper provides potential solutions to each of these attacks. For the first attack, it proposes counting by bytes instead of packets or altering TCP implementations to verify that an ACK covers a full data segment before expanding the congestion window. The remedy for the second attack is to add a nonce to the TCP header, which is subsequently used to match the ACK to a specific sent packet. The attack can be mitigated without modification to the TCP header by maintaining a counter for the number of duplicate ACKs received and ignoring duplicates after a certain window size. The proposed solution to optimistic ACKs is to use a cumulative nonce, where nonces in packets are added together and maintained on both sides, and is sent back in the ACK. Out of order segments are dealt with by echoing the nonce received. Furthermore, an RST to an ACK for a packet that has not been received can help mitigate the attack.

Motivation

The ability to gain an advantage over a TCP congestion control algorithm is important in order to maintain fairness on the Internet. Furthermore, the fixes would prevent attackers from launching denial of service attacks, where inflated windows can clog bottleneck links. Since the given attacks stem from RFC specifications, it is even more important to fix these issues in order to have a more robust Internet infrastructure.

Results

The authors were able to demonstrate all three vulnerabilities when making the request from a modified version of the Linux 2.2.10 TCP stack to retrieve the index.html page from http://www.cnn.com, with an RTT around 70ms. Specifically, they were able to show substantially faster download rates for the page using all three attacks. The graphs generated are reproduced below [1]. Figure 4 corresponds to attack 1, Figure 5 to attack 2, and Figure 6 to attack 3. The graphs are fairly hard to read, since the symbols used are quite difficult to distinguish, but the most important difference is the the near vertical set of points verses the set that looks exponential. These simply show faster download speeds for TCP Daytona.

Screen Shot 2017-06-01 at 8.37.00 PM

Screen Shot 2017-06-01 at 8.37.09 PM

Our Goals and Motivation

Our original goal was to implement figures 4, 5, and 6 in an old, vulnerable kernel. However, we quickly realized this would not be possible since modern VMs do not run old OS easily, and there is no documentation for the systems. We then thought we would replicate the spirit of the paper, which was to find if these three attacks are possible in a modern implementation of an IP/TCP stack. We felt this would both be more informative and more relevant than simply showing an attack works on a platform where it is specifically designed to work. Therefore, we did not attempt to reintroduce the attacks in a user or kernel level implementation, but rather attempted the attacks in a modern stack. Since the paper was written almost 20 years ago, we thought the most interesting experiment would be one where the vulnerabilities were a side-effect of the RFC or implementation. Therefore, we attempted to test it on a modern user level stack, MTCP, but when we could not get this to work properly, tested it on a modern Linux kernel stack with two different congestion control algorithms: the original Reno, and the newer Cubic.

Our Results

We set up a Mininet topology with two hosts connected to one switch. The links are each given a delay of 32ms and a link speed of 100Mbps, as in the original paper, with a packet queue of 128 packets on each link.

We set the Linux congestion control algorithm per experiment.

On the server, we run a simple Python web server, serving a single HTML web page of size ~175kB from a major new source.

On the client, we run the attackers (the “misbehaving receivers,” written in Scapy) one by one, as well as a normal receiver written in Scapy.

Below, we plotted the sequence numbers of the data received for each attacker, compared to a normal TCP receiver. The original paper’s graphs also include the ACK numbers of the receiver’s ACKs, but we exclude these as they provide no useful information and only serve to complicate the graph.

Screen Shot 2017-06-03 at 2.40.37 PM

Screen Shot 2017-06-03 at 2.40.14 PM

As the graphs show, the Split ACKs and Duplicate ACKs no longer work on Linux, even with TCP Reno. However, Optimistic ACKs still allows a receiver to manipulate the server into sending packets much faster than is normal.

We believe that the Split ACKs attack is slower than normal TCP because the split ACKs clog up the ACKing pipeline and delay the reception of the full ACK, causing the window to slide slower.

We believe the Duplicate ACK graphs include an extra line on the bottom due to retransmission, and interestingly, the transmissions initially go faster for both Reno and Cubic, but level off and finish much more slowly. This is unexpected–we are not sure if the attack succeeds or not. Our hypothesis is that the link become saturated with ACKs and we begin to drop the extra duplicate ACKs, and the attack stops working. However, it may just be a strange side effect of the fast recovery mechanism. In any case, we believe in the modern Internet, with modern Internet bandwidths, that sending enough duplicate ACKs to inflate the window may be difficult, as the congestion window is halved in Reno and we have to make up this deficit with duplicate ACKs before even inflating the congestion window to higher levels.

Finally, the Optimistic ACK graphs show that the attack works on both Reno and Cubic, as the web page downloads finish much quicker using the Optimistic ACK attacker than the normal TCP receiver.

Challenges

Our first idea was to try to obtain an old copy of Linux/BSD that was vulnerable to the attacks (any OS listed in Table 1 of the paper) [1]. After hours of searching and attempting to run them in multiple different VM programs (Parallels, VirtualBox, and VMware Player), we abandoned this idea for three reasons. First, most VMs would not boot the old versions of Linux and BSD we found, which we thought was fairly ironic since this is one of the reasons for the existence of VMs. The second was that we realized Mininet and Scapy would most likely not run on this platform, as Python was still in its infancy at the time. This would mean we would need to run the experiment in a physical topology instead of a virtual one. Lastly, it was difficult to find any documentation on these old systems, which would have made implementing any sort of system very difficult.

We then tried to use mTCP, a scalable user-level network stack for multicore systems [2].  Kernel-bypass TCP stacks are becoming very popular for cloud services and CDNs due to their flexibility and high performance, and therefore demonstrating the vulnerabilities in a very new TCP stack would show that the weakness of the RFCs could continue to lead to vulnerable implementations, which would be a problem given the current proliferation of new TCP stacks. Looking at the mTCP source code, it seemed that it was vulnerable. Unfortunately, we again had difficulty installing the program, due to the complex instructions and many kernel modules needed for kernel bypass. This needs to be made simpler. We tried both PSIO and DPDK installs on supposedly supported kernels and different Linux distributions (RHEL6 and Ubuntu 12.04 and 14.04), but neither of these allowed us to set up an mTCP server. The README itself states only specific CPU/NIC configurations have been tested, so we believe the problem was either rooted in the lack of access to the specific hardware on which it was know to work, or the fact that the hardware itself was virtual. We were also forced to use a local VM instead of a Google Cloud instance, since it replaced the kernel stack with a module that prevented SSH from working. Therefore, we had to abandon mTCP for another network stack.

The paper itself did not describe the full network topology, so we had to guess at a reasonable, semi-realistic implementation to use for Mininet.

Critique

This paper was written in 1999, during the infancy of the Internet, so it follows that the attacks should not be valid today. However, it was interesting to see that the last attack was still possible in a modern kernel TCP implementation. Furthermore, it was interesting to see that the last attack seemed to have been mitigated by accident with the switch to CUBIC, which does not expand its window size only based on ACKs, and not through a deliberate defense.

We believe the reason the first two attacks are not obviously the same is because they have been fixed, as the paper seems to indicate had already happened by its publishing on some systems, but we would have to check the source code to be sure. We originally wanted to try MTCP, which, from reading the source code, looked vulnerable to all the attacks. Unfortunately, we could not get it working, so this has been left to future work.

Also, the paper’s OptACK attack is very rudimentary. It simply sends an ACK for a new maximum sized segment every 10ms. However, this is experimentally tuned to the specific RTT, bottleneck link speed, and toplogy. It remains to be seen whether this can be predicted and taken advantage of automatically. In addition, it remains to be seen whether OptACKs can cause congestion control collapse when sending Optimistic ACKs over the same bottleneck link.

As the paper is very old, none of these attacks should be possible to any degree. It is quite troublesome that we are able to implement any of the three attacks almost 20 years after the attacks were originally discovered. We believe this shows a real problem with TCP as a protocol, due to its lack of focus on security and inflexibility.

Platform

For our final platform, we use Scapy, a Python-based library that uses raw sockets to allow layer 2 through layer 4 packet manipulation, to implement the attacks [3]. We chose this because it was extremely powerful and allowed us to implement the each attack in under 100 lines of code, most of which was common between the three attacks. To create the topology, we used Mininet, a Python-based network emulator that creates a realistic virtual network, since it allows relatively easy topology setup and reduced variability in our topology [4]. Furthermore, it allows us to create hosts easily without spinning up multiple VMs and connecting them through some physical topology. This was a much simpler setup than the original paper, which used a real network. Ours should be more reproducible, as it should mostly depend on scheduler decisions, but we have seen some variations between different runs. We are unsure as to the specific cause, but believe the problem is with using a Google Cloud VM, which has less reliable scheduling guarantees. While we most likely will not receive the exact same results due to the difference between a virtual and physical topology, we felt that this accurately reproduced the spirit of the paper’s attacks, which was potentially faster downloads over TCP with a misbehaving receiver.

README

Provision an n1-standard-2 (2 vCPUs, 7.5 GB memory) Google Cloud instance running Ubuntu 16.04. Allow HTTP and HTTPS traffic.

gcloudimage

SSH in and install git if needed:

sudo apt-get update
sudo apt-get install git

clone repo and cd into the directory:

git clone https://github.com/pavmeh/cs244_TCPDaytona/
cd cs244_TCPDaytona

Run ./setup.sh to install all the dependencies. Press Y if prompted (Note: DO NOT run this with sudo, since this will not install the dependencies correctly). If this breaks, run the commands in the script individually in the terminal. This needs to be run only once per VM instance.

Run ./run_experiments.sh to run the experiment. It will take approximately 1 minute to run. An http server is automatically create that can be used to view the data files created (.png) by using the external IP shown on the Google Cloud Platform Manager for the VM instance. The URL used in your local browser should be of the form:

http:///

The png files available should be named data.reno.png and data.cubic.png.

From here, simply type in the name of the png file you wish to view after the slash, as shown above. Kill the server with ctrl+C when done.

If you wish to run the experiment again, run ./cleanup.sh first. This deletes the previous data files.

Acknowledgements

Scapy code was pieced together from StackOverflow posts. We also directly used some the code from assignment 1, BufferBloat. Specifically, the Mininet code was based on this assignment, while the sever and delivered web-page were taken directly from the assignment.

References

[1] Stefan Savage, Neal Cardwell, David Wetherall, and Thomas Anderson. TCP Congestion Control with a Misbehaving Receiver. ACM SIGCOMM Computer Communications Review, vol. 29, no. 5, October 1999, pages 71 – 78.

[2] EunYoung Jeong, Shinae Woo, Muhammad Jamshed, Haewon Jeong, Sunghwan Ihm, Dongsu Han, and KyoungSoo Park. mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems.USENIX Symposium on Networked Systems Design and Implementation. 2014, pages 489-502.

[3] Scapy, http://www.secdev.org/projects/scapy/

[4] Mininet, http://mininet.org/

Advertisements

One response to “CS244 ’17: TCP Congestion Control with a Misbehaving Receiver

  1. 5 – Code ran to completion and results matched very well with the blog post

    The code printed a few errors when running, but the process didn’t stop and eventually produced the results that match well with the blog post.

    I got the following errors:
    1. *** Error: RTNETLINK answers: No such file or directory
    It seems to be something with Ubuntu 16

    2. IOError: [Errno 2] No such file or directory: ‘dupACK.vegas.npy’
    I guess TCP-vegas is not tested in the end.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s