CS244’15: Denial Of Service using TCP’s RTO.


Conor Eby (conoreby’AT’stanford.edu) and Alfred Xue (axue’AT’stanford.edu)

Introduction:

A Denial of Service (DoS) attack harms a victim by preventing a flow of communication that a legitimate process needs in order to function. There are many ways this can be done. The best-known way is simply to flood a victim with packets, so that it is unable to process the packets of the legitimate process, or at least delays the processing of those packets so significantly that the victim is effectively unable to communicate. Other methods include impersonating a DNS server so that the victim is unable to determine the location of the process it wants to communicate with.

Low-Rate TCP-Targeted Denial of Service Attacks (The Shrew vs. the Mice and Elephants) is a paper that describes a denial of service attack that takes a different approach: it exploits TCP’s retransmission timeout (RTO). The RTO is TCP’s response to severe congestion. Unlike standard AIMD, which only reduces the sending rate, a TCP flow whose packets go unacknowledged for too long backs off and stops sending altogether for a period of time that grows exponentially with each successive timeout.

The low-rate TCP-targeted DoS attack operates by flooding the link for long enough to cause a timeout, then backing off as well, and then flooding the link again once the RTO period has expired. This keeps the targeted flows constantly in a state of timeout. Precise timing isn’t necessary for the attack to work; even if a burst is late, it will still cause a timeout once it hits. If it is early, on the other hand, the TCP flow will be able to run until the next burst. The key observation of the original authors was that there are two null points, at RTO/2 and RTO, where the attack effectively reduces throughput to zero, because a burst is always in progress whenever the victim attempts to retransmit. This result is particularly significant because a shrew attack is much more difficult to detect than a standard DoS attack. If this vulnerability is observed in practice, then either better DoS protection must be developed or changes must be made to TCP.
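The timing argument can be checked with a small sketch. It assumes the flow’s first timeout coincides with the start of a burst, that the timeout starts at a hypothetical `min_rto` of 1 second and doubles after every failed retransmission, and that each burst occupies the first `burst` seconds of an attack period. The function names and default values are illustrative and are not taken from the paper or from the experiment scripts.

```python
def retransmit_times(min_rto=1.0, attempts=5):
    """Times (seconds after the first loss) at which a timed-out flow retransmits,
    assuming the timeout starts at min_rto and doubles after every failure."""
    times, now, rto = [], 0.0, min_rto
    for _ in range(attempts):
        now += rto
        times.append(now)
        rto *= 2  # exponential backoff
    return times


def hits_burst(t, period, burst):
    """True if time t falls inside a burst; bursts cover [k*period, k*period + burst)."""
    return (t % period) < burst


def first_successful_retransmit(period, burst=0.05, min_rto=1.0, attempts=5):
    """Return the first retransmission time that avoids every burst, or None."""
    for t in retransmit_times(min_rto, attempts):
        if not hits_burst(t, period, burst):
            return t
    return None


if __name__ == "__main__":
    for period in (0.5, 1.0, 1.5):
        print(period, first_successful_retransmit(period))
```

Under these assumptions, attack periods of `min_rto` and `min_rto / 2` leave no retransmission that escapes a burst, while a period of 1.5 seconds lets the first retransmission through.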

The attack operates by sending traffic at the maximum possible rate for as long as it takes to fill the buffer of the attacked link, and then continuing to send at link capacity so that the link stays saturated for the rest of the burst. The paper goes on to discuss mechanisms for detecting this attack, but our work focuses on the attack itself.
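A short calculation shows why such an attack counts as low-rate. The sketch below assumes an on/off attacker that sends at the full link rate during each burst and is silent otherwise, and it measures the period from the start of one burst to the start of the next. The 1.5 Mb/s link capacity is a hypothetical value chosen only for illustration.

```python
def average_attack_rate(burst_rate_bps, burst_s, period_s):
    """Long-run average sending rate of an on/off attacker, in bits per second."""
    if period_s < burst_s:
        raise ValueError("period must be at least as long as the burst")
    return burst_rate_bps * burst_s / period_s


LINK_BPS = 1_500_000  # hypothetical bottleneck capacity, not a measured value

for burst_ms in (30, 50, 70, 90):
    avg = average_attack_rate(LINK_BPS, burst_ms / 1000, 1.0)
    print(f"{burst_ms} ms burst, 1 s period: {avg / LINK_BPS:.0%} of link capacity")
```

With a one-second period, the bursts occupy only a few percent of the time, so the attacker’s average rate is a small fraction of the link capacity even though each burst saturates the link.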

Subset Goal:

In the paper, the authors run simulations of a DoS attack with resting periods between 0 and 5 seconds, with burst periods between 30 and 90 ms. Their results are found below.

[Figure: simulation results from the original paper]

We attempt to replicate these results on Amazon AWS using Mininet. We think these results are particularly important because they describe a possible vulnerability in TCP. The remainder of the paper focuses on methods of reducing the impact of this vulnerability, but it is important to verify its existence before focusing effort on reducing it.

We decided to reproduce the attack with burst lengths of 30 ms, 50 ms, 70 ms, and 90 ms, as in the paper. However, due to constraints of our system, we were unable to immediately produce results for New Reno, Tahoe, and SACK (switching among them with sysctl produced little variation in our results). Given that the original ns-2 code for the paper is available online, we felt that our time would be better spent exploring the effectiveness of this targeted Denial of Service attack on modern TCP algorithms, such as Cubic and Vegas. We have included Reno to verify the results of the paper.
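For reference, on Linux the default congestion control algorithm is selected through the `net.ipv4.tcp_congestion_control` sysctl key. The sketch below only builds and optionally runs that command; the helper names are hypothetical, and it is not a copy of the experiment script.

```python
import subprocess

SYSCTL_KEY = "net.ipv4.tcp_congestion_control"


def congestion_control_cmd(algorithm):
    """Build the sysctl command that selects the default TCP congestion control."""
    return ["sysctl", "-w", f"{SYSCTL_KEY}={algorithm}"]


def set_congestion_control(algorithm):
    """Apply the setting. Requires root, and the algorithm must be available
    in the running kernel (for example as a loaded module)."""
    subprocess.run(congestion_control_cmd(algorithm), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for algo in ("reno", "cubic", "vegas"):
        print(" ".join(congestion_control_cmd(algo)))
```

The algorithms the kernel currently offers can be listed by reading `net.ipv4.tcp_available_congestion_control`.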

Results:

[Figure: throughput of the TCP Reno flow with 0.15 s bursts]

[Figure: throughput versus attack interval for Reno, Cubic, and Vegas at burst lengths of 30, 50, 70, and 90 ms]

The y-axis shows the throughput observed by bwm-ng, and the x-axis shows the length of time between attacks.

We found that our results matched the general trend of the paper’s results for the Reno flow. Because the Reno implementation in Linux differs from the one in the paper, and because of the many other differences in environment, our graphs do not replicate the paper’s graphs very closely, even for the ‘Reno’ flow. They do, however, take the general shape of the graphs in the original paper, with throughput reaching around 0.8% with five-second intervals. More importantly, the key result of the paper, that there are two null throughput points when the wait period is set to RTO/2 and RTO, is clearly demonstrated by our results.
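The overall shape of such curves can be anticipated with a rough idealized model. The sketch below is a simplification for illustration, not the paper’s analysis and not our measurement code. It assumes that every burst hitting an active flow forces a timeout, that the flow retransmits exactly `min_rto` seconds later (a hypothetical 1 second here), and that the flow then uses the whole link until the next burst arrives.

```python
import math


def model_throughput(period, min_rto=1.0):
    """Fraction of link capacity the victim obtains under the idealized model."""
    if period <= 0:
        raise ValueError("period must be positive")
    # The flow resumes at min_rto and is cut off again by the first burst
    # at or after that time, which arrives at a multiple of the period.
    cycle = math.ceil(min_rto / period) * period
    return (cycle - min_rto) / cycle


if __name__ == "__main__":
    for period in (0.5, 1.0, 2.0, 5.0):
        print(f"period {period} s -> {model_throughput(period):.2f}")
```

In this model the throughput is zero whenever the period divides `min_rto` evenly, and it climbs back toward full capacity as the period grows beyond `min_rto`.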

Challenges:

Most of our struggles came from our version of TCP’s RTO not behaving as expected. The initial TCP settings produced an RTO that exceeded 1 second, which is why our initial tests did not reproduce the null points that were critical to the results of the paper. By manually changing the RTO we were able to create these null points in the graph. Another challenge was that the TCP congestion control algorithms available in Linux are not the same as the ones tested in the paper. Even the Linux implementation of Reno is not the same as the one in the paper, and is instead a mix of all of the TCP congestion control algorithms that were tested there. Certain key parts of the simulation were not described in detail in the paper, and had to be gleaned from PowerPoint slides about the paper and from the original simulation code.

Critique: 

Our results were very similar to the authors’ results, and we feel that they support the authors’ claims. However, we feel that the issue explored in this paper, while an interesting theoretical problem, occurs very little in the real world. The primary reason is that the attack depends on knowing both the RTO and the bottleneck server, and on the existence of a long TCP flow. The results of the paper indicate that if the time interval is not set to RTO or RTO/2, the user is still able to achieve some margin of throughput. Furthermore, most TCP flows are short in practice, and independent TCP flows aren’t affected by past flows entering RTO.

Extensions:

As mentioned in the results section, we extended the evaluation of the shrew attack to modern TCP congestion control algorithms, including Cubic and Vegas. One key result from the paper is that Reno is particularly fragile compared to other types of congestion control. Our data shows that this is the case, since the Reno flow is blocked completely by the DoS attack more often than any of the other algorithms. However, we note that even modern algorithms suffer from this attack. We can see this clearly in the 90 ms graph, where all three congestion control types are blocked completely by a DoS attack with some period. Evaluating TCP Vegas under this attack yields interesting additional findings. Since the shrew attack is based on timeout intervals, TCP Vegas is particularly interesting to study because, unlike Reno or Cubic, it does not rely on packet drops to regulate congestion; instead, it attempts to anticipate congestion by measuring changes in round-trip time. This means that, as we can see, TCP Vegas is often affected earlier than the other types of congestion control, since it backs off before there is a drop. In the 50 ms burst graph, we can see that this allows it to regulate its congestion properly and avoid being blocked by any period of the DoS attack.
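The delay-based behavior of Vegas can be illustrated with a minimal sketch of its window adjustment rule. This is a textbook-style simplification, not the Linux implementation; the function name and the `alpha` and `beta` thresholds (in packets) are illustrative assumptions.

```python
def vegas_adjust(cwnd, base_rtt, rtt, alpha=2, beta=4):
    """Return the next congestion window (in packets) for one Vegas-style update.

    base_rtt is the smallest round-trip time seen so far; rtt is the latest sample.
    """
    expected = cwnd / base_rtt            # rate if nothing were queued
    actual = cwnd / rtt                   # rate actually achieved
    queued = (expected - actual) * base_rtt  # estimated packets sitting in queues
    if queued < alpha:
        return cwnd + 1                   # path looks underused: grow
    if queued > beta:
        return max(cwnd - 1, 2)           # queue is building: shrink before a drop
    return cwnd                           # within the target band: hold


if __name__ == "__main__":
    print(vegas_adjust(20, base_rtt=0.1, rtt=0.1))  # no queueing delay
    print(vegas_adjust(20, base_rtt=0.1, rtt=0.2))  # RTT has doubled
```

Because the decision depends only on round-trip time samples, a burst that inflates queueing delay makes a rule like this shrink the window before any packet is lost, which is consistent with Vegas reacting earlier than the loss-based algorithms.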

Platform:

We elected to use Mininet to replicate the results for two primary reasons. The first is that we are most familiar with Mininet, and felt most confident working with it. The second is that, since the original authors used ns-2, using a different platform would help verify that the results aren’t an artifact of some strange feature of the simulator.

Replication:

Instructions for replicating our results on Amazon AWS can be found in our GitHub repo, which is located at

https://github.com/Conoreby/CS244-shrew-attack.git.

If one wants to replicate the results locally, they should

1. Clone the repo onto their local machine (we still recommend using a VM dedicated for mininet).

2. Install mininet — instructions can be found here.

3. Install Bandwidth Monitor NG: sudo apt-get install bwm-ng.

4. cd into the repo and run the script with admin privileges.


One response to “CS244’15: Denial Of Service using TCP’s RTO.”

  1. Peer Evaluation from Angela Gong and Alice Yeh
    Evaluation: 5/5

    It’s great that you guys performed a modern take on the extension by including TCP Vegas and Cubic and talked about how TCP Vegas would exhibit an earlier drop-off compared to the other congestion control algorithms. Given the focus of the paper is on the back-off portion of the plots, it would have been a bit easier, however, to compare among the different lines if you were able to zoom in on the initial 0.5-2 seconds portion.

    Sensitivity analysis:
    The first plot given, with burst = 0.15sec for TCP Reno only, shows that there’s an initial back-off around the ~1sec mark. The graph has roughly the same shape as the other burst = 0.03/0.05/0.07/0.09 plots. That would suggest that TCP Reno is not sensitive to the burst duration.
