Authors: Bryce Taylor (btaylor @ stanford) and Nick Yannacone (nicky1 @ stanford)
Denial of service attacks prevent a target computer from responding quickly, or in some cases responding at all, to its legitimate users’ traffic. These attacks can be quite versatile in their aims — they can prevent a competitor from serving its customers, deny information to people who seek it, or slow down infrastructure. However, they’re also usually easy to detect, since they often require massive bandwidth to drown out legitimate traffic. Kuzmanovic and Knightly’s shrew attack, which we chose to replicate, achieves similar results with a much subtler approach.
The attack alternates between periods of silence and periods of heavy sending to an overcommitted bottleneck link. Floods occur at intervals that synchronize with the minimum retransmission timeout (RTO) of established connections over the bottleneck link, which is constant in most implementations of TCP. These periodic floods trigger TCP timeouts on the legitimate connections, thereby lowering the throughput of each affected connection. However, since the attack sends heavy traffic only in brief bursts, it achieves a low average throughput, making it look like other legitimate but bursty traffic rather than an obvious attack.
Kuzmanovic and Knightly ran a number of experiments in their paper describing Shrew, mostly on a simulated ns-2 network. They found that Shrew works quite well in idealized situations: if it’s competing with a single TCP flow, it drives the throughput of that flow to 0, while if it’s competing with multiple TCP flows of the same RTT, it drives their collective throughput to 10% of normal. In networks with heterogeneous RTTs, Shrew is only really effective against low-RTT flows. When attacking simultaneous HTTP downloads, Shrew does more damage to larger file downloads, but also shows more variability over large downloads (some even run faster with Shrew than without). Based on tests run over the wider internet, Shrew works about as well in practice as in simulation; on average, it brought the throughput of affected flows down to 10% of their normal rate. Unsurprisingly, the attack is more effective with longer or higher-rate bursts; however, it is presumably also more detectable the more traffic it sends.
For our part, we chose to replicate one of the simplest yet most significant experiments in the paper: a test of Shrew’s performance against a single, low-RTT TCP flow. This experiment is the only situation in which the authors claim that Shrew can actually force a flow’s throughput to 0, rather than merely reducing it. It’s also a proof of concept of Shrew’s effectiveness in general, since if Shrew isn’t effective when it only needs to block a single flow, it’s unlikely to work in the internet, where conditions are much more varied.
In our implementation, the experiment works as follows: a webserver and an attacker are both connected to one end of a 1.5 Mbps bottleneck link, while a client of the webserver and a friend of the attacker are connected to the other. All parties have a propagation delay of 6 ms, and all queues in the network have a capacity of 15 packets. The client attempts to download a file from the server via HTTP on top of TCP Reno. Meanwhile, the attacker periodically sends 150-ms bursts of packets. During these bursts, the attacker sends at 1.5 Mbps; otherwise, the attacker sends nothing. We measure the effective throughput achieved by the client for different periods of the attack.
The corresponding graph in the Shrew paper is:
This graph compares the authors’ mathematical model to their actual experimental results. Even experimentally, they were able to drive the client’s throughput to 0, which is a very good sign for Shrew’s effectiveness as a DOS attack!
Note: we include error bars on our figures, but unless a data point has at least 5 (preferably 10) trials, we do not feel we have a good estimate of the error; the bars are meant as a rough indication only.
We were never successful in recreating the attack using our Mahimahi setup, despite spending the majority of our time on this part. In every case, whether with small or large file downloads, the resulting graph looks like proportional throughput loss rather than an actual DoS attack. To reduce script runtime, we include a small representative graph of these results:
Note that there is a small dip around 1000 ms, but this is nothing like what the authors described (in particular, there is no drop around 500 ms, and the relative throughput of over 0.6 is far too high to indicate the attack working correctly). The small error bars in particular lead us to believe that the attack is not working. For the sake of completeness, we include a quick run for a larger file download (with limited trials, in the interest of time) showing minimal indication of the attack succeeding.
After giving up on Mahimahi, we rapidly got much more promising results in Mininet. Below is our initial recreation of Figure 4, where each data point is the average of 5 runs. Since the authors presented mean throughput, we plot the mean as well, with error bars showing one standard error of the mean (roughly a 68% confidence interval; double the bar sizes for 95% confidence). We truncate the graph to reduce script runtime, as it is not interesting beyond 3 seconds.
Unlike the paper, in our experiments the primary sign of the attack is large variability in throughput (see the 1000-1500 ms region). The uncertainty there was so high that we cannot draw any shape for this region with reasonable confidence. Most likely, this means the attack sometimes hits the right timing and works, and other times does not (particularly relevant because the file was small: only 8 Mbit). As the error bars indicate, the attack is as effective as the authors say, but only sometimes. The authors do not mention this variance (which we believe is not mere run-to-run noise), so we suspect their setup launches the attack at a more consistent time relative to the start of the download. This data is not strong enough to fully support the authors' claim that the attack reduces throughput to near 0 near the RTO.
We zoomed in on the 1000-1500 ms region, running 10 trials per data point with the rest of the setup unchanged, to see if we could get more definition (shown to the right). With this resolution, we see more of the shape of the attack (a modest dip slightly above 1 second), though again the error is too high to be confident in our results. We believe a large part of the reason we do not match the experiment is that the file size was too small, so the attack had too few chances to work during the download. Unfortunately, we switched to Mininet too late to produce good graphs for large file downloads: the script takes several hours to run properly, since with a base download time of only 30 seconds, each data point can take at least 20 minutes due to the slowdown and the repetition required for good results. We provide a small sample to show how much better the results are for a longer download (a 40 Mbit file):
These graphs give us confidence that the attack is actually happening, close to how the paper's authors described it (note how the relative throughput drops below 0.2), though the time between bursts is shifted slightly higher than in the paper.
Throughout this project, several distinct stumbling blocks prevented us from making more progress, mostly due to our inexperience with networks. We list them below, each with a brief description of what made it so troublesome. Since we were attempting to replicate another paper's results, at each point it was particularly difficult to tell whether we were stuck because the results were not replicable or because we had missed some key part of their setup.
- Creating square waves – we had to use UDP, since TCP and other higher-level connections will not naturally produce a precise square wave. Sending packets from the Linux command line, via nc, and via Python all proved too imprecise to control the form of the wave, so we eventually wrote the wave-generating code in C.
- Setting up the Mahimahi topology – due to the lack of examples and documentation of how Mahimahi works, it took us a very long time to understand what we were actually creating (with extra difficulty caused by incorrectly chained commands). During this time it was extremely difficult to determine whether we should expect the attack to work, or even whether our setup resembled the picture we had in our minds.
- Communicating the attack through NAT in Mahimahi – each layer of Mahimahi creates a NAT, which obscures the IP addresses of the inner layers and makes it difficult to figure out what topology is actually being created. In particular, tools like traceroute, ping, and ifconfig do not provide the same useful information they do when not surrounded by NATs. We resolved this by creating a friend for the attacker inside the NAT to open the attacking connection (note that this friend is not necessary in Mininet or other non-NAT-based topologies).
- Adjusting the TCP RTO – modern Unix implementations no longer use a standard minRTO of 1 s, as was the case when the paper was written. Unfortunately, there is also no single parameter that controls the RTO, so we had to figure out how to modify it per route with ip route (via its rto_min option).
From our experiments, it seems that the Shrew attack does work in practice (or at least in simulation), though perhaps not as well as Kuzmanovic and Knightly claim. We were able to drive the throughput of a single TCP flow substantially lower than normal using Shrew, albeit not to 0.
However, it’s not clear that this attack would be the best choice for someone hoping to deny service in practice. First, performing the attack successfully requires a moderately detailed understanding of the network, since it relies on knowing (or measuring) the minimum RTO used at a particular bottleneck link, as well as knowing the topology around that link. A committed attacker could likely discover both easily, but it presents a small hurdle. A bigger problem is that the attack doesn’t have as far-reaching an impact as an attacker might hope: the victim server isn’t brought down entirely, just slowed substantially. This will annoy the victim’s clients, but not prevent them from using the service altogether. Additionally, the attack only works well on flows that are long-lasting and have low RTTs, so its effect would be geographically localized and would spare sites without large files. Finally, the attack works better against TCP Reno (which we and the authors tested on) than against more modern congestion-control implementations.
Since we were not familiar with networking simulation tools, we did most of our work in the Ubuntu VM provided for Programming Assignment 1, on the assumption that it was set up reasonably well. Having a popular Unix-based operating system (Ubuntu) was essential, since we needed to produce low-level waveforms and modify how TCP behaves, and Ubuntu's popularity ensures rich documentation for such low-level tasks. However, local virtual machines running in VirtualBox do not have the most reliable performance, so we moved to EC2 (running a similar version of Ubuntu) after working out most of the project's kinks. EC2 is also more readily accessible than the class's specific PA1 VM, and it is particularly convenient because the tests take a while to complete and we can run several simultaneously in different instances.
For simulation software, as mentioned in our original Goal, we wished to reproduce the results in a new simulation environment to ensure there wasn’t something specific to Mininet or ns-2 that enabled the attack to work so well. We therefore chose Mahimahi as our primary simulation software, since our topology was simple enough that it seemed easy to emulate there (as it turned out, this was not the case).
Thoughts on lack of results in Mahimahi
The only difference we are aware of between our Mahimahi and Mininet setups is that the Mahimahi setup uses a single bottleneck link with two parties at each end, while the Mininet setup uses 5 links in total (one bottleneck, plus 4 more connecting the parties to it). This difference does not seem like it should matter, but we were unable to construct the multiple links feeding into the bottleneck in Mahimahi's outermost layer (e.g., for the attacker and server), so we could not use precisely the same topology. Another possibility we explored, though not fully before giving up on Mahimahi, is the difference between expressing queue size in bytes versus packets (we tried both 22500 bytes and 15 packets, but neither seemed to help); this is particularly relevant since the paper's attack uses 50-byte packets rather than full 1500-byte packets.
Final Statement / Conclusion
We conclude that there is a substantial difference between how Mahimahi and Mininet/ns-2 emulate networks, and that some part of this difference is critical to making the attack work. Either the paper did not include all the details relevant to setting up the attack (while Mininet/ns-2 happen to have the key settings by default), Mahimahi emulates differently in some way not apparent from its limited documentation, or we have a bug in our Mahimahi setup. A long file download is also key to making the attack work; short downloads are not as susceptible.
- Launch an Amazon EC2 Ubuntu instance using the instructions from PA2 (on Piazza). Make sure the username is ‘ubuntu’ (the default).
- In a terminal, run
- sudo apt-get update
- sudo apt-get install git
- git clone https://github.com/newtrat/shrew.git
- cd shrew
- chmod 776 run.sh
- sudo ./run.sh [Hit Enter and y when prompted near the start of the script.]
Note that this script will take a very long time (potentially up to 12 hours) to run.
If you don’t want to leave your SSH session running for that long, feel free to run the script inside tmux and detach the session:
- tmux
- sudo ./run.sh
- Ctrl-b d [detaches the session; run tmux attach to return to it]
The graphs we’ve used in this blog post are stored in data/name_of_experiment (copies are also stored in timestamped folders corresponding to when each experiment completed; the images inside those folders have meaningful names).
Also, if your EC2 instance seems slow (e.g., slow to download dependencies; the tests themselves take a while but print output periodically), create a new one and hope for better load balancing =D