Security threats pervade the modern Internet. Some, like an elephant, are loud, thundering, and impossible to ignore: for example, a horde of compromised machines launching a DDoS attack with hundreds of Gb/s of traffic. Others, like the tiny shrew, are smaller and much harder to detect, but still potentially destructive.
It is this second class of attacks that is the subject of Kuzmanovic and Knightly’s paper, “Low-Rate TCP-Targeted Denial of Service Attacks” . The authors demonstrate that, by sending a precisely timed square-wave pattern of packet bursts, an attacker can interfere with a TCP flow’s retransmissions on a bottleneck link. As they state, “At times of severe congestion in which multiple losses occur, TCP operates on longer timescales of Retransmission Timeout (RTO).” Using this knowledge, an attacker can send “periodic on-off ‘square-wave’ shrew attacks that consist of short, maliciously-chosen-duration bursts that repeat with a fixed, maliciously chosen, slow-timescale frequency”.
A key result of this paper is that, with precisely-timed attack traffic (i.e., a square wave whose period depends on the target flow’s minRTO value), an adversary can almost completely deny throughput to a TCP Reno flow that shares the same bottleneck link. An example of such an attack pattern is shown in Figure 3 from the paper.
The paper’s result is significant because these attacks are low-rate, making them hard to detect: the length l of the square wave is short (e.g., 150ms) and the magnitude is just the bottleneck link speed, C. Yet they are destructive, essentially denying any meaningful throughput to the target flow sharing the bottleneck link.
The goal of our project is to reproduce Figure 4 from Kuzmanovic and Knightly’s paper, which shows the effect of varying the attack period T (i.e., gap between the start times of subsequent packet bursts) on the throughput of a single target flow:
Motivation: We feel that this is the most important result in the paper, because (as shown above) the choice of the attack period T directly correlates with the effectiveness of the attack. For example, by using T = 0.5 second or T = 1.0 second between bursts (assuming the target uses TCP Reno with minRTO = 1 second), the attacker almost completely denies throughput to the target flow. On the other hand, if the attacker uses an attack period T that is just slightly less than 1 second, the target flow achieves close to 100% throughput (meaning the attack fails). Knowledge of these values might also be relevant to network administrators attempting to detect and prevent low-rate TCP DoS attacks.
Although the authors of “Low-Rate TCP-Targeted Denial of Service Attacks” used the ns2 network simulator, we used Mininet to reproduce their simulation. Much of how we set up our experiment is informed by the following paragraph from the paper:
“Next, we perform a set of ns2 simulations to compare against the model. In these experiments, we again consider the scenario of Figure 2 but with a single TCP flow. The TCP Reno flow has minRTO = 1 second and satisfies conditions (C1) and (C2). More precisely, the propagation delay is 6 ms while the buffer size is set such that the round-trip time may vary from 12 ms to 132 ms. The link capacity is 1.5 Mb/s, while the DoS traffic is a square-wave stream with the peak rate 1.5 Mb/s and burst length 150 ms.”
We configured our network topology as follows in Mininet:
Some of the salient points of our experimental setup are as follows:
- Host alice sends a 2 MB TCP Reno flow to host bob. We use the nc (netcat) tool to send this flow; see run_sender.sh (run on alice) and run_receiver.sh (run on host bob). The minRTO of this flow is set to 1 second.
- Host mallory runs the square-wave attack (run_attacker.py), sending UDP traffic (small UDP packets back-to-back) in bursts with bob‘s IP as the destination.
- By using two separate switches (one shared by the “good” sender and the attacker, and one to which the receiver is connected), we ensure that the attacker and normal flows share the same bottleneck link.
- We measure average throughput as the total number of bits sent (2 MB = 16e6 bits) divided by the total time needed for alice to send the flow.
- The dos.py Python driver sets up the above Mininet topology and runs the normal TCP flow and attacker flows in subprocesses. The run.sh helper script (which must be run as root) repeatedly invokes dos.py, varying the period and burst duration of the attacker, then plots the results.
Our results are shown in the graph below. We ran the attacker with several different square-wave burst lengths; for each of these, we plot the normalized throughput as the period of the attack was between 0.5 and 2.0 seconds:
As shown above, we believe we have successfully reproduced Figure 4 of the original paper. Overall, our results look quite similar to the figure obtained by the authors. When the attacker (mallory) sends the square-wave UDP attack with period T = 0.5 or T = 1.0, the alice → bob TCP flow is almost completely denied throughput. For periods between 0.5 and 1.0, the flow obtains some percentage of the maximum possible bottleneck bandwidth — up to about 40%, as obtained in the paper. For periods longer than 1.0, the flow obtains gradually more throughput (up to 100%).
We also measured the alice → bob flow’s congestion window (cwnd) at the sender, by using tcp_probe while the different attacker’s square-wave periods were running. This provides further evidence that the attack is working by continuously resetting the sender’s congestion window:
As shown above, when the attacker generates the square wave with periods T = 0.5 and T = 1.0 (the yellow and red lines, respectively), the normal TCP flow almost never achieves more than a cwnd of about 5 to 10. The other attack periods are less effective and the maximum cwnd values are significantly larger. They also end earlier since the throughput is higher.
The original paper was fairly clear in its description of the attack, and the parameters needed for it to succeed (e.g. the TCP congestion control algorithm, burst and periods of the square wave, TCP options such as minRTO, etc). The attack is also conceptually simple to understand. This made it relatively straightforward to reproduce the result.
Our main challenges were twofold:
- Properly tuning Mininet link parameters (specifically, the bottleneck link’s RTT and queue size) to reproduce the simulation from the paper. We needed several rounds of experimentation to tune these parameters correctly.
- Ensuring that we had set the right TCP kernel parameters to use Reno with fast retransmit disabled. We found that it helpful (though probably not strictly necessary) to disable TCP’s fast retransmit options, as follows:
sysctl -q -w net.ipv4.tcp_sack=0
sysctl -q -w net.ipv4.tcp_dsack=0
sysctl -q -w net.ipv4.tcp_fack=0
The original paper used the ns2 simulator, and there was no code published, so we had to tweak these settings in order to arrive at a reasonable reproduction of the authors’ experiment. Once we did this, we found the attack to be easy to reproduce in Mininet.
Our experimental setup demonstrated the validity and effectiveness of the “Low-Rate TCP-Targeted Denial of Service Attacks” as proposed by Kuzmanovic and Knightly.
Although our results confirm those in the original paper, we feel that this attack is quite limited in its practical effectiveness. There are several reasons for this:
- The attack relies on very careful timing (though an attacker could, of course, run multiple UDP square waves in parallel; this might increase the likelihood that the attacker is detected by a network administrator).
- The attack requires the attacker and target flow to share the same bottleneck link. This makes it very difficult for an attacker somewhere on the public Internet to target an arbitrary flow.
- The attack depends heavily on the parameters of the network, such as the latency, bandwidth, and queue size of intermediate routers. Queue size is especially important because the UDP flood sent by the attacker must overflow the buffer at the bottleneck router, causing packet drops in the target flow.
Nevertheless, the attack remains interesting theoretically and may inform future designs of congestion control algorithms.
Steps to Reproduce
- Create a fresh Google Compute Engine virtual machine with Ubuntu 16.10, and set firewall rules to be able to access the HTTP server that will be started. We recommend that you use the following gcloud commands to create and connect to the instance. Otherwise, you should perform the corresponding instance creation and firewall setup in theGoogle Compute Engine web UI.
$ gcloud compute instances create --image ubuntu-1610-yakkety-v20170502 --image-project ubuntu-os-cloud --machine-type n1-highcpu-8 --zone us-central1-c --tags http-server,https-server cs244 $ gcloud compute firewall-rules create http --allow tcp:80 $ gcloud compute ssh cs244
- In your GCE instance, execute the following:
curl "https://cs.stanford.edu/people/rpropper/cs244/setup.sh" | /bin/bash
This script will install Python dependencies (e.g. matplotlib), check out and install Mininet. Note: We provide this setup script as a separate step from the VM image for better clarity and transparency. Feel free to inspect the script before running it 🙂
- In the GCE instance, clone our git repository:
git clone https://github.com/hotpxl/low-rate-tcp-targeted-dos-attacks.git
- Now, to run the experiment:
cd low-rate-tcp-targeted-dos-attacks sudo ./run.sh
Please be patient; a run takes between 1 to 2 hours.
- After the script runs, it will show you a URL where you can view the results. There should be two generated .png files in the root directory. One (named results–_rate.png) will show the normalized throughput, i.e., the primary figure we are trying to reproduce. The other shows the sender’s cwnd values, plotted over time, for various attack period values.
- Aleksandar Kuzmanovic and Edward W. Knightly. “Low-Rate TCP-Targeted Denial of Service Attacks (The Shrew vs. the Mice and Elephants)”.