CS244 ’17: Adaptive Congestion Control for Unpredictable Cellular Networks


by Zhiwen Zhang (zhiwen@stanford.edu) and Derin Dutz (dddutz@stanford.edu)

1. INTRODUCTION

1.1 GOALS

In Adaptive Congestion Control for Unpredictable Cellular Networks, Zaki et al. present Verus, an end-to-end congestion control protocol that uses delay measurements to react quickly to rapidly changing cellular networks.

1.2 MOTIVATION

Mobile networking continues to grow in importance. Almost half a billion mobile devices and connections were added in 2016, and mobile data traffic has grown 18-fold in the last 5 years [1]. However, this growing volume of cellular traffic relies on protocols that were not designed with mobility in mind. Because of the high capacity variability, self-inflicted queuing delays, stochastic packet losses, large bandwidth-delay products, and general unpredictability inherent to cellular networks [2], many TCP variants perform poorly on cellular links. This leads to high end-to-end delays and a poor quality of experience for mobile users.

1.3 RESULTS

Verus’ key insight is to continuously maintain a delay profile that represents the relationship between window size and packet delay. Verus uses this delay profile to adjust the current window size. Published in 2015, the paper claims that Verus outperforms Sprout and various TCP flavors in cellular channels. As can be seen in Figures 8 and 9 of the original paper (Figure 1), Verus has significantly lower delay than TCP Cubic and TCP Vegas and tends to have higher throughput than Sprout. In addition, Verus’ R parameter can be changed to trade off between throughput and delay.

fig1

Figure 1: Figures 8 and 9 from Adaptive Congestion Control for Unpredictable Cellular Networks
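
To make the idea of a delay profile concrete, here is a minimal Python sketch. It is not the Verus implementation: the class name, the smoothing weight, and the rule that the delay budget equals R times the smallest delay seen are all assumptions made for illustration. It only shows the general shape of the technique: record delay per window size, then pick the largest window whose estimated delay fits the budget.

```python
class DelayProfile:
    """Toy delay profile: window size (packets) -> smoothed delay (ms).

    Illustrative only; names and update rule are assumptions, not Verus code.
    """

    def __init__(self, alpha=0.25):
        self.alpha = alpha   # weight given to each new sample (assumed value)
        self.delay_ms = {}   # window size -> smoothed delay estimate

    def update(self, window, sample_ms):
        """Fold a new (window, delay) observation into the profile."""
        old = self.delay_ms.get(window)
        if old is None:
            self.delay_ms[window] = float(sample_ms)
        else:
            self.delay_ms[window] = (1 - self.alpha) * old + self.alpha * sample_ms

    def min_delay(self):
        """Smallest delay estimate currently stored in the profile."""
        return min(self.delay_ms.values())

    def window_for(self, target_ms):
        """Largest known window whose estimated delay is within target_ms."""
        if not self.delay_ms:
            return 1  # nothing learned yet: fall back to a single packet
        fitting = [w for w, d in self.delay_ms.items() if d <= target_ms]
        return max(fitting) if fitting else min(self.delay_ms)


profile = DelayProfile()
for window, delay in [(5, 40), (10, 60), (20, 120), (40, 300)]:
    profile.update(window, delay)

# Assumption of this sketch: the delay budget is R times the minimum delay.
for R in (2, 4, 8):
    print(R, profile.window_for(R * profile.min_delay()))
```

In this toy setup a larger R admits a larger window, which is the throughput-versus-delay trade-off described above.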

2. REPRODUCTION RESULTS

2.1 SUBSET GOAL

The high-level goal of our project is to investigate how Verus performs across several cellular patterns and how it compares to TCP Cubic in these setups. According to the original work, specifically Figure 10 of the paper (Figure 2), Verus with certain model parameters (R ratios) yields an order of magnitude lower delay than TCP Cubic and TCP New Reno. The original paper includes TCP New Reno in the comparison, but as can be seen in the figure below, TCP New Reno is almost identical to TCP Cubic in throughput and delay, with Cubic typically slightly better, so we decided to focus on comparing Verus (R = 2, 4, and 6) with TCP Cubic. Our goal is first to perform the comparison on the same traces that the authors used to generate the graphs below, and then to extend the reproduction to more recent and diverse traces.

fig2.png

Figure 2: Figure 10 from Adaptive Congestion Control for Unpredictable Cellular Networks

2.2 SUBSET MOTIVATION

We’re fascinated by mobile systems and are curious whether Verus’ claims hold up across varied movement patterns. We thought it would be especially interesting to also test Verus on novel flows. This would not only provide more real-world cellular data to test Verus on, but would also show how Verus fares on more modern cellular networks and in more varied environments. We hope to replicate the results of Figure 10 in the original paper using the original flows that generated the scatter plot, so we can check that our baseline matches theirs. We then hope to generate our own custom flows to see whether Verus still maintains an edge.

2.3 SUBSET RESULTS

We were able to successfully replicate the paper’s results. We emailed the authors of the paper and were able to acquire some of the original traces that were used to generate Figure 10. Despite using Mahimahi instead of the OPNET simulator that was originally used and a custom RED shared queue implementation, our results were very similar to the ones in the original paper. The results are not exactly the same, as there is some fluctuation each time the experiment is run, but they are quite close, leading us to the conclusion that Figure 10 is indeed accurately replicable. Here are the results:

Campus pedestrian

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   1.36                181
Verus (R=4)   1.66                273
Verus (R=6)   1.87                389
TCP Cubic     1.90                655

fig3.png

Slow driving within the city with signals

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   4.46                89
Verus (R=4)   4.59                205
Verus (R=6)   4.63                324
TCP Cubic     4.66                347

fig4

Fast driving on highway

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   3.90                86
Verus (R=4)   3.93                220
Verus (R=6)   3.96                366
TCP Cubic     3.97                335

fig5.png
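
To read the three tables above at a glance, the following Python sketch expresses each Verus result as a fraction of the TCP Cubic result on the same trace. The numbers are copied from the tables; the dictionary layout and the function name are our own illustrative choices.

```python
# (throughput in Mbps, delay in ms), copied from the tables above.
RESULTS = {
    "Campus pedestrian": {
        "Verus (R=2)": (1.36, 181), "Verus (R=4)": (1.66, 273),
        "Verus (R=6)": (1.87, 389), "TCP Cubic": (1.90, 655),
    },
    "Slow driving within the city with signals": {
        "Verus (R=2)": (4.46, 89), "Verus (R=4)": (4.59, 205),
        "Verus (R=6)": (4.63, 324), "TCP Cubic": (4.66, 347),
    },
    "Fast driving on highway": {
        "Verus (R=2)": (3.90, 86), "Verus (R=4)": (3.93, 220),
        "Verus (R=6)": (3.96, 366), "TCP Cubic": (3.97, 335),
    },
}


def relative_to_cubic(trace):
    """Return {protocol: (throughput ratio, delay ratio)} versus TCP Cubic."""
    cubic_tput, cubic_delay = RESULTS[trace]["TCP Cubic"]
    return {
        name: (tput / cubic_tput, delay / cubic_delay)
        for name, (tput, delay) in RESULTS[trace].items()
        if name != "TCP Cubic"
    }


for trace in RESULTS:
    print(trace)
    for name, (tput, delay) in sorted(relative_to_cubic(trace).items()):
        print("  %-12s throughput x%.2f  delay x%.2f" % (name, tput, delay))
```

A delay ratio below 1 means Verus had lower delay than Cubic on that trace, and a throughput ratio below 1 shows the throughput given up in exchange.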

2.4 EXTENSIONS

We really wanted to test Verus on novel flows to see how it fares in other environments and using more recent cellular networks. As per Keith Winstein’s recommendation, we tested Verus on AT&T and T-Mobile LTE traces from 2016 as well as on Verizon EVDO:

ATT-LTE-driving-2016.up

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   1.05                1741
Verus (R=4)   1.61                1967
Verus (R=6)   1.82                2107
TCP Cubic     1.88                2648


Verizon-EVDO-driving.up

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   0.70                453
Verus (R=4)   0.83                741
Verus (R=6)   0.87                900
TCP Cubic     0.88                2488

TMobile-LTE-driving.up

Protocol      Throughput (Mbps)   Delay (ms)
Verus (R=2)   6.86                453
Verus (R=4)   9.08                610
Verus (R=6)   9.33                801
TCP Cubic     8.85                277


On the 2016 AT&T trace and the Verizon EVDO trace, Verus had lower delay than Cubic at all three R values. Interestingly, however, Verus had significantly higher delay than Cubic on the T-Mobile uplink trace at all three R values. We therefore also ran the experiment on the T-Mobile downlink trace and found that Verus with R=6 has almost 3 times the delay of TCP Cubic. The T-Mobile traces are more regular and predictable, so our hypothesis is that Verus is better suited to traces that are more complicated, with more ups and downs in throughput.
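
One way to probe this hypothesis is to put a number on how "regular" a trace is. The Python sketch below is a suggestion rather than something we ran for this post: it bins a Mahimahi trace (one millisecond timestamp per line, each marking a delivery opportunity for one packet, assumed here to be 1500 bytes) into one-second windows and reports the coefficient of variation of the per-window throughput. The function names and the one-second bin are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def load_trace(path):
    """Read a Mahimahi trace file: one integer millisecond timestamp per line."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


def throughput_per_bin(timestamps_ms, bin_ms=1000, mtu_bytes=1500):
    """Throughput in Mbps for each consecutive bin of bin_ms milliseconds."""
    if not timestamps_ms:
        return []
    counts = Counter(t // bin_ms for t in timestamps_ms)
    n_bins = max(timestamps_ms) // bin_ms + 1
    bits_per_packet = mtu_bytes * 8
    # bits / (seconds * 1e6) == bits / (bin_ms * 1000)
    return [counts.get(b, 0) * bits_per_packet / (bin_ms * 1000.0)
            for b in range(n_bins)]


def coefficient_of_variation(values):
    """Standard deviation divided by mean; 0.0 means perfectly steady."""
    if not values:
        return 0.0
    m = mean(values)
    return pstdev(values) / m if m else float("inf")


# Synthetic illustration (not measured data): a steady link and a bursty one.
steady = list(range(0, 4000, 10))
bursty = list(range(0, 1000, 5)) + list(range(1000, 2000, 50))
print(coefficient_of_variation(throughput_per_bin(steady)))
print(coefficient_of_variation(throughput_per_bin(bursty)))
```

Running `coefficient_of_variation(throughput_per_bin(load_trace(path)))` on each trace file would give a single variability score per trace to set against the delay results above.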

2.5 CHALLENGES

Acquiring and converting the original traces

To make our reproduction as accurate as possible and to have a baseline, we knew we had to acquire the original traces the authors used in their paper. The traces weren’t publicly available, so we emailed all of the paper’s authors and asked for them. The traces we received were not in a format compatible with Mahimahi, so we wrote a script to convert them to an inter-arrival time format.
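
The conversion can be sketched roughly as follows. This is not our actual script: the input layout (an arrival time in seconds and a size in bytes per packet) is a hypothetical stand-in for the format we received. The output follows Mahimahi's trace convention of one millisecond timestamp per line, each marking an opportunity to deliver one MTU-sized packet.

```python
def to_mahimahi(arrivals, mtu_bytes=1500):
    """Convert (arrival_time_seconds, size_bytes) pairs, sorted by time,
    into a list of millisecond timestamps, one per MTU-sized delivery.

    The input format is an assumption made for this sketch.
    """
    timestamps = []
    start = None
    backlog = 0  # bytes received but not yet turned into a delivery slot
    for t, size in arrivals:
        if start is None:
            start = t  # make timestamps relative to the first packet
        backlog += size
        ms = int(round((t - start) * 1000))
        while backlog >= mtu_bytes:
            timestamps.append(ms)
            backlog -= mtu_bytes
    return timestamps


def write_trace(timestamps, path):
    """Write one timestamp per line, as Mahimahi trace files expect."""
    with open(path, "w") as f:
        for ms in timestamps:
            f.write("%d\n" % ms)


# Synthetic example: a 3000-byte arrival becomes two delivery slots,
# and two 750-byte arrivals together become one.
print(to_mahimahi([(0.0, 1500), (0.002, 3000), (0.010, 750), (0.020, 750)]))
```

Accumulating bytes into MTU-sized slots keeps the average capacity of the converted trace close to that of the input even when the recorded packets are not all the same size.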

Dependency hunting

When we first downloaded Mahimahi and Verus onto our virtual machine, nothing worked. Many dependencies were not clearly specified, so it was quite a pain to figure out what was needed to get everything running. The list of dependencies we ultimately needed to add is quite long: build-essential, autoconf, libasio-dev, libalglib-dev, libboost-system-dev, libprotobuf-dev, protobuf-compiler, libtinfo-dev, libtool, apache2-dev, libxcb-present-dev, libcogl-pango-dev, libtbb-dev, apache2, and gnuplot-x11. We have also listed them in the README.md file of our GitHub repository so that anyone who wants to reproduce the research in the future hopefully won’t need to go on a dependency hunt.

Getting TCP Cubic to work

The original paper compares Verus with R = 2, 4, and 6 to TCP Cubic. Since TCP Cubic wasn’t part of the GitHub repository associated with the paper [3], we had to find a way to include it ourselves. We tried several ways to integrate TCP Cubic, such as Linux modprobe, and eventually decided to use the Python socket library [4] and write some custom code to integrate it with Mahimahi. The server and client Python code for TCP Cubic can be found in the cubic subfolder of our GitHub repository.
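
For readers unfamiliar with the approach, here is a minimal sketch of selecting a congestion control algorithm per socket from Python on Linux. It is not the code in our cubic subfolder; the function names are illustrative. It relies on the Linux `TCP_CONGESTION` socket option, whose number is 13; newer Python versions expose it as `socket.TCP_CONGESTION`, and the fallback below covers older ones.

```python
import socket
import time

# Linux option number for TCP_CONGESTION; use the named constant if present.
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", 13)


def congestion_sockopt(algorithm="cubic"):
    """Arguments for setsockopt() that select a congestion control algorithm."""
    return (socket.IPPROTO_TCP, TCP_CONGESTION, algorithm.encode("ascii"))


def open_sender(host, port, algorithm="cubic"):
    """Connect a TCP socket that uses the requested algorithm (Linux only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Fails with an OS-level error if the kernel does not offer the algorithm.
    sock.setsockopt(*congestion_sockopt(algorithm))
    sock.connect((host, port))
    return sock


def send_for(sock, seconds, chunk=b"x" * 1400):
    """Send filler data as fast as the socket allows; return bytes sent."""
    deadline = time.time() + seconds
    sent = 0
    while time.time() < deadline:
        sent += sock.send(chunk)
    return sent
```

A client started inside a Mahimahi shell could call `open_sender(host, port)` followed by `send_for(sock, duration)`, with a plain TCP server on the other side reading and discarding the data; the host, port, and duration are whatever the experiment script supplies.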

2.6 CRITIQUE

We started off fascinated by mobile systems and curious whether Verus’ claims would hold up across varied movement patterns. Our reproduction was successful, and the paper’s Figure 10 does appear to be accurate. The paper’s claims also hold up for some of the more recent traces we tested, so in that sense they hold across varied movement patterns. However, Verus had higher delay than TCP Cubic on the T-Mobile LTE traces. We hypothesize that this discrepancy arises because the T-Mobile traces have less variation. If so, Verus’ success may depend on the general unpredictability of cellular networks, and when flows are more predictable (such as the T-Mobile ones), other protocols may be superior. If mobile networks become more predictable, perhaps 10 years from now, a system such as Verus whose success is contingent on some level of unpredictability may no longer be appropriate.

3. REPRODUCTION INSTRUCTIONS

3.1 PLATFORM

We decided to use Mahimahi as the platform for three main reasons: 1. It was a platform we were familiar with, and it was co-developed by Keith Winstein, so we knew we could go to him if we ran into problems. 2. The results of the paper should be network emulator agnostic, so it shouldn’t matter that we used Mahimahi. 3. Because we did successfully reproduce the research on Mahimahi, a different platform from the one the authors used, we are even more confident in the authors’ original results.

Our entire setup is easily reproducible. We provide a virtual machine that can be downloaded and where our experiments can be re-run.

3.2 REPRODUCTION README

The data used in all of the tables shown above is generated and stored in a data.txt file, a 3-column table containing the sub-experiment name, throughput (Mbps), and delay (ms), respectively. Results fluctuate somewhat between runs, so if you get wildly differing results we recommend re-running and averaging. The following instructions will run our experiments and generate the reproduction data:

  1. Install VirtualBox.
  2. Download our virtual machine: https://drive.google.com/open?id=0B0AeYYxhAECkYWZnY1kxRnpmMlU
  3. Open VirtualBox and go to File → Import Appliance and import our virtual machine.
  4. Start our virtual machine and log in with the password “verus”.
    • If you cannot start the virtual machine due to a USB 2.0 controller not found error, go to Settings → Ports → USB, and deselect “Enable USB Controller”.
  5. Run experiments (takes approximately 40 minutes):
    1. $ cd cs244-verus
    2. $ sudo sysctl -w net.ipv4.ip_forward=1
    3. $ ./clean.sh
    4. $ python run_experiment.py (You should see two plots appear, one for queueing delay and one for throughput)
    5. The generated data.txt file can be found in the cs244-verus directory
  6. After the experiment is over, if you would like to run it again, run ./clean.sh to reset everything.
  7. If you would like to try running any of the experiments with different parameters, edit values (such as VERUS_R) in verus-x/src/verus.hpp, then re-compile Verus as per verus/README.md.
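
If you do re-run and average, a small helper along these lines may save some manual work. It is a sketch under one assumption about data.txt that you should check against your own file: each row is whitespace-separated and ends with the throughput and delay values, with everything before them forming the sub-experiment name. The function names are our own.

```python
from collections import defaultdict


def parse_results(text):
    """Parse rows of 'name throughput delay' into (name, Mbps, ms) tuples.

    Assumes the last two whitespace-separated fields are numeric; lines
    that do not fit (headers, blanks) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            tput, delay = float(fields[-2]), float(fields[-1])
        except ValueError:
            continue  # header or malformed line
        rows.append((" ".join(fields[:-2]), tput, delay))
    return rows


def average_runs(run_texts):
    """Average throughput and delay per sub-experiment across several runs."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for text in run_texts:
        for name, tput, delay in parse_results(text):
            acc = sums[name]
            acc[0] += tput
            acc[1] += delay
            acc[2] += 1
    return {name: (t / n, d / n) for name, (t, d, n) in sums.items()}
```

For example, after copying each run's data.txt to a separate file, `average_runs(open(p).read() for p in paths)` returns a dictionary mapping each sub-experiment name to its mean throughput and mean delay.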

If you are curious about our source code, check out our GitHub repository.

4. REFERENCES

[1] Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016–2021 White Paper

[2] Adaptive Congestion Control for Unpredictable Cellular Networks, Zaki et al.

[3] https://github.com/yzaki/verus

[4] https://github.com/Gasparila/TCPTuner

2 responses to “CS244 ’17: Adaptive Congestion Control for Unpredictable Cellular Networks

  1. Very interesting analysis! We liked your choice of topic since it is very relevant today and will continue to become ever more so. Reproducing your results was quite straightforward, and the results agreed with those you present in the tables above. 5/5