CS244 ’15: Is Flow Completion Time a Better Measure for Congestion Control?


Angela Gong and Alice Yeh

Introduction and Motivation

Networking research has traditionally focused on more system-wide performance metrics such as throughput, link utilization, and fairness. However, none of these really matter to the user, who is more interested in getting their download in as little time as possible. In fact, many transactions on the internet are ones that prefer a shorter flow completion time (FCT), which is the time between the first packet’s departure and the last packet’s arrival for a given flow.

Original Goals of the Paper

As network bandwidth increases, flow completion time should correspondingly decrease. However, this is not the case with the congestion control algorithms in use today. Standard protocols such as a TCP make flows last much longer than necessary because they were designed for long-lived flows, not the mixed short- and mid-lived flows that are typical today. The goal of the original paper [Dukkipati] was to come up with a solution that would minimize FCT, thus providing a speedier user experience.

Original Results

Although minimizing FCT is an intractable problem, the paper comes close by approximating Processor Sharing (PS) and introducing a new congestion control algorithm called the Rate Control Protocol (RCP). PS is a well-known method that nearly minimizes FCT by having each router divide an outgoing link equally between flows. With RCP, the paper analytically and experimentally shows that on average, flow completion time is much smaller for RCP than for TCP and XCP (a newly suggested congestion control algorithm).

Our Experiment

Subset Goals and Motivation

We wanted to recreate Figure 6 from the paper, which compares the average flow completion time using three different protocols (TCP, XCP, RCP), slow-start, and PS. We chose this figure because the plots clearly show that RCP has flow completion times that follow the ideal PS behavior more closely compared to other protocols, which are supposed to approximate PS. That would imply that if routers implemented RCP worldwide, Internet users would see an improved experience.

Subset Results

The following 4 plots are of average flow completion time for different flow sizes with different networking protocols.

main_group_plots Figure 1: Our graphs on a semilog scale (top left) and log-log scale (top right) and original graphs (bottom left and right).

We were able to reproduce the graphs that the original author showed in her paper.

The two differences are:

  • Our RCP plot has a slightly longer AFCT (by roughly ~0.1s) and does not follow the PS line as well as the paper demonstrates.  In the original simulation script provided, we saw that the author had used a calculated mean packet size of 30 using bandwidth-delay instead of 25, which was the number written in the paper.  We tested this hypothesis but unfortunately did not see the desired change.  Other than this slight deviation, our NS-2.35 setup produces the same plot as a NS-2.30 setup, which the author used in her data generation.  We are unsure of this slight discrepancy.  It is possible there is a configuration change between NS-2.30 NS-2.35 since a group from CS244 ’14 also observed the same deviation.
  • We ran into difficulty with reproducing the XCP line, which is discussed later in the Challenges section.

Challenges

We faced several challenges during the course of our project:

  • NS-2.35 setup – It was very difficult setting up NS-2.35 on Ubuntu. We ran into several issues with compilation and found that Ubuntu lacked several packages that NS-2.35 required but included in the original script’s all-in-one archive. We were eventually able to figure it out after careful scrutinizing of the build logs.
  • OTcl programming language – NS-2.35 uses a scripting language called OTcl which we were unfamiliar with. It took us longer than expected to understand the scripts provided by the authors. However, we found useful documentation online from [OTcl] and [OTcl scripting].  (Fun fact: Tcl was created by Professor John Ousterhout!)
  • XCP – Like the CS244 Group in 2014 who worked on this project, we were not able to reproduce the XCP plot either.  However, after discussing with both TA’s, who both suggested that spending more time on investigation different Pareto shape parameter and varying RTT’s would be more worthwhile, we accordingly followed their suggestion.

Extensions

Pareto Distribution Exploration

The paper makes the assumption that flow sizes are Pareto distributed with shape parameter 1.2 and mean of 25. However, what if the shape parameter is modified?  Would RCP still have an advantage over TCP?

The Pareto distribution is applied to model self-similar arrival in packet traffic. It is also referred to as double exponential, power law distribution. Other important characteristics of the model are that the Pareto distribution has infinite variance, when the shape parameter ≥ 2 and achieves infinite mean, when shape parameter ≤ 1.  A shape parameter closer to 2 also means it’s more heavy-tailed with more short flows.  We were curious of whether RCP’s flow completion time would be sensitive to varying the value of the shape parameter.

pareto_shape_plots

Figure 2: Our graphs with the different Pareto Shape values: 1.1, 1.2, 1.4, and 2.2.

As Figure 2 demonstrates above, RCP exhibits more or less the same shape on a semi-log plot.  With shape = 2.2, where the distribution is dominated by mostly short flows, it makes sense to see less noise on the flow-size axis.  We conclude that RCP is not sensitive to shape parameter changes in a Pareto distribution and still performs better than TCP.

RTT

We thought it was interesting that an RTT of 100ms was chosen for this experiment. According to [Sessini] and [Shakkottai], the average RTT of TCP flows on the Internet is around 170 and 180ms, respectively. In this paper, the authors use an RTT of 100ms, with no justification given. We thought this was curious, and tried plotting the original graph with an RTT of 50ms and 200ms.

rtt-comparison-rcp

Figure 3: Our graphs with the different RTT values: 50ms and 200ms.

(Note: Our original plot for RTT=200ms turned out to be incorrect.  Thus, this section is re-written post-deadline.  We’ve included our original text/plots in the Addendum.)

For both RTT = 50ms and 200ms, the RCP line lies above the PS line.  It is interesting to see that a high RTT (200ms) generates a graph that exhibits very similar characteristics as a plot with RTT=100ms and shape parameter=2.2.  For reference, the equation for Processor Sharing (PS) is given by Figure 4, where L is the flow size, C is the link capacity, and ρ is the offered load. 

ps-equation

Figure 4: Processor Sharing equation.

These plots suggest that RCP is not sensitive to different RTT values.

Critique

From our investigation, the original thesis of the paper holds.  That is, with Flow Completion Time (FCT) as a metric for congestion control, RCP is experimentally shown to be a more practical algorithm compared to TCP.  We observed the same phenomenon in all the experiments we ran.  However, there are some built-in assumptions in the paper that are not fully elucidated.  For example, though Pareto-based traffic models are excellent candidates, there is “no one single model that can be used effectively for modeling traffic in all kinds of networks.” [Survey].   We found that RCP is not sensitive to shape parameter variation or RTT variation, but we would like to see RCP implemented within a data center network.  It is also interesting to see that a high shape parameter value of 2.2 produces a plot very similar to a high RTT values of 200ms.

Platform

We chose to use Ubuntu 14.04 as our platform because it is one of the more widely-supported Linux distributions. We used NS-2 for the network simulation because that’s what the original experiment used, although we used a more updated version (NS 2.35 vs. NS 2.30). The benefit of using a simulator is that time advances based on events; meaning that we can run experiments that would run way faster than they would in real life and under conditions that might not be possible with an emulation or actual setup.

Method

We started by patching NS-2 with the code to support RCP, which was supplied by the authors on their website [RCP]. We had to make some modifications in order for it to work with the newer version of NS-2.

The authors also provided scripts to run on NS-2 which will produce the data for TCP, RCP, and XCP. We modified these scripts so they ran with the parameters we wanted. Each run generates many flows whose size follows a Pareto distribution with shape parameter of 1.2 and mean of 25, and whose arrivals follow a Poisson distribution. The simulation setup is as follows, with an RTT of 100ms, capacity of 2.4 Gb/s and offered load (the amount of traffic in the queue) of 0.9.  In the Extension section, we explored how RCP approximates PS with varying the Pareto shape parameter and RTT.

diagram

Figure 5: Experiment setup.

We then created a Python script that aggregated this data, generated the data for the processor sharing and slow-start curves, and plotted the data.

Reproduction Instructions

There are three options for if you want to reproduce the graphs:

  1. Amazon EC2. Search for an Amazon EC2 AMI called “CS244-2015-RCP” in the US West Oregon cluster. This contains all the code and setup needed for the experiment.
  2. Locally. Download the virtual machine image here.
  3. Manually. Follow the “For the Brave” steps below.

For the Brave

  1. Locate an Ubuntu 14.04 installation. We used an Ubuntu 14.04 server on an Amazon EC2 c3.large instance.
  2. Make sure the following are installed on Ubuntu: git, gcc, make, g++, tcl, libx11-dev, libxt-dev, nam, perl, python, python-matplotlib, python-scipy.
  3. cd ~
  4. Download the all-in-one version of ns-2 by running wget http://sourceforge.net/projects/nsnam/files/allinone/ns-allinone-2.35/ns-allinone-2.35.tar.gz/download.
  5. Untar the package by running tar -xvf download.
  6. Clone this repository by running git clone https://github.com/anjoola/cs244-rcp.git.
  7. Patch the ns-2 code by copying the files in ~/cs244-rcp/patch/ and replacing the ones in~/ns-allinone-2.35/ns-2.35/. Do this by running cp -r ~/cs244-rcp/patch/* ~/ns-allinone-2.35/ns-2.35/.
  8. sudo ./install in ~/ns-allinone-2.35/. This will take a while.
  9. Add the following to ~/.bashrc, then restart bash:
     # LD_LIBRARY_PATH
     OTCL_LIB=~/ns-allinone-2.35/otcl-1.14
     NS2_LIB=~/ns-allinone-2.35/lib
     USR_LOCAL_LIB=/usr/local/lib
     export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$OTCL_LIB:$NS2_LIB:$USR_LOCAL_LIB
    
     # TCL_LIBRARY
     TCL_LIB=~/ns-allinone-2.35/tcl8.5.10/library
     USR_LIB=/usr/lib
     export TCL_LIBRARY=$TCL_LIB:$USR_LIB
    
     # PATH
     NS=~/ns-allinone-2.35/ns-2.35/
     NAM=~/ns-allinone-2.34/nam-1.15/
     PATH=$PATH:$NS:$NAM
    

Generating Results

This will take about 1-2 hours on an Ubuntu 14.04 VM with 4 GB of RAM on a host OS with a 1.8GHz Intel i7 processor, or 30-60 minutes on a 2.6Ghz Intel i5. On an Amazon EC2 c3.large instance, it will take about 20-40 minutes.

cd ~/cs244-rcp/scripts
chmod +x run.sh
./run.sh

All the data is produced in the ~/cs244-rcp/scripts/data/ folder, and the plots are in ~/cs244-rcp/scripts/{semilog, loglog}-plot-[rtt]-[shape].png. This only generates the main graphs (it would take too long to run the extensions).

If you wanted to produce ALL the graphs (this is not recommended), do the following, which will take several hours:

cd ~/cs244-rcp/scripts
chmod +x run-extensions.sh
./run-extensions.sh

Hint: Use CTRL+ALT+T to launch a terminal if you’re using the Ubuntu VM.

Addendum

RTT_plots

This section contains our original analysis for our extension with respect to different RTT values as well as the original critique.  The original plot for RTT=200ms turned out to be incorrect, and we fixed it in the main blog post.

For RTT = 200ms, the RCP line is below the PS line for a good proportion of the flow sizes. This is interesting since processor-sharing is supposed to be the ideal, but RCP on average does better!  For reference, the equation for Processor Sharing (PS) is given by Figure 4, where L is the flow size, C is the link capacity, and ρ is the offered load. We thought that maybe we were not adjusting our line for PS properly with the correct RTT value. However, this was not the case, and we looked further.

ps-equation

Figure 4: Processor Sharing equation.

In [PS], Dukkipati et. al said that RCP is implemented that “allows initial rate to be calculated during the initial handshake by piggy-backing on the SYN and SYN-ACK messages.”  This is especially important for short-lived flows, as it could last less than one RTT.  This insight confirms with our plots.

*Original Critique Below

From our investigation, the original thesis of the paper holds.  That is, with Flow Completion Time (FCT) as a metric for congestion control, RCP is experimentally shown to be a more practical algorithm compared to TCP.  We observed the same phenomenon in all the experiments we ran.  However, there are some built-in assumptions in the paper that are not fully elucidated.  For example, though Pareto-based traffic models are excellent candidates, there is “no one single model that can be used effectively for modeling traffic in all kinds of networks.” [Survey].   We found that RCP is not sensitive to shape parameter variation, but we would like to see RCP implemented within a data center network.  In addition, theoretically it is not possible to outperform PS, and yet in running an experiment with RTT = 200ms (rather than 100ms as in the paper), we saw that for short-lived flows, RCP outperforms PS.  Note that it is also possible that there there hardware measurement errors in the NS itself, especially given the high speed (2.4Gbits/s) of the bottleneck link.

References and Links

Advertisements

One response to “CS244 ’15: Is Flow Completion Time a Better Measure for Congestion Control?

  1. Following the reproduction instructions for Amazon EC2, we were able to easily run the experiment and generate the plots. In general, the instructions are clear — the only thing missing is that the user must log in as “ubuntu”. The plots themselves are consistent with the results from the paper, and everything works as expected within a reasonable time-frame. We did not test the image locally, nor did we test the manual instructions.

    We did have some issues running the extensions, which the post warned may not work. In order to generate the plots for the extensions, we had to add “rtt=0.1” to line 43 of run_extensions. This allowed us to generate everything but the 200ms RTT plot, which was broken.

    The analysis is both thorough and easy to understand, and raises some valid concerns with the original paper. In particular, the extension exploring whether RCP was sensitive to shape parameter changes in the Pareto distribution adds greatly to the validity of the original authors’ claims about the RCP protocol.

    (5/5, 3/5 for the extensions)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s