CS244 ’16: Revisiting TCP Pacing on the Modern Linux Kernel


Chase Basich (cdbasich) and Kelvin Do (kelvindo)

The Original Paper

Introduction and Motivation

Many modern TCP congestion control mechanisms lead to bursty traffic flows. A sender transmits a window's worth of packets in a single burst, then waits roughly an RTT before sending the next one. Intuitively, this high variation in sending rate should cause congestion, since routers may have to absorb many bursts simultaneously, filling their buffers and dropping packets. This raises the question of whether a non-bursty traffic pattern changes the performance of network traffic for better or for worse.

Original Goals of the Paper

In Understanding the Performance of TCP Pacing (Aggarwal et al., 1999), the authors propose a non-bursty version of TCP that spaces the sending of packets evenly over an RTT instead of sending them in a single burst at the beginning of the RTT. Intuitively, this should reduce delay and increase throughput, since routers receive packets at a steadier rate. To test this, the authors set up a network topology of 50 flows transferring packets over a single bottleneck link. They ran two experiments, one using traditional TCP Reno and another using TCP Pacing, their implementation of TCP with the pacing behavior described above. The aggregate throughput of the flows was then measured for each TCP implementation and the results were compared.

Original Results

The results showed that for an individual flow, Pacing increased throughput over regular TCP Reno. When the number of flows was increased to 50, Pacing actually decreased throughput during parts of the transmission. This was caused by synchronization effects introduced by pacing. For example, during startup all the flows went through slow start in lockstep until the link was oversaturated, at which point they all experienced synchronized drops, leading to an immediate undersaturation of the link. These synchronized drops cause the throughput to oscillate heavily over the life of the flows. Burstier traffic causes individual flows to experience unsynchronized drops, so traffic from other flows keeps the link busy and throughput high. In the end, this suggests that Pacing should not be used as a method of congestion control for real-world flows.

(Figure 3) TCP Pacing vs TCP Reno Throughput for a Single Flow


(Figure 6) TCP Pacing vs TCP Reno Throughput for 50 Flows


 

Our Results

Subset Goals and Motivation

Our goal is to reproduce Figures 3 and 6 from Understanding the Performance of TCP Pacing, which show the single-flow throughput and the cumulative throughput of 50 flows, respectively. We chose these figures because they clearly show how TCP Pacing can be advantageous in one situation, but suffer in the case of many flows. We wanted to confirm that, although the single-flow case suggests TCP Pacing improves performance, the 50-flow case, which is closer to real-world network conditions, sees a decrease in performance.

Subset Results

Below are the figures we generated by measuring cumulative throughput for TCP Pacing and TCP Reno, for a single flow and for five flows.

Cumulative Throughput for Pacing for a Single Flow


Cumulative Throughput for Reno for a Single Flow


Cumulative Throughput for Pacing for Five Flows


Cumulative Throughput for Reno for Five Flows


We opted to use a tool we were more familiar with to measure throughput: iperf. iperf takes a congestion control algorithm, uses it to blast packets across a link, and measures throughput at the receiver to see how quickly packets are moving across the link. We sampled throughput at 1-second intervals for the duration of each flow and plotted the values iperf reported for the link at each point in time.
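For reference, below is a minimal sketch of how such a measurement can be driven from mininet. The host handles, the 60-second duration, and the 'pacing' congestion control name are placeholders for illustration, and the -Z/--linux-congestion flag assumes an iperf 2 build that supports it.

```python
import re

def measure_throughput(sender, receiver, cc='reno', duration=60):
    """Run one iperf flow between two mininet hosts and return the
    per-second throughput samples (Mbits/sec) that iperf prints with -i 1."""
    receiver.cmd('iperf -s &')  # start the receiver in the background
    out = sender.cmd('iperf -c %s -t %d -i 1 -Z %s'
                     % (receiver.IP(), duration, cc))
    samples = []
    # iperf 2 interval lines look like:
    # "[  3]  0.0- 1.0 sec  1.2 MBytes  10.1 Mbits/sec"
    for line in out.splitlines():
        m = re.search(r'([\d.]+)\s+Mbits/sec', line)
        if m:
            samples.append(float(m.group(1)))
    # the last matching line is iperf's overall average, not an interval
    return samples[:-1]
```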

Challenges

Conceptually, the idea of TCP Pacing isn't that difficult: pace the sending of packets over an RTT instead of sending them in a single burst. However, some of the graphs took effort to interpret. One of the first questions we had was: what is cumulative throughput? How does a single flow have cumulative throughput? Is it cumulative over a period of time, or cumulative over all the flows? We had to read the graphs carefully to get context for what the data actually represented.
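One plausible reading for the multi-flow plots is the aggregate rate across all flows in each interval. A minimal sketch of that aggregation, assuming per-flow sample lists like those returned by the helper above:

```python
def aggregate_throughput(per_flow_samples):
    """Sum per-second throughput samples (Mbits/sec) across flows, so each
    point is the total rate crossing the bottleneck during that second.
    per_flow_samples: one list of per-second samples per flow."""
    length = max(len(samples) for samples in per_flow_samples)
    return [sum(samples[t] for samples in per_flow_samples if t < len(samples))
            for t in range(length)]
```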

Implementation was a huge challenge in our reproduction of this paper. First, we took on the daunting task of editing the Linux kernel to introduce TCP Pacing. We started from an existing patch written for an older version of Linux (TODO: What version was the patch for?), modified it to apply to a modern kernel, patched the kernel source, and attempted to recompile it. The first challenge was compile time: on the very small EC2 instance we started with, the kernel took several hours to compile, which was especially painful whenever we hit compile errors. After upgrading our instance we succeeded in building a kernel whose patched TCP could communicate with other servers. However, in the end we couldn't quite replicate the cwnd chart we expected for TCP Pacing, and debugging the Linux kernel proved to be a task we could not finish within a week.

Another challenge we faced was that some EC2 instances simply didn't behave the same as others. On one instance, we spent several hours tracking down an issue that prevented TCP probe from running; when we spun up another instance, TCP probe worked without any issue. Problems like this made it very difficult to determine whether a setback was our doing or not. At other times EC2 slowed to an unbearable crawl, making it impossible to develop anything or update any packages. These resource problems gave us a taste of the kinds of issues one can face in research and industry when depending on a third-party service for computing power.

Conclusions and Critique

Clearly, we see a large discrepancy in the charts for cumulative throughput. Where the paper shows a smooth increase followed by a mild oscillation around 3.75 Mbps, we saw very wild oscillations centered around 4.75 Mbps. One reason for the wild oscillations is that we used iperf to measure the cumulative throughput of the system. To our understanding, iperf attempts to blast as many packets as possible across a link using a congestion control algorithm of our choice and measures how quickly the packets are received on the other side.

Although the plots aren't similar at all, we find that the oscillations in our plot coincide more closely with the stable sawtooth of the cwnd. This makes sense: cwnd bounds how much data can be in flight at once, so (divided by the RTT) it is a direct indicator of the throughput of the system. Perhaps this isn't exactly the same cumulative throughput that the original paper measured, but we felt it worthwhile to attempt to reproduce the TCP Pacing results using other measures of cumulative throughput. Overall the experience was very informative and taught us a lot about using different tools to measure different metrics of a system.

Between the TCP Pacing and TCP Reno cumulative throughput charts that we produced, we still see some minor differences. The first is an overall lower average throughput for Pacing compared to Reno. This is difficult to explain, as we expected single-flow throughput under Pacing to exceed that of Reno. Perhaps, because the packets were paced, iperf registered a lower rate of packets being sent and as a result reported lower throughput for Pacing. Another difference is that the peaks and troughs for Pacing are somewhat flatter than those for Reno.

Extensions

The primary extension that we present is an updated patch for TCP Pacing on the modern Linux kernel. Past patches introduced TCP Pacing into older versions of the kernel, but we think that updating it for a modern version is a valuable contribution, since it shows whether TCP Pacing still behaves the way it has been seen to in the past. Although our cumulative throughput charts don't show the same results using iperf, another attempt to reproduce this paper could use our kernel patch and apply other methods of measuring cumulative throughput on a modern Linux kernel.

Reproducing Results

There are two ways to reproduce our results, depending on how much work you want to do. Both methods are described in the README in our git repo.

Platform

The original authors implemented TCP Pacing in ns (network simulator), but due to our familiarity with mininet from programming assignment 1 we chose to use mininet. The structure of the network is well defined in the paper: 50 hosts (the “senders”, S1-S50) connected to switch 1, 50 hosts (the “receivers”, R1-R50) connected to switch 2, and a bottleneck link between switch 1 and switch 2. While we succeeded in setting up such a topology in mininet, we encountered unforeseen issues. In particular, we found that mininet supports only limited bandwidth; once we exceeded 5 simultaneous connections, iperf began to return results that were not well formed. This forced us to vary the parameters of the original experiment until we found a setup that, while not a faithful reproduction of the original, at least gave us well-formed data.
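For reference, here is a minimal mininet sketch of the topology described above. The link bandwidths and delay are placeholders rather than the values we actually used, and in practice we had to scale the flow count down to around five.

```python
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.link import TCLink

N_FLOWS = 50  # scaled down to ~5 in practice due to mininet bandwidth limits

class PacingTopo(Topo):
    def build(self):
        s1 = self.addSwitch('s1')
        s2 = self.addSwitch('s2')
        # bottleneck link between the two switches (placeholder parameters)
        self.addLink(s1, s2, bw=10, delay='10ms')
        for i in range(1, N_FLOWS + 1):
            sender = self.addHost('S%d' % i)
            receiver = self.addHost('R%d' % i)
            self.addLink(sender, s1, bw=100)
            self.addLink(receiver, s2, bw=100)

if __name__ == '__main__':
    # requires root; TCLink enables the bw/delay parameters above
    net = Mininet(topo=PacingTopo(), link=TCLink)
    net.start()
    net.pingAll()
    net.stop()
```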

TCP Patch

In order to run TCP Pacing, we had to patch the Linux kernel, since congestion control is implemented at the kernel level. We based our patch on a similar patch for an old version of Linux. However, that patch was both out of date and incorrect. Even after updating it for the newest version of the kernel, there were numerous issues with the original implementation, and even after fixing all the bugs we could find, the output of our program still failed to match the expected output. This leads us to one of two conclusions. The first is that there were simply bugs in the implementation of the patch, introduced either by the original author or by us. The second is that the implementation of TCP in the Linux kernel has changed enough that there is no longer any value in basing a patch on the original TCP Pacing patch. The underlying TCP logic has undergone significant revisions, and it is possible that entirely new sections of code have to be patched for an updated version of TCP Pacing.
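Once a patched kernel exposes a new congestion control module, it can be selected system-wide (for example via the net.ipv4.tcp_congestion_control sysctl) or per socket. Below is a small per-socket sketch; the module name 'pacing' is an assumption about what our patch would register, and the TCP_CONGESTION socket option requires Linux and Python 3.6+.

```python
import socket

def connect_with_cc(host, port, cc=b'pacing'):
    """Open a TCP connection that uses the given congestion control module,
    assuming the module is loaded and permitted by the kernel."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, cc)
    s.connect((host, port))
    return s
```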

2 responses to “CS244 ’16: Revisiting TCP Pacing on the Modern Linux Kernel”

  1. Score 5 in terms of reproducing the blog post. However, as the authors pointed out, there is a large discrepancy from the original paper. The authors believe the reason is that they used iperf as the method of measuring the cumulative throughput of the system. We think they should measure the throughput using mahimahi instead, so that it might be possible to reproduce the result. Reproducing this paper shouldn’t be hard and does not require too much effort.

  2. 5/5 Following the instructions on bitbucket, we were able to reproduce the graphs presented in the results section using the community AMI. However, as the authors mentioned, these graphs are pretty different from results in the original paper, probably due to their choice of iperf as the measuring tool. In retrospect, the authors could have used mininet/mahimahi to improve the accuracy of the reproduced experiments.
