Original Goals, Motivation, and Results
The internet was built for redundancy. Recently, its endpoints have begun to build in redundancy as well. Servers are multi-homed so as not to depend entirely on a single last-mile connection. Datacenters use redundant paths for their throughput and latency benefits. Mobile devices communicate over multiple radio technologies to maximize mobility and availability (mobile broadband) while obtaining better network characteristics and lower total cost when possible (WiFi). However, the transport protocol that carries most of the internet's traffic is still limited to a single path between endpoints. The MPTCP paper by Raiciu et al. aims to provide a transport-level mechanism that can take advantage of the multi-path nature of the current internet while still providing the effectiveness of TCP.
The authors’ results were very positive overall. They designed a multipath TCP algorithm that outperforms single-path TCP on most important metrics. From an endpoint’s perspective, MPTCP should perform as well as or better than single-path TCP, and it is designed to be backwards-compatible with current TCP implementations, so the incentives for adoption suggest that the internet is ready for MPTCP. The authors attempted to accommodate most types of currently deployed middleboxes, but they can by no means guarantee that the entire internet will handle MPTCP gracefully. Nonetheless, their final design appears workable and beneficial to almost all end users.
We attempt to recreate the paper’s latency results (as seen in Figure 7). A well-designed MPTCP implementation will have obvious benefits to overall throughput. However, if that benefit comes with an unacceptable latency trade-off, MPTCP will not be useful for many interactive applications that require high perceived responsiveness. The authors examined a simulated mobile scenario, with MPTCP operating subflows over both 3G and WiFi. Intuitively, the latency would tend to be worse than the WiFi-only case. This is an acceptable cost for the gains in throughput, as long as the amount of “worse” is reasonably low. The authors found that “regular” MPTCP does match intuition: the probability distribution of latency for MPTCP has a heavier tail than for TCP over WiFi. However, with two optimizations to their MPTCP design (opportunistic retransmission and penalizing slow subflows), the authors were able to achieve better latency than single-path TCP over WiFi. The improvement comes down to buffering: single-path TCP over WiFi queues data behind its full send buffer, while MPTCP’s send buffer is effectively smaller when part of it is committed to the high-RTT 3G subflow, so each packet waits behind less queued data. Regardless, latency comparable to the pure-WiFi case is an important accomplishment that we wanted to explore further.
After setting up a similar topology, we were able to reproduce results very similar to the latency results from the paper. TCP over WiFi alone has a narrow, low-latency distribution, while TCP over 3G alone has a higher-latency, wider, more uniform distribution. MPTCP without the optimization mechanisms gains throughput by utilizing both links, but suffers from the additional latency of 3G. MPTCP with the optimization mechanisms obtains essentially the same distribution as TCP over WiFi. Our results match the paper’s, with optimized MPTCP appearing to perform slightly better than WiFi alone; as explained in the paper, this is because MPTCP’s send buffer is effectively smaller due to the large RTT of 3G.
Though the authors’ earlier experiments specified link attributes, there was no exact topology specification for the data we were attempting to reproduce, aside from the TCP send buffer size. Fortunately, other results in the paper were generated from simulations/emulations, so some baseline specifications were available. Using these, we were able to build a realistic topology that accurately recreated the intended results.
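To make the setup concrete, the shape of the emulation can be sketched in Mininet as two independent paths between a client and a server. The link parameters below are illustrative assumptions on our part, not the paper’s exact specification:

```python
# Sketch of the emulated topology (requires Mininet). The bandwidth, delay,
# and loss values are illustrative assumptions, not the paper's exact numbers.
from mininet.net import Mininet
from mininet.link import TCLink

net = Mininet(link=TCLink)
client = net.addHost('client')
server = net.addHost('server')

# Two independent paths between the hosts: one WiFi-like, one 3G-like,
# each carrying one MPTCP subflow.
net.addLink(client, server, bw=8, delay='20ms', loss=1)     # "WiFi" path
net.addLink(client, server, bw=2, delay='150ms', loss=0.5)  # "3G" path

net.start()
# ... assign per-interface addresses/routes, run the MPTCP transfer ...
net.stop()
```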
Figure 7, from the paper
Figure 7, from our simulation
Because MPTCP is not yet in the mainline Linux kernel, we had to both build a snapshot of Linux with MPTCP and patch that snapshot to add sysctls that allow disabling mechanisms 1 and 2 from the paper. Additionally, Amazon EC2 requires several Xen flags to be on for the kernel to boot, which we learned the hard way after rendering our instance unbootable. Many thanks to the group that helped us get our kernel booting!
While trying to replicate the results from the paper, we misread the 200 KB TCP send buffer as a link buffer size and used Mininet’s queue-length parameter to set it. After correcting this, our results roughly matched those of the original paper.
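The distinction matters: the 200 KB figure is a per-socket TCP send buffer, which an application can request with `setsockopt(SO_SNDBUF)`, whereas Mininet’s `max_queue_size` controls the egress queue of an emulated link. A minimal sketch of the socket-level setting (not our actual experiment code):

```python
import socket

SEND_BUF = 200 * 1024  # 200 KB, as in the paper's experiment

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Request a 200 KB send buffer for this socket. This bounds how much
# unacknowledged data TCP will queue -- it is not a link queue length.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, SEND_BUF)

# On Linux, the kernel doubles the requested value (reserving half for
# bookkeeping overhead) and caps the request at net.core.wmem_max.
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(effective)
sock.close()
```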
The metric we chose to explore was “application-level latency.” As prescribed by the paper, we measured this as the time elapsed between when a packet was accepted by the TCP stack and when it was read on the receiver end. At first glance, this appears misguided: an application’s latency depends not only on the time a packet spends in transit, but also on the time spent waiting for the TCP send buffer to have space to accept more data. On reflection, though, this methodology is a good proxy for application-level latency, and is probably better than attempting to measure it directly. The rate at which an application can generate data to send is highly dependent on core computational performance. By measuring the “send time” as the moment a packet is fully accepted by TCP, we effectively limit our measurement to the performance of TCP, eliminating noise from variation in compute power. We are still affected by compute performance on the receiver side, but the impact is much smaller and has no upstream effects (so long as the receiver keeps up with the rate of delivery).
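The measurement can be sketched as follows. This is a simplified, single-process loopback illustration of the idea, not our experiment code: the sender timestamps each fixed-size message as it hands it to the TCP stack, the receiver timestamps it once the full message has been read, and the difference approximates per-message application-level latency.

```python
import socket
import threading
import time

MSG_SIZE = 1024   # fixed-size messages so the receiver knows the framing
NUM_MSGS = 50

def receiver(conn, read_times):
    # Timestamp each message once it has been fully read from the stack.
    for _ in range(NUM_MSGS):
        remaining = MSG_SIZE
        while remaining:
            chunk = conn.recv(remaining)
            if not chunk:
                return
            remaining -= len(chunk)
        read_times.append(time.monotonic())

# Loopback connection as a stand-in for the emulated WiFi/3G paths.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
sender = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()

read_times = []
t = threading.Thread(target=receiver, args=(conn, read_times))
t.start()

send_times = []
payload = b"x" * MSG_SIZE
for _ in range(NUM_MSGS):
    # Timestamp just before handing the message to the stack; for small
    # messages with free buffer space this coincides with acceptance.
    send_times.append(time.monotonic())
    sender.sendall(payload)

t.join()
latencies = [r - s for s, r in zip(send_times, read_times)]
print("mean latency: %.6f s" % (sum(latencies) / len(latencies)))

sender.close()
conn.close()
listener.close()
```

Because TCP delivers bytes in order and the messages are fixed-size, the i-th read timestamp pairs with the i-th send timestamp without any sequence numbers in the payload.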
3G is a fairly jitter-prone link technology (packet segmentation, link-layer retransmission). We examined the effect of increased jitter on the latency results, and found that even as jitter increases significantly, MPTCP with opportunistic retransmission and slow subflow penalties successfully avoids increases in application-level latency. MPTCP without these mechanisms performs closer to 3G latencies as 3G jitter increases.
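Exploring this only required varying one knob: Mininet’s TCLink exposes netem’s jitter parameter directly. A one-link sketch with assumed values (not the exact parameters we swept):

```python
# Illustrative sketch (requires Mininet): emulate a jittery 3G-like link.
# Under the hood Mininet passes these to netem as "delay 150ms 50ms".
from mininet.net import Mininet
from mininet.link import TCLink

net = Mininet(link=TCLink)
h1, h2 = net.addHost('h1'), net.addHost('h2')
net.addLink(h1, h2, bw=2, delay='150ms', jitter='50ms', loss=0.5)
```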
The original experiment may have been run in a simulator or on a real physical topology; we chose to emulate it using Mininet. A real topology would have been more difficult to procure, and Mininet allowed us to easily experiment with various parameters of the topology. The paper’s results are fairly reproducible, so long as the WiFi and 3G links are specified reasonably, particularly their loss rates. A high loss rate on either link affects latency significantly, but this only amplifies the benefits of MPTCP.