Team: Jason Teplitz, Jintian Liang
Paper: Andrei Croitoru, Dragoş Niculescu, and Costin Raiciu
Towards Wifi Mobility without Fast Handover
USENIX NSDI ’15
This paper looks to address the problem of WiFi mobility on mobile devices. Specifically, the authors focus on the problem of handover when a device leaves an access point’s (AP) coverage area and enters another. Most research in this area has focused on fast handover, which is the process of quickly finding and associating with a new AP whenever a device leaves an AP’s coverage area. However, this approach has some fundamental flaws. Most importantly a client must break their connection with the first AP before they can begin connecting to another. This disrupts all their active flows, and is not tolerable for something like a VoIP call. This approach also provides no way to adjust to changes in signal strength with the associated AP. For example, if the device picks an AP with poor signal strength after a handover it has no way to improve its signal strength without repeating the expensive handover operation.
To fix this problem the authors suggest having devices connect to multiple APs at the same time and multiplex their connections over MPTCP. This approach solves the fundamental problem of having to disconnect while performing a handover, and it allows for dynamic allocation of flows to APs based on signal strength. However, it comes with its own set of challenges. Most notably, it has to deal with interference caused by sending packets to multiple APs at the same time. The authors spend much of the paper discussing this problem, and we’ve chosen to reproduce their results in one experiment related to this problem. In this experiment the researchers try connecting to two APs that are not within carrier sense range of each other and multiplexing their connection between them. Initially, the device is closer to one AP but throughout the experiment the device moves away from that AP and towards the other AP. The researchers then graphed the throughput over time using their MPTCP scheme which connects to both APs simultaneously, a UDP scheme that also connects to both APs, and the results obtained from connecting to each AP independently. Their results are shown in figure 2 below.
Results of Paper
The results from the UDP1 and UDP2 flows are not surprising. In those tests the authors connected to a single AP using UDP. These results indicate that as the device moves away from one AP its throughput with that AP deteriorates. The interesting result in this experiment is the improvement in using MPTCP over UDP in the area between the two APs. They find that in this situation the MPTCP flow is able to get nearly double the throughput of the UDP flow. Figure 2c gives a bit of an intuition as to why this is happening. As the device moves between the two APs one AP’s signal strength is superior to the others. This means that the signal from the further AP deteriorates more and is thus damaged more by the relatively stronger signal from the nearby AP. As a result the AP with inferior signal strength experiences a significantly higher degree of packet loss as shown in Figure 2c. Thus the congestion window for that flow is significantly smaller, which leads to less noise interfering with the other flow. We found these results to be a bit surprising and encouraging and are working to reproduce them.
We chose to reproduce this specific set of results partly because it best captures the goals of the researchers, and it expresses these results in a clear fashion. Most importantly, however, these results are surprising. At first glance, one would not expect MPTCP to outperform the UDP scheme that connects to both APs by so much. However, the results show a large difference in performance, due to the phenomenon described in the above paragraph. Because the results are unintuitive, we wanted to reproduce the experiment to see if our results will match the results produced by the researchers’ original experiment.
We were specifically drawn to this paper because of the novelty of the idea as well as the real world impact of this problem. Mobile devices are increasingly becoming people’s primary computing devices, and WiFi networks form a strong coverage area over dense urban areas. WiFi networks are a compelling alternative to cellular networks and are frequently able to offer higher throughput in a more open network that typically has a lower monetary cost per byte. However, the handover problem continues to hamper the ability to use WiFi exclusively on mobile devices. Even on Stanford’s campus it is not possible to have a VoIP call over WiFi while moving around campus. This paper offers an interesting and compelling solution to this problem.
Figure 1: This is analogous to Figure 2b in the original paper. We positioned one WiFi node on the X-axis at position 0, and another node at position 150. We then ran 4 experiments at 5 meter intervals along the x-axis. We tested connecting to just one router (these are the UDP 1 and the UDP 2 flow lines), UDP1+2 is the cumulative throughput of connecting to both routers simultaneously over UDP, and TCP1+2 is the cumulative throughput of connecting to both routers simultaneously over TCP.
The results of our simulation were mostly consistent with those of the original paper. Our Figure 1 very closely resembles their Figure 2b. For the UDP1 flow, throughput drops as we move further away from its corresponding router. For the UDP2 flow, throughput increases significantly as we move closer to the second router. For the UDP1+2 flow, throughput degrades rapidly near the middle due to packet collision. TCP1+2 maintains stable throughput throughout the entire simulation due to the starvation phenomenon described in the original paper.
Figure 2: This is analogous to Figure 2c in the original paper. With two senders at position 0 and 150, we placed our receiver in the middle and looked at the percentage of packets dropped as a function of the maximum number of retransmissions allowed in the MAC layer. NOTE: This graph tends to vary for x=2 retransmissions. Across multiple trials, the y-value ranges from 0.5 to 0.2
In Figure 2, we see that all of the packets are dropped if no retransmissions are allowed. This makes sense as the lack of carrier sense causes a high percentage of packets to collide and subsequently be dropped. As you increase the number of retransmissions allowed, one flow starves out the other flow and experiences a very low percentage of dropped packets. This is similar to the results obtained by the original researchers with one exception. In our simulation, we see that the percentage of packets being dropped actually increases for both subflows near the tail end of the graph. We’re not sure why this happens, but our hypothesis is that by allowing a high number of retransmissions, one increases the chance that the starved flow successfully sends its packets, making it more competitive with the superior subflow. This, in turn, increases drop percentages for the superior subflow. Further research would more definitively explain this phenomenon.
We faced a number of implementation challenges while reproducing these results. Initially we were not sure what the proper carrier sense threshold should be to ensure that the routers were not within range of each other. This was compounded by a bug in the latest version of ns2 where our experiments would immediately terminate for certain carrier sense thresholds. We tried to fix this bug in the ns source code, but eventually reverted back to an older version of ns that was used by the paper’s authors. After reverting to the older version, setting up the environment became easier but was still more challenging that we expected due to the generally difficult ns2 installation process and the specific edits that we had to make to the ns2 codebase.
The authors of the original paper were incredibly responsive and helpful. Specifically we were in touch with Professor Niculescu from the Politehnica University of Bucharest who provided invaluable aid, and shared specific details about their experiment as well as source code that made it possible for us to reproduce their results.
We found that the results of the paper generally hold true. Using MPTCP gives stable throughput even when one moves around from access point to access point. The starvation phenomenon due to TCP’s slow start explains why using multiple UDP flows doesn’t give the same stable throughput. With that said, we feel that more research needs to be done to verify the paper’s findings. For starters, we experienced high variability, particularly for our simulation looking at the percentage of packets being dropped. While our results mostly match the researchers’ in the middle region, the tail-end of Figure 2 differs from the analogous graph in the paper. We feel that more experiments need to be done to see why this is the case. Furthermore, we feel that more real-world experiments should be done to verify the claims made in the paper. While the authors did perform a real-life experiment with hidden terminals, this was only in one building and with very little interference. This is hardly indicative of a real-world use case.
Extending to Three Flows
In the paper the authors only explore two simultaneous flows and a node that travels along a single axis. However, the main idea of the paper is to have WiFi nodes connect to a large number of access points in a dense urban area. We decided to build on the original author’s work by adding a third router and moving into two-dimensional space. For this experiment we placed three WiFi access points at the vertices of an equilateral triangle such that they were outside of carrier sense range of each other. We then moved a node along the interior of the triangle in a zigzag fashion. I.E. we started at the bottom axis and travelled along it at 10 meter intervals. Once we reached the far side of the triangle we repeated the experiment at a new y value that was 10 meters above the hold y value. At each location we ran an experiment where we connected TCP flows to all three routers, and an experiment where we connected UDP flows to all three routers. We measured the cumulative UDP and cumulative TCP throughput at each location and graphed these values. The result is a 3 dimensional heat map that shows the relationship between location and throughput for each protocol. We’ve included these graphs below. Since they are 3D graphs we’ve included three different rotations for each graph.
Our findings are highly encouraging and generally agree with the findings for the two dimensional experiment. UDP throughput is very high when you’re within a rather small (~15 meter) radius of the access point. However, as soon as you move into the interior of the triangle the throughput drops rapidly to 0.
By contrast, MPTCP is fairly constant throughout the entire region that we’ve graphed. This is encouraging and means that a WiFi node should be able to maintain a constant connection to the internet without any handover interruptions and a decent throughput while moving within the coverage areas of these APs. However, there are still some dead zones for the MPTCP flows, which we did not see in the two access point scenario. Future work could investigate what is causing these. Here’s our produced graphs:
Figure 3: Cumulative throughput as a function of (x, y) location of three TCP flows. Each flow connected to a unique router at the edge of the triangle.
Figure 4: Cumulative throughput as a function of (x, y) location of three UDP flows. Directly analogous with Figure 3 for TCP. As you can see, there is a sharp degradation in the middle due to increased packet collision.
Figure 5: Bottom up view of Figure 3. Note the z-axis (throughput) has been inverted. There are a handful of dead zones where throughput drops almost to zero.Figure 6: Bottom-up view of Figure 4. Directly analogous with Figure 5 for TCP
Figure 7: Top down view of Figure 3. Red indicates low throughput, while blue indicates high throughput. Yellow indicates decent throughput.Figure 8: Top-down view of Figure 4. Directly analogous with Figure 7 for TCP. Note that the red portion indicates low throughput while the blue portion indicates high throughput.
Platform and Reproducing our Results
We ran our experiments on Amazon EC2 t2.small and t2.medium instances using ns2. The experiments don’t process any large volume of packets, so unsurprisingly we did not see any differences in the data produced by either type of instance. The experiments take roughly the same time on either type of instance since we run all of our experiments for a set duration.
We highly recommend using EC2 to reproduce our results. Instructions for doing so are as follows:
- Search for the “ns-2.34-w/NOAH-no-ARP” AMI on the community marketplace (N. California region)
- Launch a new t2.medium or t2.small instance with this AMI.
- Login as ‘ubuntu’
- Clone https://github.com/jteplitz602/WiFi-HandOver
- Execute “run.sh”
- The experiments should take about an hour to run regardless of your instance size.
- All graphs should be outputted in the same directory as “.png” files.
Note: For the 3D graphs we save three different rotations. However, the best way to view them is to view the matplotlib plot directly because you can rotate the graph yourself with the mouse. If you’d like to do, this you can edit our “graph_tree_flows.py” script by uncommenting the last line.
If you choose not to use our AMI, things get a lot more difficult. First you must install ns-2.34. The experiment does not work with newer versions of ns2. There appears to be a bug in newer versions that causes the experiment to quit prematurely when using specific carrier sense thresholds. To set up the environment you should roughly follow these steps:
- Follow the instructions to download and install ns2 from “http://www.isi.edu/nsnam/ns/ns-build.html”. Make sure to install version 2.34! Install the dependencies but do not install ns yet.
- Add the NOAH protocol to the ns source code following the instructions here. Note those instructions are for an older ns version so the line numbers do not match, but the basic idea is the same.
- Disable ARP inside ns2 by following the instructions here.
- Compile ns2 and run install. This will probably fail the first couple of times asking you to install additional dependencies. These should all be available on apt if you’re running debian.
- Install matplotlib and numpy
- At this point you should be able to clone the repo and run our script using the instructions for the EC2 instance
Again we do not recommend this approach. It involves making very specific edits to the ns2 code base and is tricky to do properly. The AMI that we provide already has the correct version of ns2 installed that we’ve modified appropriately.