CS244 ‘24: TACK: Improving Wireless Transport Performance by Taming Acknowledgements


Hunter Hollenbeck and Joseph Moser

Introduction

This paper, authored by Prof. Weinstein and Huawei, works to reduce congestion along a wireless link by reducing the number of acknowledgements sent. The reduction in ACK frequency is achieved in two ways. First, the ACK frequency is defined as the minimum of two sending schemes: byte-counting ACK (one ACK per x packets) and periodic ACK (one ACK per x units of time). Each has a pitfall on its own: the byte-counting frequency grows without bound at high throughput, and the periodic scheme keeps sending at a fixed rate even when throughput is low. Second, more information is added to each ACK so that it carries more state, which makes loss recovery easier and lets the sender tolerate the delayed ACKs. These two changes, coupled with others including send rate control and an improved RTT calculation, deliver promising results for wireless networks and, most notably, streaming applications.
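To make the "minimum of two schemes" rule concrete, here is a minimal sketch. It is not taken from the paper's implementation. It assumes the rule has the form min(bw / (L * MSS), beta / RTT_min), and the defaults L = 2, beta = 4, and a 1500-byte MSS are our assumptions for illustration. The function and parameter names are hypothetical.

```python
def tack_frequency(bw_bps, rtt_min_s, L=2, beta=4, mss_bytes=1500):
    """ACKs per second under the assumed rule min(byte-counting, periodic).

    L, beta, and mss_bytes are illustrative defaults, not values we verified.
    """
    # Byte-counting term: one ACK for every L full-sized packets.
    byte_counting = bw_bps / (L * mss_bytes * 8)
    # Periodic term: beta ACKs per minimum round-trip time.
    periodic = beta / rtt_min_s
    return min(byte_counting, periodic)


# Hypothetical link: 10 Mbps with a 10 ms minimum RTT.
print(tack_frequency(10e6, 0.010))
```

Taking the minimum means the periodic term caps the ACK rate at high throughput, while the byte-counting term keeps the ACK rate proportional to the data rate at low throughput.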

Specifically, the paper's contribution is twofold: it rethinks how acknowledgements are sent, and it shows how that redesign can be tailored to specific applications. The application here is streaming: Miracast uses the TACK scheme for wireless projection from a Huawei smartphone to a TV. Using TACK reduces the rate of buffering compared to other TCP and UDP schemes, because the lower ACK frequency leads to higher throughput.

The motivation for reducing the ACK rate is that wireless transmission uses a shared medium, so a device must pause receiving packets before it can transmit one. Typical TCP flows send one ACK for every two full-sized data packets, which incurs significant overhead. Reducing the number of ACKs should therefore allow the forward data flow to proceed with fewer interruptions.

Goals and Chosen Result

The result we found most illuminating is the reduction in ACKs and the resulting goodput improvement under TACK. In their WLAN experiments, the authors report a reduction of up to 99.8% of all ACKs and a goodput increase of 28.1% with their TACK implementation over 802.11ac. This is a large performance boost. Recreating it would both confirm the significance of the paper and produce an open-source implementation of TACK, in contrast to the (very closed-source) version created for Huawei’s Miracast. The table we sought to recreate also includes values for legacy versions of 802.11, which gives us a baseline for what TACK should achieve at lower throughputs: even in 802.11b, there is still a 90.5% reduction in ACKs and a goodput increase of 20.0%. Other graphs and figures in the paper either show how one of these metrics changes across 802.11 generations or compare TACK to ACK thinning schemes (which merely delay the ACK for x packets), whereas this table gives holistic results for both ACK reduction and goodput improvement, the two main features of the paper.

Methodology of the Paper

Because TACK is implemented and currently in use in Huawei devices, the methodology is rather closed source from start to finish, which keeps the authors from using more user-friendly software that would ease replication. They use a commercial network emulator by Spirnet coupled with a TACK implementation built on TCP in userspace with Netmap. In addition, they modify the Linux kernel and build a TACK implementation there, which is rooted in adding an ACK thinning protocol to Linux that changes the ACK frequency based on the parameters TACK requires. They do not discuss the algorithmic choices in their implementation, nor the files edited to implement the more nuanced parts of TACK, such as the additional information in each ACK or the RTT calculation for a delayed ACK scheme.

Methodology We Used

With advice from Keith, our plan of attack was twofold: first, recreate TACK in a simulator (we worked with ns-3) to get a proof of concept, and then translate that into Linux by developing a kernel module. With the kernel module, we would evaluate performance across a wireless network. Another avenue of testing we considered was capturing real traffic from the internet and from communication with a Raspberry Pi to assess the real-world effect of reducing ACK frequency by so much.

This is certainly a divergence from the paper, especially from its approach of creating a program that repeatedly feeds dummy packets from sender to receiver. We had to diverge in this way because we could not devote the time in one quarter to building such a system, nor did we have a commercial switch like the one used in the paper.

Results

Although we spent much time trying to make the ns-3 simulation work and trying to implement a kernel module that went beyond mere ACK thinning, neither effort was fruitful in the end. The class interfaces in ns-3 separate ACK behavior from the congestion control algorithm, which makes it hard to modify the two in conjunction. This led us to pivot to building a simplified TACK system operating between the Linux kernel and a modified congestion control loadable kernel module. That effort was waylaid repeatedly by issues with compiling and installing kernels across multiple devices and VMs. The most reasonable results we obtained for comparison with the paper came from capturing traffic in Wireshark and calculating how the ACK frequency would change if TACK were applied at the observed RTT and throughput values.

This retroactive TACK calculation takes as its baseline the ACK frequency if an acknowledgement were sent for every data packet, and compares it with the TACK frequency given by the minimum expression from the paper.
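The shape of this calculation can be sketched as follows. This is an illustration under stated assumptions, not the code from our notebook: it assumes the minimum expression has the form min(bw / (L * MSS), beta / RTT_min), uses L = 2, beta = 4, and a 1500-byte MSS as placeholder defaults, and the capture values in the example are invented rather than measured.

```python
def ack_reduction(data_packets, payload_bytes, duration_s, rtt_min_s,
                  L=2, beta=4, mss_bytes=1500):
    """Fraction of ACKs saved relative to one ACK per data packet.

    All parameter names and defaults are hypothetical, for illustration only.
    """
    # Average throughput observed over the capture.
    throughput_bps = payload_bytes * 8 / duration_s
    # Assumed TACK rule: the smaller of the byte-counting and periodic rates.
    f_tack = min(throughput_bps / (L * mss_bytes * 8), beta / rtt_min_s)
    # Baseline: every data packet is acknowledged individually.
    baseline_acks = data_packets
    tack_acks = f_tack * duration_s
    return 1 - tack_acks / baseline_acks


# Invented capture: 8333 full-sized packets over 10 s with a 50 ms minimum RTT.
reduction = ack_reduction(8333, 8333 * 1500, 10.0, 0.050)
print(f"{reduction:.1%}")
```

The result depends strongly on which of the two terms is the minimum, so the same capture can give very different reductions for different RTT values.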

In doing so, we obtain an ACK reduction between 91.7% and 94.1% under the 802.11ac specification. For goodput, we calculated the potential increase from sending fewer ACK packets and got values between 0 and 5%. This is certainly less than what the paper reports, most likely because our ACK reduction is not as high and because we made no adjustments to send rate control or loss recovery for the reduced-ACK scheme.

Discussion

It is clear that our results are not a replication of the TACK paper. This is almost expected, given that we were not able to implement most of its features in a Linux kernel for a faithful replication. Even if we had reached that point, our concern throughout the project was that the specific methodology behind many of the more nuanced parts of TACK is obscured because the work is used in Huawei phones. Given more time, we could have created a faithful reproduction in kernel space and run our tests that way, rather than the more theoretical analysis we had to settle for.

Even so, we feel that obtaining an ACK frequency reduction within an order of magnitude of their 802.11ac results puts us on the right track and helps corroborate their findings. There are also key differences between their use of 802.11ac and ours (on the Stanford Wi-Fi network) in network quietness and throughput: we achieved around 10 Mbps, while they reported a capacity of over 500 Mbps. Given that gap, a smaller reduction seems expected.

To offer an earnest critique of the paper despite our shortcomings, our biggest question is how such a reduction in ACK frequency is achieved. Going from 95% in 802.11g all the way to 99.8% in 802.11ac is no small feat, and given our analysis using their frequency equation, we wonder how the frequency can get that low consistently.

One graph we looked at is Figure 8b, which shows how the ACK frequency differs between TCP and TACK for various RTTs in 802.11b and 802.11ac. It suggests that the ACK reduction in their implementation depends heavily on RTT. We find the opposite: in our calculations, the TACK frequency is more often set by the byte-counting term, which depends on throughput, than by the term that depends on RTT. In fact, the RTT term is a couple of orders of magnitude higher than the throughput term. It seems that the RTT would have to be quite high to reduce the frequency appreciably, let alone by 99.8 percent.

Coupled with this, the authors worked hard to obtain a less biased RTT estimate when multiple packets are covered by one ACK by smoothing the One Way Delay (OWD), reportedly showing that the original RTT calculation was increasing the minimum by 8-18%. What is interesting about this statistic is that if the RTT is 8-18% smaller because of their changes, the RTT term of the TACK frequency should be larger by roughly the same proportion, since the two are inversely related. Indeed, nothing seems to make the RTT term matter much in our work, yet it appears to have a large impact on theirs. This is interesting because the algorithm involved is not complex; it is mere multiplication, so we wonder where we might be going wrong or whether their higher throughput is a large factor. This is another point where some details of the work remain hidden because of its connection to commercial products.
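One way to probe the throughput question is to ask at what RTT the periodic term becomes the smaller of the two. The sketch below assumes the frequency rule has the form min(bw / (L * MSS), beta / RTT_min), with L = 2, beta = 4, and a 1500-byte MSS as illustrative defaults. Under that assumption the periodic term wins once RTT_min exceeds beta * L * MSS / bw. The function name is hypothetical.

```python
def crossover_rtt_s(bw_bps, L=2, beta=4, mss_bytes=1500):
    """Minimum RTT (seconds) above which the periodic term is the smaller one.

    Derived from beta / rtt < bw / (L * MSS); defaults are assumptions.
    """
    return beta * L * mss_bytes * 8 / bw_bps


# Roughly our throughput and the capacity reported in the paper.
for bw_mbps in (10, 500):
    rtt_ms = crossover_rtt_s(bw_mbps * 1e6) * 1e3
    print(f"{bw_mbps} Mbps: periodic term dominates above {rtt_ms:.3f} ms")
```

With these assumed parameters the crossover is about 9.6 ms at 10 Mbps and about 0.19 ms at 500 Mbps, which would be consistent with higher throughput being a large factor in how often the RTT term governs the frequency.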

Instructions for Reproduction

The data we used and the Jupyter notebook for analysis are available at a public repository here
