Chris Lewis (email@example.com) & Long Zou (firstname.lastname@example.org)
Confused, Timid, and Unstable:
Picking a Video Streaming Rate is Hard by TY Huang et al.
The code for the project is publicly available! Get it at www.bitbucket.org/cmslewis/cs244-netflix-video-rate-experiment.
Motivation & Goals
Netflix has become one of the preeminent video streaming services on the Internet over the past few years. With over 23 million active users streaming videos each month, the service is single-handedly responsible for over one-third of peak Internet traffic in North America. This demands a robust infrastructure of geo-distributed Content Delivery Networks (CDNs) to deliver both video and audio via HTTP and TCP to clients. Furthermore, in an effort to optimize the user experience, Netflix intelligently chooses a video rate for the user based on available bandwidth. This problem of dynamic rate selection is a delicate balancing act: choose a video rate that’s too low, and video quality is unnecessarily degraded; choose a video rate that’s too high, and the playback buffer will repeatedly run dry, forcing the video to stall and rebuffer. Of course, having the user manually select the video rate is even more problematic, as it would demand constant attention and reaction to the quality of their own Internet connection.
As of late 2012, the dynamic rate selection algorithms of Netflix and other top streaming sites (such as Hulu and Vudu) performed well if the user was doing nothing on their computer other than streaming a single video from a Netflix CDN. In that case, the site correctly chose and maintained the video rate that maximized video quality given the available bandwidth. However, if a long-lived competing TCP flow (such as a large file download or another video stream) ran alongside the Netflix stream, Netflix drastically underestimated the optimal video rate, choosing a much lower rate than its fair share of the link bandwidth could support. This phenomenon, dubbed the downward spiral effect by TY Huang and colleagues in their 2012 paper on the subject, is attributable to a feedback loop between these services’ underestimation of the available bandwidth on the link and their excessive conservatism in the face of that perceived reduction. The authors’ goal in their paper is to observe this effect in controlled experiments, speculate as to its causes, and validate proposed strategies to prevent it from occurring. The problem is worth solving because it represents a profound opportunity to improve the video streaming experience in an age when users multitask more than ever on the Internet. By using available bandwidth efficiently, streaming services can optimize performance for a common usage scenario that affects millions of subscribers.
The original authors analyzed the rate selection algorithms of three video streaming services—Netflix, Hulu, and Vudu—in a variety of experiments with and without a competing flow, and found that all three suffered from the downward spiral effect when a competing flow was present. They provide an in-depth discussion of the likely cause. One component is the services’ tendency to reduce the client’s request rate once the playback buffer fills, after which the client requests one 4-second chunk every 4 seconds. These chunks are smaller at lower video rates, causing the client to underestimate the throughput (bytes per second) it receives. The other component is the services’ tendency to reduce the video rate too aggressively when they perceive this lower throughput. Together, these factors form a feedback loop that causes the video rate to plummet far below the level the available bandwidth can support. As a first step toward preventing the downward spiral, the authors suggest a less conservative backoff, better throughput filtering, and larger request segments. These approaches address both components of the feedback loop, allowing more accurate estimates of current throughput and less drastic reactions to perceived throughput drops.
Early in the course of this project, we met with TY, Aakanksha, and the two other teams working on this paper to gain a better understanding of the research process that the original authors used, the particular characteristics of the three streaming services they examined, and the challenges we would likely encounter. With TY’s encouragement, we settled on Netflix as the single service whose performance we would investigate. This would still allow us to demonstrate the downward spiral effect without having to spend time investigating the unique quirks of the Hulu and Vudu services. She advised us to tackle Figures 7(a) and 7(b) in the paper, shown below.
Firstly, we felt that these figures most effectively demonstrate the downward spiral effect because they compare and contrast the distribution of throughput experienced by the primary video stream when a competing flow is present or absent. Secondly, TY suggested that these figures would be the most appropriate given our 3-week project time-frame, as they could be reproduced from the Netflix client interface itself rather than from an emulator that only mimicked Netflix’s functionality. Thirdly, TY mentioned at the end of our meeting that Netflix adopted several of the solutions that she and her colleagues recommended in their paper. We thought it would be interesting to evaluate Netflix’s implemented fixes by comparing the with-competing-flow plot we obtain to Figure 7(b) above. We expect our no-competing-flow plot to match Figure 7(a), but our with-competing-flow plot should not show as drastic a difference as Figure 7(b), since Netflix has learned to counter the downward spiral effect and maintain high video throughput when its stream shares available bandwidth with a competing flow.
Our plot for the no-competing-flow case matches Figure 7(a) reasonably well. The CDF for each of the six video rates is concave and spends the majority of its time close to the maximum throughput rate of 2.5Mb/s. Our plot for the with-competing-flow case shows noticeably worse throughput than the no-competing-flow case, but as expected it isn’t nearly as bad as in the original paper. All six streams still spend at least 50% of the time above 2Mb/s, close to their allocated share of 2.5Mb/s. These results confirm our expectation that the recommendations Netflix adopted from the original paper are indeed effective in preventing the downward spiral effect, thereby improving video streaming performance and user experience in the presence of competition for available bandwidth.
We first had to find the right Python library to process our PCAP files. Many PCAP parsing libraries are available online, but after trying a number of them we found that most were poorly documented and had ambiguous syntax. We eventually settled on Scapy because it had the most complete documentation and was also recommended to us by another group working on this paper. Scapy’s syntax let us easily retrieve packet information at each network layer, which proved invaluable in our throughput calculations later on.
After installing Scapy, we next had to install all the supporting Python packages we would need. This process was very time-consuming because many of these packages had not been updated in several years, making them incompatible with the Mac OS X 10.8 and Python 2.7 environment we were working in. We thus had to find the source tar.gz archives and install each package manually; to spare users this hassle, we automated the package download/installation step in our run.sh script.
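For reference, the manual step that run.sh automates follows the standard source-install pattern (the package name here is a placeholder, not one of the actual dependencies):

```shell
# Hypothetical archive name; run.sh fetches and installs the real ones.
tar -xzf some_package.tar.gz
cd some_package
sudo python setup.py install
```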
Our third challenge was finding the correct methodology for calculating video throughput. We first tried packet-by-packet throughput, dividing the size of each packet by the difference between its timestamp and the previous packet’s timestamp. However, the resulting plot did not match our expectations, because there were periods within each second when no packets arrived. We therefore calculated throughput by bucketizing all packets into the second in which they were received. This methodology proved a good choice when we ran the no-competing-flow experiment: it produced plots showing that the Netflix video maintained close to the maximum throughput of 2.5Mb/s, as expected.
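The per-second bucketizing can be sketched in a few lines of Python; the packet list below is a made-up example, not real trace data:

```python
from collections import defaultdict

def throughput_per_second(packets):
    """packets: iterable of (timestamp_in_seconds, size_in_bytes) pairs.
    Buckets each packet into the whole second in which it arrived and
    returns a dict mapping that second to throughput in bits/second."""
    buckets = defaultdict(int)
    for ts, size in packets:
        buckets[int(ts)] += size
    return {sec: total * 8 for sec, total in buckets.items()}

# Hypothetical trace: three 1500-byte packets in second 0, one in second 1.
trace = [(0.10, 1500), (0.40, 1500), (0.95, 1500), (1.20, 1500)]
print(throughput_per_second(trace))  # {0: 36000, 1: 12000}
```

Because idle gaps within a second simply contribute nothing to that second’s bucket, this avoids the inflated instantaneous rates that the packet-by-packet method produced.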
Our final challenge was automation. We previously captured PCAP files manually using Wireshark, a process that required meticulous timing and coordination between playing the Netflix video, activating a competing flow, and starting packet capture. After finalizing our methodology and producing satisfactory results, we devoted our time to writing a run.sh script that encompasses the entire experiment, including setup, packet capture, and plotting. We kept the user interface intuitive, requiring users only to adjust video playback in Netflix, thereby minimizing the possibility of experimental errors.
Critique of Original Research
We started out with the thesis that Netflix has eliminated the downward spiral effect for its video streams by adopting recommendations proposed in the paper we read. Our experimental results support this thesis: we observe that Netflix video streams maintain throughput close to their maximum allocated bandwidth even in the presence of a competing flow. This strongly suggests that the downward spiral effect was indeed an issue that Netflix actively addressed and eventually resolved with the help of TY and her colleagues.
Extensions to Original Research
As a test of the robustness of Netflix’s fixes, we also performed our experiment with 4 competing flows instead of 1, to see whether Netflix could still prevent the downward spiral effect in the presence of multiple competing flows. We cannot confirm the validity of our results because this setup requires 12.5Mb/s of bandwidth so that each of the five flows gets a 2.5Mb/s share as before, and we do not know whether our computer’s Internet connection consistently achieved that rate over the course of the experiment. However, our preliminary results produced a CDF plot similar to our with-competing-flow plot above, with the video stream maintaining throughput close to its allocated maximum bandwidth. We therefore believe that the original paper proposed robust solutions that scale to multiple competing flows, as demonstrated by Netflix’s video streaming performance after adopting the recommendations.
Platform & Dependencies
We ran our experiments on a Macintosh computer because Mac OS X ships with a built-in traffic-shaping facility called dummynet that allows us to impose a bottleneck bandwidth on our machine. We used tcpdump to capture network packet data and chose Python as our main data-processing language because of the wide availability of libraries to help us parse PCAP files and plot throughput graphs.
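As a sketch of how these tools fit together on Mac OS X 10.8 (the pipe and rule numbers are arbitrary, the interface name may differ on other machines, and run.sh performs the equivalent steps automatically):

```shell
# Create a dummynet pipe limited to 5Mbit/s (with-competing-flow case)
sudo ipfw pipe 1 config bw 5Mbit/s
# Send all IP traffic through the pipe
sudo ipfw add 100 pipe 1 ip from any to any

# Capture packets during playback (interface name may differ)
sudo tcpdump -i en0 -w capture.pcap

# After the experiment, tear the configuration down
sudo ipfw delete 100
sudo ipfw pipe 1 delete
```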
The entire experiment is very reproducible and only requires a Macintosh computer with a working Internet connection, the Chrome browser, and Python installed. Our run.sh script will guide users through the rest of the setup process, such as downloading and installing the required Python libraries, installing a Chrome browser extension, and setting up Netflix.
Two parameters affect reproducibility the most. Firstly, we set the bottleneck bandwidth to 2.5Mb/s for the no-competing-flow case and 5Mb/s for the with-competing-flow case, so the computer must have an Internet connection faster than 5Mb/s. We believe this requirement is easily met on campus by any machine connected to Stanford’s network. Secondly, the user must not perform a large file download or stream large amounts of data while running our experiment, since these actions would act as additional competing flows. In other words, we must ensure that Netflix can see and use all of the bandwidth allocated to its video stream, without disruption from any network usage outside the experiment.
Instructions for Reproducing Our Experiment
- OS: This experiment currently requires Mac OS X.
- Applications: Make sure you have Python and the Google Chrome browser installed on your computer.
- Internet Connection: Ensure that the computer has a working internet connection, preferably to the Stanford network.
- Constraints: Do not perform any data streaming or large file downloads during the experiment, which will take approximately 30 minutes to complete.
To begin the experiment, open Terminal, navigate to the project folder, and run the following command:

    ./run.sh
This script will guide you through the entire experiment, including setup, capturing packets, and plotting.