Team Manikanta Kotaru
Paper Te-Yuan Huang , Nikhil Handigol , Brandon Heller , Nick McKeown , Ramesh Johari, Confused, timid, and unstable: picking a video streaming rate is hard, Proceedings of the 2012 ACM conference on Internet measurement conference, November 14-16, 2012, Boston, Massachusetts, USA – which is from hereby referred to as ‘the paper’.
Commercial video streaming services like Netflix and Hulu gained popularity in recent years. As suggested in the paper, video streaming services account for more than 50 percent of the download traffic during peak hours. These statistics reveal the presence of huge consumer base and user Quality of Experience is the prime factor for the services to sustain in the business. A video of better quality is more pleasing to eyes. However, a high video quality demands more data to be transferred for the same duration of video, and hence a higher bit-rate. If the network cannot sustain the required video bit-rate, buffering events occur which degrades user experience. The paper observes an interesting phenomenon where, in the presence of a competing flow, the video bit-rate drops much lower than what the network can sustain. The paper describes this as “downward spiral effect”. The paper makes an attempt to understand this phenomenon which thereby helps create solutions to improve user Quality of Experience. The paper further describes controlled experiments and client side modifications to validate the proposed reasoning.
The goal of the video streaming service is to provide the highest possible video bit-rate that can be sustained by the network. The content of these services is hosted on commercial third-party CDNs which stream over HTTP. This forces the service provider to apply rate selection algorithms, to choose the ‘right’ bit rate, on client side. The current algorithms for rate selection are based on the bandwidth available, which is estimated using the observed throughput of data transfer over HTTP. The paper explains that the TCP congestion control dynamics in the presence of competing flows make the observed throughput a bad estimate of available bandwidth. This coupled with the conservative rate selection (by conservative, the paper means choosing a bit-rate less than the estimated throughput) results in the client choosing a video bit rate which is extremely suboptimal.
Results from the paper
We provide a brief overview of key results in the paper to set the context and point the reader to the paper for more details. Figure 4(a) in the paper (Figure 1 below) illustrates how the video bit-rate chosen by the client rapidly reduces to the lowest possible value in the presence of a competing TCP flow. One can observe that in the absence of competing flow, the client chooses the best video playback rate possible. However, in the presence of a competing flow, the playback rate rapidly drops to lowest possible value. Ideally, the two flows should share the 5Mbps capacity of bottleneck link equally as expected from TCP behavior. With 2.5Mbps equal share bandwidth available, the streaming service would sustain the playback rate of 1750 kbps. Instead, the playback rate drops to 235 kbps.
The paper suggests that this effect occurs because of the following cycle of events. There is a playback buffer at the client side which can hold at-most 240 seconds of video (for legal reasons). The client requests for 4 second video segments from the server whenever enough playback buffer space is available. If size of the flow (in terms of size of the file requested for download) is short enough so that TCP flows do not reach equilibrium, then the flows do not share the bottleneck bandwidth equally. So, the throughput perceived by the client is less than the equal share bandwidth for smaller file sizes. The conservative rate selection algorithm results in selecting lower video bit-rate than the perceived throughput. Lower video bit-rate corresponds to requesting file of even smaller size further worsening the throughput perceived by the client. And the cycle continues.
The paper summarizes the main factors contributing to the above cycle of events as: 1) Conservative rate selection resulting in requesting for video segments of smaller size. 2) The behavior of the client which aims to fill the playback buffer. The client now has to wait till enough playback buffer space is available before requesting new file. The wait (TCP inactivity) results in congestion window time out. The download of new segment has to now ramp up from slow-start which decreases perceived throughput. 3) Smaller segment size (4 seconds of video) leading to downloading files of smaller size.
1. Recreating figure 4(a) in the paper which essentially summarizes the problem faced by video streaming services: We create a general application with the same underlying characteristics video streaming service client and recreate the behavior observed in the paper. This subgoal helps in concluding that any application (not necessarily video streaming services) which has the described properties faces similar problem when it interacts with the network.
2. Investigating the buffer size at bottleneck link as an additional important factor contributing to downward spiral effect: The analysis of the paper mainly concentrated around size of the file being the main contributor for lower perceived throughput. Reducing the size of the buffer at bottleneck link allows TCP congestion control dynamics to kick in early leading to quick adjusting of the congestion windows of all the flows. Then, even for smaller file sizes, TCP reaches equilibrium bettering the throughput estimate. The buffer size,120 kbit, for measurements in the paper is large (RTT is 4 ms and bottleneck bandwidth is 5 Mbps leading to bandwidth delay product of 20 kbit). We reduce the buffer size and show significant improvement in video playback rate.
We created the following topography used in the paper and in CS 244 assignment 2, using Mininet.
The bottleneck bandwidth is 5 Mbps and RTT is 4 ms. Default buffer size is 120 kbit as used in the paper, but we vary the buffer size in later experiments. C is the server which acts as CDN which hosts the content. Host A acts like client of video streaming service ‘A’ described in the paper. We created the client using description given in the paper and data in Figure 9 of the paper (more about it in the implementation details). Host B provides the competing TCP flow by establishing long-lived (400 second long) iperf flow with B as iperf server and C as the iperf client. The competing flow starts 400 seconds after the client (host A) established TCP connection with server (C). The following plot illustrates the video playback rates chosen by the client over time.
Result in the paper:
Figure 2 is recreated plot of figure 4(a) in the paper using the experimental setup described above.
As described in the paper, we see a rapid drop in the video playback rate chosen by the client after the start of competing TCP flow. We observe that the video playback rate falls to 560 kbps in the presence of competing flow and does not go all the way to the 235 kbps as observed in the paper (This may be due to other implementation details of the streaming service that are not mentioned in the paper. Also, the files are compressed at the server in a video streaming service which we do not do in our experiments). However, clearly the throughput perceived is much below the equal-share bandwidth. This result supports the arguments in the paper that the characteristics of the application described in the paper are sufficient to explain the phenomenon (as we are able to recreate the phenomenon with just these details) and other implementation details of the streaming service play a less significant role, if at all.
Experimentation with buffer size
Now we vary the buffersize at bottleneck link and the results are presented in Figure 3. As we can observe, the worst video playback rate improved from 0.56 Mbps to 1.05 Mbps just by decreasing the buffer size from 120 kbit to 75 kbit. With buffer size of 20 kbit (which is 1/6th of the buffer size used in the paper), the bottleneck sharing is reasonably fair.
Experimenting with arbitrary playback rate
Conservative rate selection leads to requesting for files of smaller size. The effect is even worsened due to discrete set of playback rates available. The discrete set is mainly for simplicity and speed. We can code the video segment on the fly so that the server can support requests for non-discrete playback rate. However, it is an expensive process in terms of both time and computational complexity. Also, it is not profitable for CDNs to provide coding because they host content for different kinds of services and coding is not general enough requirement for all of these services. So, it is not done in commercial video streaming services. However, for the sake of decoupling the effect due to discrete video playback rate from other effects described in the paper, we perform an experiment where the client can request arbitrary video playback rate. The rate selection is still conservative in the sense that it requests for playback rate which is a fraction of the perceived throughput but now it need not look for the next smallest video playback rate among the discrete set available. The results are displayed in Figure 4.
As expected, we observe improvement from discrete case. However, more interesting results are obtained for 1.0 conservatism where we request playback rate equal to the perceived throughput. The results are displayed in Figure 5.
As we reduce the buffer size the bandwidth sharing is more fair. And when buffer size is 20 kbit, the bandwidth is infact shared equally between both the flows. Inspite of all the factors mentioned in the paper still playing role in this experiment, smaller buffer size and non-discrete playback rate resulted in equal sharing of bottleneck bandwidth.
What will be the impact of these results on the design at the client side?
The paper describes client side modifications to mitigate partially the downward spiral effect. However, since the buffer size at bottleneck link is not in the control of service provider, there is nothing much it can do to combat the problem. Also, non-discrete video playback rate is complex to implement. However, these results show that the understanding of the paper is incomplete. These two factors play a significant role apart from other characteristics mentioned in the paper and just correcting for these factors, with other factors still in place, resulted in equal sharing of bandwidth.
1) The playback buffer at the client can at most hold 240 seconds. The buffer is constantly drained out at rate 1 i.e., one second of data is drained out every second. The buffer occupancy is increased by the video segment size (in seconds) downloaded. The playback buffer occupancy can be obtained from these values measured over time.
2) A video segment of 4 seconds is requested when enough space is available in the buffer
3) The paper suggests that there is a separate file for each possible playback rate in the CDNs. But it will not be possible to support non-discrete playback rate requests using such system, as required by our experiments. So, an equivalent situation is created by having the client request HTTP byte range requests of size equal to the product of video playback rate and video segment size (4 seconds). Hence, the client can request for file sizes in granularity of bytes rather from discrete set of file sizes available. A possible real implementation would be to have a single file at the server and code it, on the fly, to different playback rates depending on the rate requested by the client.
4) An actual client uses persistent HTTP connection i.e., same TCP connection for all the requests. It was hard to make multiple ‘cURL’ requests over the same TCP connection. So, for our experiments, a new connection is established for each video segment download. This results in congestion window starting from a smaller initial window size (initial window size of 4) during slow start phase compared to persistent HTTP connection (initial window size of 10).
5) The perceived throughput is calculated by dividing the video segment file size (in bytes) by the time taken to fetch it. The playback rate is chosen based on the average throughput for the last 10 fetches. The video playback rate that was chosen by an actual client is obtained from figure 9 in the paper. The figure displays the relation between perceived throughput and requested playback rate.
6) Throughput of the competing flow is obtained from periodic bandwidth measurements provided by iperf tool.
7) Mininet is a simple platform and is sufficient for reasonably accurate measurement results as we are not dealing with very high speed links.
8) To smooth the throughput displayed in graph, a moving average filter of length 5 is operated on the actual data.
README: Setting up and running the experiment
1. Clone to git repository using command:
git clone https://firstname.lastname@example.org/mkotaru/cs244_2014_pa3.git
2. Create EC2 c3.large instance following the instructions on CS244 course webpage http://www.stanford.edu/class/cs244/ec2.html
3. Export the code to the instance.
4. On executing run.sh, default experiment is conducted according to the parameters in the paper (results in Figure 2). However, you can play with the setup by adding optional parameters as described in README file in git repository.
I thank Te-Yuan Huang for providing the data in Figure 9 of the paper which presents the relation between video playback rate a commercial client requests and the perceived throughput.