We examine Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks, one of the most highly cited papers on the BitTorrent protocol. The paper’s major contribution was showing that the influx and outflow of seeds and peers can be described by a simple fluid model. The authors conducted their first experiments on a ‘simulated BitTorrent-like network’. We reproduce the experiment on a self-contained BitTorrent network emulated in Mininet. Because the paper’s experiments were run in simulation, we test whether actual BitTorrent clients running in an emulated network still follow the fluid model.
The original paper set out to develop a model describing the performance, scalability, and efficiency of BitTorrent, a Peer-to-Peer (P2P) protocol designed for file sharing. In particular, the authors wanted a model that characterizes the number of seeders and peers in a network for a particular torrent file, since these numbers determine how well the file sharing performs.
This problem is important because BitTorrent was a radical change from existing file transfer protocols. Because BitTorrent is so different from previous approaches to file transfer, characterizing its efficiency and performance in different scenarios was key to understanding how well the system behaves, and how it compares to traditional file transfer. Although people had a general understanding that BitTorrent likely scales well and is efficient, a model that describes BitTorrent’s behavior would be quite useful for understanding both how the system behaves, and the strengths and weaknesses of the protocol for future research.
Results from Paper
The authors of the paper performed three experiments to validate their fluid model against the behaviors of the BitTorrent network.
In the first experiment, a private BitTorrent-like network is simulated, with all the client nodes and tracker servers fully controlled by the authors. All network characteristics, such as the bandwidth of each peer, the rate of download aborts, and the influx/outflow of seeds/peers, are determined by a Markov model, with the initial parameters held constant. The rate at which downloaders and seeders leave the system is the same in this experiment. The authors found that the normalized number of seeders and downloaders followed the behavior predicted by their simple fluid model very closely.
The second experiment is exactly the same as the first, except that the authors increase the rate at which seeds leave the system. This makes the uploading bandwidth the main bottleneck. The fluid model fits the results of the first and second experiments well, with the best fit coming from simulations where the arrival rate of new peers is low.
In the third experiment, the authors introduced a seed (a node with a fully downloaded file that is willing to share fragments with peers) into a real-life BitTorrent network and studied the influx/outflow of seeds and peers. The number of seeds in the network at any given time was characterized well by the fluid model. The number of downloaders in the network at any given time was characterized less well, but stayed within the bounds suggested by the 95% confidence interval.
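For reference, this is the simple fluid model as we read it in Qiu and Srikant’s paper. Here x(t) is the number of downloaders and y(t) the number of seeds, λ is the arrival rate of new peers, θ the rate at which downloaders abort, γ the rate at which seeds leave, c and μ the download and upload bandwidth of a peer (in files per unit time), and η the effectiveness of file sharing among downloaders.

```latex
\begin{aligned}
\frac{dx}{dt} &= \lambda - \theta\, x(t) - \min\{\, c\, x(t),\ \mu\,(\eta\, x(t) + y(t)) \,\} \\
\frac{dy}{dt} &= \min\{\, c\, x(t),\ \mu\,(\eta\, x(t) + y(t)) \,\} - \gamma\, y(t)
\end{aligned}
```

The min term is the total rate at which downloads complete: it is limited either by the aggregate download capacity c·x or by the aggregate upload capacity μ(ηx + y), whichever is smaller.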
Subset Goal/Motivation and Extensions
Our goal is to see if an emulated (rather than simulated) BitTorrent network follows the fluid model as shown in experiments 1 and 2 in the paper. We chose these experiments to reproduce because Mininet allows us to achieve the real-world behavior of experiment 3 while maintaining control of the network as in experiments 1 and 2. We originally planned on directly reproducing experiment 3, but we realized that we would need to offer a very popular Torrent file on the internet, which is something that we don’t have. We also considered examining the tracker for such files on the internet, but it did not seem possible to get the complete server logs for these, which is what we would need to recreate the graph in experiment 3.
We chose to use Mininet on an EC2 machine for our experiments. We chose Mininet because we were familiar with the system, and it seemed relatively easy to set up a BitTorrent system inside Mininet. We chose EC2 because it allowed us to emulate larger topologies than our laptops could handle. We were able to rerun our experiments several times and see similar behavior. We think the power of the machine that is emulating the network will affect the reproducibility of our results. If the machine is not very powerful, then it may not be able to sustain the throughput required for our experiments. Similarly, if the machine experiences periods of heavy load, the emulation may suffer. Although we don’t believe that Mininet affects our results, it would be interesting to see if a different network emulation environment would cause any differences in experimental results.
Our experiment involved setting up a topology in Mininet, creating a tracker, and then launching multiple BitTorrent clients in the Mininet topology to emulate a real BitTorrent network. To pick our topology, we drew inspiration from the backbone network of the US. We wanted to model a network where there was a cluster of clients in San Francisco, Los Angeles, and Seattle.
Figure 1. US AT&T Backbone Network 
To model this scenario, we created 3 switches in Mininet, each representing a “city” or backbone node. We also made the delay of one of the links twice as long as the other, to model the increased distance between San Francisco and Seattle. A diagram of our setup is reproduced below.
Figure 2. Mininet Topology
We run our experiment until 200 unique clients have entered the system. Like the authors, we model client arrivals as a Poisson process, and we model the departures of seeders and downloaders as Poisson processes as well. We do this by drawing the interarrival and interdeparture times from exponential distributions. We poll our tracker server every 5 seconds to obtain the current number of downloaders and seeders in the network. Then, we run the simple fluid model with the same parameters as our experiment and compare its predictions with the counts reported by our tracker.
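The arrival schedule can be sketched as follows. This is a minimal illustration of drawing exponential interarrival times, not our actual experiment script; the helper name `poisson_event_times` and the fixed random seed are made up for the example, and the rate of 0.1 arrivals per second is the value we use in our first experiment.

```python
import random


def poisson_event_times(rate, n_events, rng):
    """Return cumulative event times (in seconds) of a Poisson process.

    The gaps between consecutive events are drawn independently from an
    exponential distribution with mean 1 / rate.
    """
    t = 0.0
    times = []
    for _ in range(n_events):
        t += rng.expovariate(rate)  # exponential gap to the next event
        times.append(t)
    return times


# Hypothetical usage: 200 client arrivals at 0.1 arrivals per second.
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
arrivals = poisson_event_times(rate=0.1, n_events=200, rng=rng)
```

The same helper could be used for departures by passing the departure rate instead of the arrival rate.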
While reading the paper, our biggest conceptual challenge was the type of experiment the authors performed. We initially thought that by “simulation of a BitTorrent-like network,” the authors meant they ran BitTorrent clients in a simulated network. However, after discussing the paper with Keith, we realized that the authors did not actually run BitTorrent clients, but rather simulated their behavior. A pleasant side-effect of this realization was that our experiment was somewhat novel. We were able to see if an emulated network with actual BitTorrent clients (which we could control) matched the fluid model, which is an experiment that is closer to the behavior of real BitTorrent networks.
The largest implementation challenge we had was finding a suitable BitTorrent client. We needed a client that we could control via the command line, and that we could run multiple instances of in a single machine without any of them sharing state. Initially, we tried the Transmission client. However, we realized that because Transmission ran as a daemon, as each new client attempted to download the torrent, the Transmission daemon would see that the torrent was already downloaded and then refuse the download. As a result, we narrowed our search to clients that did not run as daemons. We then tried aria2c, which almost worked except that the client would not act as the ‘initial seeder.’ That is, it would not offer to seed the torrent without having downloaded it first, which the first peer in the network would have to do. We finally settled on ctorrent, which met all of our constraints.
Differences In Experiment Implementation from Paper
The authors were able to get 500-600 downloaders and seeders in their simulation. However, since we are running a BitTorrent network emulated in Mininet, our EC2 machine can only sustain ~15 downloaders and seeders at a time, each with 2Mb/s of bandwidth. The authors ran the experiment for 3.5 days, while we ran ours for about 40 minutes due to time constraints.
In the first experiment, downloaders and seeders leave the system at the same rate. We set the arrival rate λ=0.1, and the departure rates of both the seeders and downloaders to 0.01 (all of our rates are per second). From the clients’ torrent logs, we measured each client to have a download speed of about 2Mb/s and an upload speed of about 0.4Mb/s. We shared a 30.4Mb torrent file among the peers. We polled the tracker for the number of seeders and downloaders every 5 seconds. We reran the experiment multiple times, and present two runs below:
Figure 3. Experiment 1 Run 1
Figure 4. Experiment 1 Run 2
As you can see, while the fluid model follows the seeders moderately well, it has more difficulty modelling the number of downloaders, similar to the paper’s results in Experiment 3. In particular, we found that the fluid model consistently underestimated the number of downloaders in the system. We noticed that spikes in downloaders seemed to correspond with drops in seeders, and after these spikes, the system struggled to recover (for example, at 1000 seconds in Run 1). One hypothesis we had for these spikes was the burstiness of the arrival and departure processes. To test this, we created an experiment where new downloaders arrived at a regular periodic rate instead of as a Poisson process, and likewise for departures. However, we still saw similar behavior:
Figure 5. Periodic Arrivals and Departures
This leads us to believe that these spikes are due to the variance in network performance. We hypothesize that these spikes are due to either variances in the performance of our EC2 virtual machine, variances in Mininet, or variance in the BitTorrent protocol itself that the fluid model does not account for.
The second experiment differs from the first only in that the seed departure rate is increased from 0.01 to 0.05.
Figure 6. Experiment 2
As in Experiment 1, the fluid model predicts seeders more closely than it does downloaders, and the number of downloaders increases rapidly when the number of seeders is low (see the period between 1250-1500 seconds). Note, however, that the fluid model does predict a lower number of seeders in this experiment than in Experiment 1.
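To make the fluid model’s prediction for the two experiments concrete, here is a minimal sketch that integrates it with a forward-Euler step. It is not the script we used to produce the figures. In the model, downloaders x grow with arrivals at rate λ, shrink with aborts at rate θ, and turn into seeds y at a service rate min(c·x, μ(η·x + y)); seeds leave at rate γ. Several things below are assumptions made for illustration: the function name `fluid_model`, the efficiency η=1, starting from a single initial seed, and converting the measured 2Mb/s and 0.4Mb/s speeds into files per second by dividing by the 30.4Mb file size.

```python
def fluid_model(lam, theta, gamma, c, mu, eta=1.0,
                x0=0.0, y0=1.0, t_end=2400.0, dt=1.0):
    """Forward-Euler integration of the simple fluid model.

    x = downloaders, y = seeds. Returns a list of (t, x, y) tuples.
    eta=1.0 and the initial state (no downloaders, one seed) are
    assumptions for this sketch.
    """
    x, y, t = x0, y0, 0.0
    trace = [(t, x, y)]
    while t < t_end:
        # Download completion rate: limited by total download capacity
        # or by total upload capacity, whichever is smaller.
        service = min(c * x, mu * (eta * x + y))
        dx = lam - theta * x - service
        dy = service - gamma * y
        x = max(0.0, x + dt * dx)
        y = max(0.0, y + dt * dy)
        t += dt
        trace.append((t, x, y))
    return trace


FILE_SIZE = 30.4        # Mb, the shared file
C = 2.0 / FILE_SIZE     # assumed download rate in files per second
MU = 0.4 / FILE_SIZE    # assumed upload rate in files per second

# Experiment 1: seeds and downloaders leave at the same rate.
exp1 = fluid_model(lam=0.1, theta=0.01, gamma=0.01, c=C, mu=MU)
# Experiment 2: seeds leave five times faster.
exp2 = fluid_model(lam=0.1, theta=0.01, gamma=0.05, c=C, mu=MU)

for name, trace in (("experiment 1", exp1), ("experiment 2", exp2)):
    t, x, y = trace[-1]
    print(f"{name}: t={t:.0f}s downloaders={x:.2f} seeds={y:.2f}")
```

Under these assumed parameters the model settles at roughly 1.3 downloaders and 8.7 seeds for the first experiment, and roughly 3.6 downloaders and 1.3 seeds for the second, where upload capacity becomes the limiting term. The qualitative point is that raising the seed departure rate lowers the predicted number of seeds and raises the predicted number of downloaders.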
Fluid Model Predictions
In both our experiments, we found that the fluid model predicted the number of seeders well, even in the face of high variability. However, after the first large drop of seeders, the fluid model fails to predict the number of downloaders well.
A large drop in seeders in the network results in a large increase in downloaders: the aggregate seed upload rate decreases, so downloaders stay in the system longer. Unfortunately, the fluid model is not sensitive to these sudden changes, and this leads to a large prediction error. Its downloader predictions remain constant when, in reality, they should increase. The authors’ simulated BitTorrent system shows the fluid model fitting much better because there is little variability in the number of seeders and downloaders. This allows their network to reach the steady state characterized by the fluid model.
Comparison with Third Experiment
In their third experiment, the authors ran their fluid model against an unpopular real-world torrent. We notice that the variance in the number of seeds and downloaders matches the variance observed in the emulated BitTorrent networks we ran. We also notice that the fluid model suffers from the same insensitivity to this variance. In particular, we bring attention to the first large drop in seeders at t = 1000. Left unchanged, the fluid model would not have predicted the subsequent number of downloaders very well. However, for the time between 800min and 1300min, the authors let λ and γ change linearly, justifying their choice by citing their tracker logs. After this modification, their fluid model fits the data much better.
However, we see the same time-varying behavior in our experiments. This is in spite of the fact that our experiments hold λ and γ parameters constant, since we controlled precisely when downloaders arrived and when seeders left. We therefore wonder if there is some other time-varying factor missing from the fluid model, which may be likely as some subsequent studies have noted that the fluid model ignores several BitTorrent parameters [2, 3, 4].
In conclusion, we reproduced experiments 1 and 2 from the paper, except that we ran our experiment on an emulated BitTorrent network on Mininet. We observed results that were closer to the authors’ findings in experiment 3, where they ran the fluid model against a real-world BitTorrent network. We believe that there is a discrepancy in results because the fluid model is not very sensitive to variability in seeders/downloaders, leading to high prediction errors in systems where there are large fluctuations. The authors’ simulated results contain low variability, whereas our Mininet BitTorrent network and the real-world BitTorrent network have significantly higher variability. We hypothesize that the variability is a result of variability in the performance of EC2, in Mininet, or in the efficiency of BitTorrent itself. The authors were able to get their fluid model to match their real-world trace by time-varying certain parameters. However, even though we controlled these parameters (and held them constant) in our experiments, we were not able to match our observed downloaders with what the fluid model predicts. Therefore, we wonder if there is a separate time-varying parameter missing from the fluid model, as several follow-up papers mention deficiencies in the fluid model [2, 3, 4].
Reproducing our Results
To set up the EC2 machine, launch our community AMI. This can be done by going to the AWS website, setting the region to ‘Oregon’, then choosing Launch Instance. When it asks you to choose the AMI, navigate to the Community AMIs tab on the left scrollbar and search for AMI id ami-4cc23b2c. Select the first result that appears – it should be called ‘Bittorrent2’.
Configure the ports:
Log in to the machine:
ssh -X -i <path_to_key> ubuntu@<public_ip>
Go to the experiments directory:
To see the graphs, you should enable X-forwarding when ssh’ing in to the EC2 machine. Or you can just download the resulting png files (seeders.png and downloaders.png). Actually running our experiments is simple: Just run:
Once you do this, Mininet will create a Topology of 200 clients; THIS CAN TAKE A WHILE. Please be patient while it does so. Then, it will run the experiment until all clients have arrived into the network. This will take approximately 30-40 minutes. You can see the progress so far in your stdout. You can also poll the current number of downloaders and seeds in the network by running
tail -f results/stats
Once the experiment ends, it will run the fluid model, and create seeders.png and downloaders.png. You can view them (if you have X-forwarding enabled) by using:
And that’s it!
[1] Qiu, Dongyu, and Rayadurgam Srikant. “Modeling and performance analysis of BitTorrent-like peer-to-peer networks.” ACM SIGCOMM computer communication review 34.4 (2004): 367-378.
[2] Lua, Eng Keong, et al. “A survey and comparison of peer-to-peer overlay network schemes.” Communications Surveys & Tutorials, IEEE 7.2 (2005): 72-93.
[3] Eger, Kolja, et al. “Efficient simulation of large-scale p2p networks: packet-level vs. flow-level simulations.” Proceedings of the second workshop on Use of P2P, GRID and agents for the development of content networks. ACM, 2007.
[4] Legout, Arnaud, Guillaume Urvoy-Keller, and Pietro Michiardi. “Understanding bittorrent: An experimental perspective.” (2005): 16.