TU Delft
 
MultiProbe
Research
Parallel and Distributed Systems


Rationale
why and how is this work relevant?
Real-world IP applications such as Peer-to-Peer file sharing are now able to benefit from network and location awareness. It is therefore crucial to understand the relation between underlay and overlay networks and to characterize the behavior of real users with regard to the Internet. For this purpose, we have designed and implemented MultiProbe, a framework for large-scale P2P file-sharing measurements. Using this framework, we have performed measurements of BitTorrent, which is currently the P2P file-sharing network generating the largest amount of Internet traffic. We analyze and correlate these measurements to provide new insights into the topology, the connectivity, and the path characteristics of the parts of the Internet underlying P2P networks, as well as to present unique information on BitTorrent throughput and connectivity.




Introduction
why do we need MultiProbe and such measurements

The topology and the characteristics of the Internet have a large impact on the operation of Internet applications such as multicast overlays [1], web hosting [2], and Peer-to-Peer (P2P) file-sharing systems [3]. In order to understand and improve the performance of these applications, it is essential to evaluate the way the underlay network supports the application overlays. In this work we present a measurement framework that allows joint large-scale measurements of the Internet and the BitTorrent P2P overlay, which is arguably the largest current Internet application.

Large P2P networks continuously have more than 1,000,000 users (Slyck.com lists size information for many large P2P networks), and P2P file-sharing networks now account for more than one third of the total Internet traffic [4,5]. Over the past couple of years, BitTorrent has become the largest P2P file-sharing network in terms of generated traffic [5]. We therefore focus our study on correlating the characteristics of BitTorrent and its Internet underlay. For this, we have designed and implemented MultiProbe, a large-scale P2P measurement framework.

With MultiProbe, we build on the experience accumulated in our previous work [6], where we investigated the high-level characteristics and the user behavior of BitTorrent. Here, we complement our previous work by focusing on the following research topics:

  • Measuring underlay/overlay networks
    How can we build a large-scale distributed infrastructure that measures P2P and Internet characteristics at the same time? How can we select a small part of the P2P network such that the measurement results are meaningful?
     
  • Characterizing overlay networks and their users
    Where are the overlay network users located? What is the geographical distribution of traffic? What is the connectivity among the users? What is the application throughput?
     
  • Correlating underlay/overlay measurements
    What is the topology, and what are the characteristics of those parts of the Internet that act as the BitTorrent underlay? What is the size of BitTorrent, for various Internet metrics? What are the ports on which BitTorrent traffic is present?
     

The main contribution of our work is that we present a correlated view of the dominant overlay network and the Internet, based on measurements taken in May 2005. We show evidence that the majority of BitTorrent users are located in Europe but, at the same time, that BitTorrent traffic is globally spread. We correlate the users' geographical locations with the generated traffic. Our results confirm that BitTorrent users are well connected by the Internet in terms of latency, hop count, and bandwidth. Finally, we find that while standard ports are still regularly used, the highest traffic volume is generated on non-standard ports.

Our work is related to the work in [6-10]. One of the first studies characterizing P2P file-sharing networks is the work of Saroiu et al. [7]. They characterize the one-point-to-target latency, the bottleneck bandwidth, the user connection/disconnection frequency, and the number of files in the network, and correlate this data. However, their results suffer from the vantage point effect (bias caused by the positioning of the measurement infrastructure), and are representative of the first-generation P2P networks Napster and Gnutella rather than of a second-generation P2P network such as BitTorrent. A comprehensive model of the KaZaA P2P network is presented in [8]. Both studies have a P2P-modeling perspective, and do not attempt to also characterize the underlying network. In [9], a 5-month trace of a single file shared using the BitTorrent protocol is presented. The file comes from the operating system domain, and is thus not representative of P2P, where users download mostly movies and music. Their results need confirmation from a broader study, such as this one. In [10], the authors detail traffic characteristics for three popular P2P file-sharing networks: FastTrack, Gnutella, and DirectConnect. The traffic information is collected from the border routers of a single, though large-scale, ISP. Due to router limitations in generating traffic statistics, standard ports are used to identify P2P traffic. In this work we show that such an approach would be ineffective for BitTorrent, because most of the traffic and at least 50% of the users are exchanging TCP packets over non-standard (for Unix and for BitTorrent) ports. A comprehensive study of BitTorrent's characteristics has been performed by three of the authors of this work in [6]. That work investigates the high-level characteristics and the user behavior of BitTorrent. The present work adds the needed peer-level and network-level insights, and is therefore the natural complement of this previous analysis.
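
To illustrate why port-based identification can miss BitTorrent traffic, the following is a minimal Python sketch, not the MultiProbe implementation. It assumes that "standard" means the 6881-6889 default listening range of the original BitTorrent client (the well-known Unix ports mentioned above are not modeled), and that each observed user is summarized by a hypothetical (port, bytes transferred) pair. The names `port_class` and `share_by_port_class` are illustrative only.

```python
from collections import Counter

# Assumption: 6881-6889 is treated as the "standard" BitTorrent port range.
STANDARD_BT_PORTS = range(6881, 6890)


def port_class(port):
    """Label a TCP port as 'standard' or 'non-standard' for BitTorrent."""
    return "standard" if port in STANDARD_BT_PORTS else "non-standard"


def share_by_port_class(records):
    """Compute user and traffic shares per port class.

    records: iterable of (port, bytes_transferred) pairs, one per user
             (a hypothetical input format, not the MultiProbe trace format).
    Returns a dict mapping class -> (user_share, traffic_share).
    """
    users = Counter()
    traffic = Counter()
    for port, nbytes in records:
        cls = port_class(port)
        users[cls] += 1
        traffic[cls] += nbytes

    total_users = sum(users.values())
    total_traffic = sum(traffic.values())
    return {
        cls: (
            users[cls] / total_users,
            traffic[cls] / total_traffic if total_traffic else 0.0,
        )
        for cls in users
    }
```

A classifier that only inspects the standard range would report just the "standard" row of this result; whenever the "non-standard" shares are large, most users and most traffic go uncounted.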

The MultiProbe framework is described on the MultiProbe page. The measurements are discussed in detail on the measurements page. On this page we focus on briefly presenting the results of these measurements, as well as on commenting on some of our main findings.



Results
brief description

Table MP-3. Statistics of the size of our measurements.

Table MP-4. The percentages of users per continent and country.

Figure MP-3. The Internet Organizations location: (a) CDF of number of users; (b) CDF of users' weights. The horizontal axis shows the rank of Internet Organizations, with respect to their percentage.

Figure MP-4. The Autonomous Systems location: (a) CDF of number of users; (b) CDF of users' weights. The horizontal axis shows the rank of Autonomous Systems, with respect to their percentage.

Table MP-5. The distribution of IP path hops for paths between PlanetLab peers and overlay network peers which have contacted them.

Table MP-6. The distribution of AS traversals for paths between PlanetLab peers and overlay network peers which have contacted them.

Table MP-7. The distribution of intra-AS path hop counts for paths between PlanetLab peers and overlay network peers which have contacted them.

Table MP-8. The distribution of the measured Round-Trip Time (RTT), per class.

Figure MP-5. The distribution of measured RTT: (a) detailed distribution and zoom to RTTs below 1.5 seconds; (b) CDF of RTT. The vertical axis shows the respective percentages of numbers of RTTs.

Figure MP-6. Number of incoming connections per 5-minute time unit.

Table MP-9. The distribution of users and user weights per TCP port.

Figure MP-7. Application-level bandwidth: (a) number of users per bandwidth value; (b) average bandwidth per swarm size.

Detailed explanations of the meaning of this section's graphs can be found in our publications. Overall conclusions about these results are included in the Research section of this page.
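
As a reading aid for the rank-based CDFs in Figures MP-3 and MP-4, the sketch below shows one plausible way such a curve can be computed; it is an illustration under stated assumptions, not the procedure used for the figures. It assumes a hypothetical mapping from a group identifier (an Internet Organization or an Autonomous System) to a number of users or a summed user weight, ranks the groups from largest to smallest, and accumulates their fractions. The function name `rank_cdf` and the identifiers in the usage note are made up for the example.

```python
def rank_cdf(counts):
    """Build a rank-ordered cumulative distribution.

    counts: dict mapping a group identifier (e.g., an AS) to a
            non-negative number of users or summed user weight
            (hypothetical input; the real data format is not shown here).
    Returns a list of (rank, group, cumulative_fraction), with rank 1
    being the largest group.
    """
    total = sum(counts.values())
    if total == 0:
        return []

    # Rank groups by decreasing size; ties keep their input order.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

    result = []
    running = 0
    for rank, (group, n) in enumerate(ordered, start=1):
        running += n
        result.append((rank, group, running / total))
    return result
```

For example, with the made-up input `{"AS-A": 50, "AS-B": 30, "AS-C": 20}`, the curve reaches 0.5 at rank 1, 0.8 at rank 2, and 1.0 at rank 3. A steep initial rise indicates that a few groups hold most of the users or weight.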


Research
what did we do, after all?
We have presented a correlated view of overlay networks and the Internet. For that purpose, we have designed, implemented, and deployed MultiProbe, a large-scale P2P measurement framework. Large-scale joint measurements of BitTorrent and the Internet were conducted in May 2005, and correlated into comprehensive statistical data, in four categories: location, route, connectivity, and traffic.

The main new results are: (1) the majority of BitTorrent users are located in Europe but, at the same time, BitTorrent traffic is globally spread; (2) BitTorrent peers are well connected by the Internet in terms of latency, hop count, and bandwidth; (3) BitTorrent peer connectivity does not depend on the swarm size; (4) BitTorrent has shifted in the last year from static to random TCP port selection.

For the future, we plan to enable our framework to measure several other large-scale P2P networks, and to repeat our experiments in the new context.






Publications, conferences, talks
validating our work...
A. Iosup, P. Garbacki, J.A. Pouwelse, and D.H.J. Epema, Correlating Topology and Path Characteristics of Overlay Networks and the Internet, (submitted for publication).
 
A. Iosup, P. Garbacki, J.A. Pouwelse, and D.H.J. Epema, Correlating Topology and Path Characteristics of Overlay Networks and the Internet, Technical Report 2005-002, PDS group/TU Delft, ISSN 1387-2109, October 2005.
report PDF [425KB]
 

References
these studies have enabled us to work on this project
  1. G. Phillips, S. Shenker, and H. Tangmunarunkit, Scaling of multicast trees: comments on the Chuang-Sirbu scaling law, In Proceedings of SIGCOMM'99, pp. 41-51, ACM Press, 1999.
  2. P. Krishnan, D. Raz, and Y. Shavitt, The Cache Location Problem, In IEEE/ACM Transactions on Networking, Vol. 8(5), pp. 568-582, IEEE Press, 2000.
  3. B. Zhao, A. Joseph, and J. Kubiatowicz, Locality aware mechanisms for large-scale networks, In Proceedings of Workshop on Future Directions in Distributed Computing (FuDiCo), pp. 229-238, Bertinoro, Italy, June 2002.
  4. T. Karagiannis, A. Broido, N. Brownlee, k. claffy, and M. Faloutsos. Is P2P dying or just hiding? In Global Internet and Next Generation Networks (Globecom 2004), Dallas, Texas, US, Dec 2004.
  5. A. Parker, The True Picture of Peer-To-Peer File-Sharing, Panel Presentation, IEEE 10th International Workshop on Web Content Caching and Distribution, Sophia Antipolis, France, September 2005.
  6. J. Pouwelse, P. Garbacki, D. Epema, and H. Sips. The BitTorrent P2P file-sharing system: Measurements and analysis. In Proceedings of the 4th International Workshop on Peer-To-Peer Systems (IPTPS'05), Ithaca, New York, USA, February 2005.
  7. S. Saroiu, P. Gummadi, and S. Gribble. A measurement study of peer-to-peer file sharing systems. In Proceedings of Multimedia Computing and Networking, 2002.
  8. K. P. Gummadi, R. J. Dunn, S. Saroiu, S. D. Gribble, H. M. Levy, and J. Zahorjan. Measurement, modeling, and analysis of a peer-to-peer file-sharing workload. In SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles, pp. 314-329. ACM Press, 2003.
  9. M. Izal, G. Urvoy-Keller, E. Biersack, P. Felber, A. Al Hamra, and L. Garces-Erice. Dissecting BitTorrent: Five months in a torrent's lifetime. In Passive and Active Measurements (PAM 2004), April 2004.
  10. S. Sen and J. Wong. Analyzing peer-to-peer traffic across large networks. In Second Annual ACM Internet Measurement Workshop, November 2002.

 
     

Last modified: Thu, 13 October, 2005 6:33 PM
The newest version of this page can be found at: http://multiprobe.ewi.tudelft.nl/research.html
Copyright © 1998-2005 Alexandru Iosup. All Rights Reserved.