A Quantitative Analysis of the Gnutella Network Traffic

Demetrios Zeinalipour-Yazti Theodoros Folias
Department of Computer Science
University of California - Riverside, CA 92507, USA

Instructor: Michalis Faloutsos
CS204 - Course Webpage: http://www.cs.ucr.edu/~michalis/COURSES/204-02/204.html
Important Dates:
Project proposal: Thu 25 April 2002.
Literature survey: Thu 9 May 2002.
Project presentation: Thu 13 - Fri 14 June 2002.
Final project paper: Mon 17 June 2002.
Peer-to-Peer (P2P) file-sharing systems such as Gnutella, Morpheus and Freenet have recently attracted a lot of interest from the Internet community because they realize a distributed infrastructure for sharing files. Such systems have shifted the Web's client-server paradigm toward a client-client model. Their tremendous success has shown that purely distributed search systems are feasible and that they may change the way we interact on the Internet. P2P systems offer attractive features such as robustness, scalability and high fault tolerance, but at a price. Most research concentrates on optimizing the communication and data models of such systems, and little work has been done on analyzing the systems themselves. Most approaches are based on simulation models, which can lead to wrong observations and solutions. In this project we investigate the behavior of the Gnutella system by analyzing large log traces obtained with gnuDC, our Distributed Gnutella Crawler. We describe the design and implementation choices behind gnuDC and then its architecture. We analyze 56 million messages collected by 17 workstations over a 5-hour interval. We also present an extensive analysis of the IP addresses observed in the Gnutella network. We believe that our study will facilitate the design of new, more efficient communication algorithms between peers.
Download Now!

Gnutella IP/DNS Dataset

This is a real dataset of IP addresses and DNS names obtained from the Gnutella network on June 1st, 2002.
Send us an email to receive the password
(please mention your university and project name).

# Sample from Trace
n ACAFBF76.ipt aol com
n ACC085DF.ipt aol com
n AC9FA848.ipt aol com
n AC8BE4F5.ipt aol com
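
The sample lines above can be read with a few lines of Python. This is a minimal sketch, not the authors' tooling. It assumes that the first token (`n`) is an opaque tag whose meaning the page does not explain, and that the remaining whitespace-separated tokens are the labels of a DNS name (so `ACAFBF76.ipt aol com` stands for `ACAFBF76.ipt.aol.com`). The function names are hypothetical.

```python
from collections import Counter


def parse_host_line(line):
    """Split a trace line into (tag, hostname).

    Assumption: the first token is an unexplained tag and the remaining
    tokens are DNS labels that were separated by spaces instead of dots.
    """
    tag, *labels = line.split()
    if not labels:
        raise ValueError(f"no hostname in line: {line!r}")
    return tag, ".".join(labels)


def domain_suffix(hostname, depth=2):
    """Return the last `depth` labels of a hostname, e.g. 'aol.com'.

    This is a naive grouping key; it does not handle suffixes such as
    'co.uk' that need a public-suffix list.
    """
    return ".".join(hostname.split(".")[-depth:])


sample = [
    "n ACAFBF76.ipt aol com",
    "n ACC085DF.ipt aol com",
    "n AC9FA848.ipt aol com",
    "n AC8BE4F5.ipt aol com",
]

hosts = [parse_host_line(line)[1] for line in sample]
domain_counts = Counter(domain_suffix(host) for host in hosts)
print(domain_counts.most_common())
```

Grouping by the last two labels is only a rough way to see which providers dominate the trace; a real analysis would likely need a proper public-suffix lookup.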
Gnutella Query Dataset

This is a real dataset of Gnutella queries obtained from the Gnutella network on June 1st, 2002.
Send us an email to receive the password
(please mention your university and project name).

# Sample from Trace
#Timestamp, Query hash, Hops, Query
1022923082988 1034225212 6 basketball mp3
1022923082992 1353386410 6 07 schiller zeitgeist
1022923082995 1087244513 6 sepultura mp3
1022923082997 1055852157 6 karaoke mp3
1022923083000 1588070977 6 de palmas mp3
1022923085134 1218447459 5 photoshop 7 zip
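
A query trace in this layout can be loaded as follows. This is a minimal sketch under stated assumptions: fields are separated by whitespace, the query text is everything after the third field, and the timestamp is in milliseconds since the Unix epoch (which is consistent with the June 1st, 2002 capture date, but not stated on the page). The record type and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import NamedTuple


class QueryRecord(NamedTuple):
    timestamp_ms: int  # assumed: milliseconds since the Unix epoch
    query_hash: int
    hops: int
    query: str


def parse_query_line(line):
    """Parse 'timestamp hash hops query...' into a QueryRecord.

    The query itself may contain spaces, so split at most three times.
    A line with fewer than four fields raises ValueError.
    """
    ts, qhash, hops, query = line.strip().split(None, 3)
    return QueryRecord(int(ts), int(qhash), int(hops), query)


def seen_at(record):
    """Convert the record's timestamp to a UTC datetime (assumes ms)."""
    return datetime.fromtimestamp(record.timestamp_ms / 1000, tz=timezone.utc)


sample = [
    "1022923082988 1034225212 6 basketball mp3",
    "1022923082992 1353386410 6 07 schiller zeitgeist",
    "1022923082995 1087244513 6 sepultura mp3",
    "1022923082997 1055852157 6 karaoke mp3",
    "1022923083000 1588070977 6 de palmas mp3",
    "1022923085134 1218447459 5 photoshop 7 zip",
]

# Skip comment lines such as '# Sample from Trace' if reading a whole file.
records = [parse_query_line(l) for l in sample if not l.startswith("#")]
hops_histogram = Counter(r.hops for r in records)
print(seen_at(records[0]).date(), dict(hops_histogram))
```

Splitting at most three times keeps multi-word queries such as `07 schiller zeitgeist` intact, which matters for any keyword-frequency analysis built on top of the records.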
Interested in using a real P2P information retrieval testbed? Download Peerware!


Download Paper
Adobe Acrobat PDF (353 KB)
Zipped PostScript (255 KB)
HTML (latex2html version)

Download Presentation
PowerPoint PPT (1.01 MB)

Reading & Resources