Peer-to-Peer (P2P) file-sharing systems such as Gnutella, Morpheus and Freenet have recently attracted a lot of interest from the Internet community because they provide a distributed infrastructure for sharing files. Such systems have shifted the Web's client-server model toward a client-client model. Their tremendous success has shown that purely distributed search systems are feasible and that they may change the way we interact on the Internet. P2P systems offer many exciting new features such as robustness, scalability and high fault tolerance, but at a price. Most research concentrates on optimizing the communication and data models of such systems, while little work has been done on analyzing the systems as deployed. Most approaches are based on simulation models, which can lead to wrong observations and solutions.
In this project we investigate the behavior of the Gnutella system by analyzing large log traces obtained with gnuDC, our Distributed Gnutella Crawler. We describe gnuDC's design and implementation choices and then its architecture. We analyze 56 million messages gathered by 17 workstations over a 5-hour interval. We have also carried out an extensive analysis of the IP addresses observed in the Gnutella network. We believe that our study will facilitate the design of new, more efficient communication algorithms between peers.
|
|
Gnutella IP/DNS Dataset
This is a real dataset of IP addresses and DNS names obtained from the Gnutella network on June 1st, 2002.
|
Send us an email here to receive the password (please mention University + Project Name)
|
|
# Sample from Trace
n 172.175.191.118 ACAFBF76.ipt aol com
n 172.192.133.223 ACC085DF.ipt aol com
n 172.159.168.72 AC9FA848.ipt aol com
n 172.139.228.245 AC8BE4F5.ipt aol com
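A minimal Python sketch of how the sample lines above could be read. The field meanings are assumptions inferred from the sample only: the leading `n` is kept as an opaque flag, the second field is the IPv4 address, and the remaining tokens are treated as hostname labels and rejoined with dots. The names `IpRecord`, `parse_ip_line` and `ip_as_hex` are hypothetical helpers, not part of gnuDC. In the four sample rows the first hostname label happens to equal the IP address written in hexadecimal, which the sketch makes visible.

```python
import ipaddress
from typing import NamedTuple


class IpRecord(NamedTuple):
    flag: str                      # leading token ("n" in the sample); meaning not documented here
    ip: ipaddress.IPv4Address      # peer address as observed in the trace
    hostname: str                  # remaining tokens rejoined with dots (assumption)


def parse_ip_line(line: str) -> IpRecord:
    """Split one trace line into flag, IP address and hostname."""
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"unexpected trace line: {line!r}")
    flag, ip_text, *labels = tokens
    return IpRecord(flag, ipaddress.IPv4Address(ip_text), ".".join(labels))


def ip_as_hex(ip: ipaddress.IPv4Address) -> str:
    """Return the four address bytes as eight uppercase hex digits."""
    return ip.packed.hex().upper()


if __name__ == "__main__":
    sample = [
        "n 172.175.191.118 ACAFBF76.ipt aol com",
        "n 172.192.133.223 ACC085DF.ipt aol com",
    ]
    for line in sample:
        rec = parse_ip_line(line)
        print(rec.ip, rec.hostname, ip_as_hex(rec.ip))
```

Keeping the flag as a plain string avoids guessing at its meaning; the hex comparison is only an observation about these sample rows and may not hold for other providers in the dataset.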
|
NEW!
Gnutella Query Dataset
This is a real dataset of Gnutella queries obtained from the Gnutella network on June 1st, 2002.
|
Send us an email here to receive the password (please mention University + Project Name)
|
|
# Sample from Trace
#Timestamp, Query hash, Hops, Query
1022923082988 1034225212 6 basketball mp3
1022923082992 1353386410 6 07 schiller zeitgeist
1022923082995 1087244513 6 sepultura mp3
1022923082997 1055852157 6 karaoke mp3
1022923083000 1588070977 6 de palmas mp3
1022923085134 1218447459 5 photoshop 7 zip
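A short Python sketch for reading the query trace, following the column order given in the header line (timestamp, query hash, hops, query). Two points are assumptions rather than documented facts: the timestamp is interpreted as milliseconds since the Unix epoch, which is consistent with the stated collection date, and the query text is taken to be everything after the third field, since queries contain spaces. `QueryRecord`, `parse_query_line` and `parse_trace` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Iterator, NamedTuple


class QueryRecord(NamedTuple):
    time: datetime     # timestamp converted to UTC (assumes epoch milliseconds)
    query_hash: int    # hash value as given in the trace
    hops: int          # hops field from the trace
    query: str         # free-text query, may contain spaces


def parse_query_line(line: str) -> QueryRecord:
    """Parse one data line; the first three fields are integers."""
    ts_text, hash_text, hops_text, query = line.split(maxsplit=3)
    millis = int(ts_text)
    # Integer arithmetic keeps the millisecond part exact.
    when = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    when += timedelta(milliseconds=millis % 1000)
    return QueryRecord(when, int(hash_text), int(hops_text), query.strip())


def parse_trace(lines: Iterable[str]) -> Iterator[QueryRecord]:
    """Yield records, skipping blank lines and '#' comment/header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        yield parse_query_line(line)


if __name__ == "__main__":
    sample = [
        "#Timestamp, Query hash, Hops, Query",
        "1022923082988 1034225212 6 basketball mp3",
        "1022923085134 1218447459 5 photoshop 7 zip",
    ]
    for rec in parse_trace(sample):
        print(rec.time.isoformat(), rec.hops, rec.query)
```

Splitting with `maxsplit=3` keeps multi-word queries such as `photoshop 7 zip` intact instead of mistaking the digit for another numeric field.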
|
Interested in Using a real P2P Information Retrieval Testbed?
Download Peerware!
|
- General P2P Papers & Specifications
- "The Gnutella Protocol Specification v0.41" - Document Revision 1.2.
- Peer-to-Peer Computing,
Milojicic, Dejan S.; Kalogeraki, Vana; Lukose, Rajan; Nagaraja, Kiran; Pruyne, Jim; Richard, Bruno; Rollins, Sami; Xu, Zhichen, HPL-2002-57, HP Labs, 2002.
- Related Projects & Papers
- Modeling Large-scale Peer-to-Peer Networks, Mihajlo A. Jovanovic, Laboratory for Networks and Applied Graph Theory.
- The Anthill Project, Gnutella Monitoring, Department of Computer Science, University of Bologna.
- Mapping the Gnutella Network: Macroscopic Properties of Large-Scale Peer-to-Peer Systems, Matei Ripeanu, Ian Foster, University of Chicago.
- Using Mobile Agents for Network Resource Discovery in Peer-to-Peer Networks, Cameron Ross Dunne, School of Computer Applications, Dublin City University, Dublin 9, Ireland.
- On Power-Law Relationships of the Internet Topology, Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos, SIGCOMM 1999.
- Building P2P networks with good topological properties, Gopal Pandurangan, Prabhakar Raghavan, Eli Upfal.
- "Analysis of the Traffic on the Gnutella Network", Kelsey Anderson, University of California, San Diego, CSE222 Final Project, March 2001.
- Tracing a large-scale Peer to Peer System: an hour in the life of Gnutella, Evangelos P. Markatos. In Proceedings of CCGrid 2002: the Second IEEE International Symposium on Cluster Computing and the Grid, May 2002, pages 65-74.
- JAVA - Open Source Clients
- Others