Peerware

A Real P2P Information Retrieval Testbed

   

Overview
Peerware is the prototype of a real Peer-to-Peer Information Retrieval system that can be deployed on a Network of Workstations (e.g. instructional labs, clusters, etc.). Peerware allows you to validate various ideas and algorithms in a real setting. It consists of the following modules:

Modules
  1. A P2P environment written in Java (the Java Middleware)
  2. The Lucene Information Retrieval (IR) API, used by each peer to evaluate queries against its local index
  3. A set of shell scripts that can be used to deploy the Java Middleware
Peerware modules can be extended in order to simulate different realistic environments. For example, you can substitute the local IR index with a Postgres database and perform SQL queries, or you can replace the random graph generator with a power-law graph generator, and the rest of the system remains unaffected. Peerware has been tested with 1000 peers over a LAN of 75 machines (in 3 subnets), but in practice it could also be deployed in much larger environments (e.g. a WAN).
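The generator swap described above can be pictured as one small interface with interchangeable implementations. The following is a minimal sketch under stated assumptions, not Peerware's actual code: the names TopologyGenerator and RandomTopology, the degree parameter and the adjacency-set representation are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical extension point: anything that can produce an overlay topology. */
interface TopologyGenerator {
    /** Returns, for each peer id 0..peers-1, the ids of its neighbors. */
    List<Set<Integer>> generate(int peers, Random rng);
}

/** Random graph: every peer opens `degree` undirected links to distinct random peers. */
final class RandomTopology implements TopologyGenerator {
    private final int degree;

    RandomTopology(int degree) {
        this.degree = degree;
    }

    @Override
    public List<Set<Integer>> generate(int peers, Random rng) {
        if (degree < 0 || degree >= peers) {
            throw new IllegalArgumentException("need 0 <= degree < peers");
        }
        List<Set<Integer>> adjacency = new ArrayList<>();
        for (int p = 0; p < peers; p++) {
            adjacency.add(new TreeSet<>());
        }
        for (int p = 0; p < peers; p++) {
            // Pick `degree` distinct targets other than p itself.
            Set<Integer> targets = new TreeSet<>();
            while (targets.size() < degree) {
                int q = rng.nextInt(peers);
                if (q != p) {
                    targets.add(q);
                }
            }
            // Links are undirected, so record them on both endpoints.
            for (int q : targets) {
                adjacency.get(p).add(q);
                adjacency.get(q).add(p);
            }
        }
        return adjacency;
    }
}
```

Under this design, a power-law generator would simply be a second class implementing the same interface, so code that consumes the adjacency sets would not need to change.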
Peerware currently supports popular query routing algorithms such as BFS flooding, RBFS, ISM and >RES. For details, please refer to the overview in the Elsevier Information Systems Journal paper (see Related Publications).
You can also add synthetic failures (churn) and perform bulk queries, i.e. automatically run a whole list of queries written in Lucene Query Syntax (Boolean queries, fuzzy searches, proximity searches, range searches, term boosting, wildcard searches, etc.).
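To make the difference between BFS flooding and RBFS concrete, here is a small in-memory simulation. It is a sketch under assumptions rather than Peerware's implementation: it assumes that flooding forwards a query to every neighbor until a TTL expires, that RBFS forwards to a random fraction of the neighbors instead, and that peers drop queries they have already seen. The class FloodingSketch, the method flood and the fraction parameter are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Random;
import java.util.Set;

final class FloodingSketch {
    /** Outcome of one simulated query. */
    static final class Result {
        final Set<Integer> reached; // peers that saw the query, including the source
        final int messages;         // forwarded copies, duplicates included

        Result(Set<Integer> reached, int messages) {
            this.reached = reached;
            this.messages = messages;
        }
    }

    /**
     * Simulates one query. fraction = 1.0 gives plain flooding;
     * a smaller value in (0, 1) gives the random-subset (RBFS-style) variant.
     */
    static Result flood(List<List<Integer>> neighbors, int source, int ttl,
                        double fraction, Random rng) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        Set<Integer> seen = new HashSet<>();
        seen.add(source);
        Queue<int[]> queue = new ArrayDeque<>(); // entries: {peer, remaining TTL, sender}
        queue.add(new int[] {source, ttl, -1});
        int messages = 0;
        while (!queue.isEmpty()) {
            int[] current = queue.remove();
            int peer = current[0];
            int remaining = current[1];
            int sender = current[2];
            if (remaining == 0) {
                continue; // TTL exhausted, do not forward further
            }
            List<Integer> candidates = new ArrayList<>(neighbors.get(peer));
            candidates.remove(Integer.valueOf(sender)); // never send back to the sender
            Collections.shuffle(candidates, rng);
            int fanout = Math.min(candidates.size(),
                    (int) Math.ceil(fraction * candidates.size()));
            for (int next : candidates.subList(0, fanout)) {
                messages++;
                if (seen.add(next)) { // a peer processes a query only the first time
                    queue.add(new int[] {next, remaining - 1, peer});
                }
            }
        }
        return new Result(seen, messages);
    }
}
```

Calling flood with fraction 1.0 and then with, say, 0.5 on the same graph and TTL lets you compare how many peers are reached against how many messages are spent, which is the basic trade-off between the two strategies.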

Download Now!

Source Code (Linux / Cygwin)

Includes a makefile which compiles the system +
Partial Classified Reuters-21578 Text Categorization Test Collection (based on country)

Send us an email here so that we can send you the password

FULL Dataset

Classified Reuters-21578 Text Categorization Test Collection (based on country)

Send us an email here so that we can send you the password

Manuals


Source Code Documentation (javadoc - in progress)

Outline of Operation
  • Add any XML collection to the rawdata/ folder
  • Initiate a script that indexes the data
  • Initiate a script that constructs a random graph (each index is one peer)
  • Optionally, visualize and analyze the graphs using the Pajek Visualizer/Analyzer
  • Initiate a script that probes the network for available machines
  • Start the P2P network, using various churn levels
  • Perform various queries/traceroutes
  • Analyze performance metrics such as messages, query recall and query time
The system has automated shell scripts for all operations and therefore you can launch an experiment on a large number of machines in as little as 30 seconds!
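As an illustration of the graph visualization step, the sketch below renders an overlay graph in the plain-text Pajek .net format (a *Vertices section followed by an *Edges section, with vertices numbered from 1). The class PajekExport, the method toPajek and the "peerN" labels are hypothetical; the file layout that Peerware's own scripts produce may differ.

```java
import java.util.List;

final class PajekExport {
    /** Renders an undirected graph (adjacency lists, peer ids 0..n-1) as Pajek .net text. */
    static String toPajek(List<List<Integer>> neighbors) {
        StringBuilder out = new StringBuilder();
        out.append("*Vertices ").append(neighbors.size()).append('\n');
        for (int p = 0; p < neighbors.size(); p++) {
            // Pajek numbers vertices from 1; the quoted label is free text.
            out.append(p + 1).append(" \"peer").append(p).append("\"\n");
        }
        out.append("*Edges\n");
        for (int p = 0; p < neighbors.size(); p++) {
            for (int q : neighbors.get(p)) {
                if (p < q) { // emit each undirected link only once
                    out.append(p + 1).append(' ').append(q + 1).append('\n');
                }
            }
        }
        return out.toString();
    }
}
```

Writing the returned string to a file with a .net extension should give something Pajek can open directly.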

Related Publications
  • D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, "Exploiting Locality for Scalable Information Retrieval in Peer-to-Peer Systems", Information Systems Journal, Elsevier Publications, Volume 30, Issue 4, Pages 277-298, 2005. (available in pdf)

  • D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, "Information Retrieval Techniques for Peer-to-Peer Networks", IEEE CiSE Magazine, Special Issue on Web Engineering, IEEE Publications, pp. 12-20, July/August 2004. (available in pdf)

  • D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, "On Constructing Internet-Scale P2P Information Retrieval Systems", Intl. Workshop On Databases, Information Systems and P2P Computing DBISP2P (co-located with VLDB'2004), LNCS 3367, pp. 136-150, Toronto, Canada, 2004. (available in pdf). Presentation available in ppt.

  • V. Kalogeraki (HP Labs), D. Gunopulos (UCR) and D. Zeinalipour-Yazti (UCR), "A Local Search Mechanism for Peer-to-Peer Networks", 11th International Conference on Information and Knowledge Management (CIKM'2002), McLean, Virginia, USA, November 4-9, 2002. (available in pdf). Conference presentation available in ppt.


  • D. Zeinalipour-Yazti, "Exploiting the Security Weaknesses of the Gnutella Protocol", Dept. of Computer Science, University of California, Riverside, May 2002. (Available in pdf format).

  • D. Zeinalipour-Yazti and T. Folias, "Quantitative Analysis of the Gnutella Network Traffic", TR-CS-89, Dept. of Computer Science, University of California, Riverside, June 2002. (available in pdf)



People
Demetrios Zeinalipour-Yazti, PhD Candidate, UCR
Dimitrios Gunopulos, Associate Professor, UCR
Vana Kalogeraki, Assistant Professor, UCR

                     