CS179G (Spring 2003)

  Project Phase #4 - Sort-Merge Join
due date: Friday 13 June 2003


1 Introduction

In this assignment, you will implement the sort-merge join algorithm on top of Minibase.

2 Getting Started
Please download the Phase 4 files, phase4.tar.gz, into your working directory.

  1. cd project
  2. tar -zxvf phase4.tar.gz
Now you will, as usual, see 3 generated directories:
  1. lib/
  2. include/
  3. src/
src/ contains the files you will be working on. If you cd src/ and then run make*, it will create an executable named SortMerge. The included files are:
  • Makefile: A sample makefile to compile your project.
  • sortMerge.h: Specifications for the class sortMerge. You have to implement these specifications as part of the assignment.
  • SMJTester.C: sort-merge test driver program.
You will also find in the project directory the implementation of the external sort algorithm in the files sort.C and sort.h, and the interface to the Scan and HeapFile classes in include/.

Sample output of a correct implementation is available in out.

* Also make sure that you edit the Makefile so that the MINIBASE variable is set correctly for your environment.

3 Design Overview

class sortMerge {
  public:

   sortMerge(
     char       *filename1,    // Name of heapfile for relation R.
     int         len_in1,      // # of columns in R.
     AttrType    in1[],        // Array containing field types of R.
     // Array containing size of columns in R.
     short       t1_str_sizes[],
     int         join_col_in1, // The join column number of R.
     char       *filename2,    // Name of heapfile for relation S
     int         len_in2,      // # of columns in S.
     AttrType    in2[],        // Array containing field types of S.
     // Array containing size of columns in S.
     short       t2_str_sizes[],
     int         join_col_in2, // The join column number of S.
     char       *filename3,    // Name of heapfile for merged results.
     int         amt_of_mem,   // Number of pages available for sorting.
     TupleOrder  order,        // Sort order: Ascending or Descending.
     Status&     s             // Status of constructor.
   );
   ~sortMerge();
};
The sortMerge constructor joins two relations R and S, represented by the heapfiles filename1 and filename2, respectively, using the sort-merge join algorithm. Note that the columns for relation R (S) are numbered from 0 to len_in1 - 1 (len_in2 - 1). You are to concatenate each matching pair of records and write it into the heapfile filename3. The error layer for the sortMerge class is JOINS.
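As a rough illustration of what "concatenate each matching pair of records" could mean at the byte level, here is a minimal sketch. The helper name concatRecords and its parameters are hypothetical and are not part of Minibase; it also assumes that each record is available as a raw byte buffer of known length, which you should check against the HeapFile and Scan interfaces in include/.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (NOT part of Minibase): returns a new buffer that
// holds the bytes of the R record followed by the bytes of the S record.
std::vector<char> concatRecords(const char *rec_r, int len_r,
                                const char *rec_s, int len_s) {
    std::vector<char> joined(static_cast<std::size_t>(len_r) +
                             static_cast<std::size_t>(len_s));
    if (len_r > 0) {
        std::memcpy(joined.data(), rec_r, static_cast<std::size_t>(len_r));
    }
    if (len_s > 0) {
        std::memcpy(joined.data() + len_r, rec_s,
                    static_cast<std::size_t>(len_s));
    }
    return joined;
}
```

In an actual solution, the joined buffer would then be written to the result heapfile filename3 using whatever insert method the provided HeapFile class declares.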

You will need to use the following classes, which are provided: Sort, HeapFile, and Scan. You will call the Sort constructor to sort the input heapfiles, which means your primary responsibility is to implement the merging phase of the algorithm. To compare the join columns of two tuples, call the function tupleCmp (declared in sort.h). Once a scan is opened on a heapfile, the scan cursor can be positioned at any record within the heapfile by calling the Scan method position with an RID argument. The next call to the Scan method getNext will proceed from the new cursor position.
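The merging phase can be sketched independently of Minibase. The example below is only an illustration under simplifying assumptions: both inputs are in-memory vectors of integer keys already sorted in ascending order, and the function name mergeJoin is hypothetical. It does not use Sort, Scan, HeapFile, or tupleCmp. The saved index mark plays roughly the role that a saved RID passed to the Scan method position could play: when R contains duplicate join values, the S side has to be rewound to the start of the matching group.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative merge phase over two key lists sorted in ascending order.
// Returns every index pair (i, j) such that r[i] == s[j].
std::vector<std::pair<std::size_t, std::size_t>>
mergeJoin(const std::vector<int> &r, const std::vector<int> &s) {
    std::vector<std::pair<std::size_t, std::size_t>> out;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < r.size() && j < s.size()) {
        if (r[i] < s[j]) {
            ++i;  // R is behind: advance R.
        } else if (r[i] > s[j]) {
            ++j;  // S is behind: advance S.
        } else {
            const int key = r[i];
            const std::size_t mark = j;  // Start of the matching S group.
            // Pair every R tuple carrying this key with the whole S group.
            while (i < r.size() && r[i] == key) {
                j = mark;  // Rewind S to the start of the group.
                while (j < s.size() && s[j] == key) {
                    out.emplace_back(i, j);
                    ++j;
                }
                ++i;
            }
            // Here j already points just past the S group.
        }
    }
    return out;
}
```

For example, with r = {1, 2, 2, 4} and s = {2, 2, 3, 4}, the function produces five pairs: each of the two 2s in R matched with each of the two 2s in S, plus the single match on 4. A descending sort order would need the two comparisons reversed.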

4 What to Turn In

You are required to turn in your copy of all source files through the online secure site https://www.cs.ucr.edu. This includes all the files needed to build this phase and run the test program. The TAs will combine your source files with a template directory, type make, and run the program.

Please remember that late submissions will not be accepted. Make sure to start early!





sitemaster: Demetris Zeinalipour