In this assignment, you will implement the sort-merge join algorithm on top of Minibase.
2 Getting Started
Please download the files for Phase 4 (phase4.tar.gz) into your working directory.
cd project
tar -zxvf phase4.tar.gz
Now you will, as usual, see 3 generated directories:
lib/
include/
src/
src/ contains the files you will be working on. If you cd into src/ and run make*, it will create an executable named SortMerge.
The included files are:
Makefile: A sample makefile to compile your project.
sortMerge.h: Specifications for the class sortMerge. You have to
implement these specifications as part of the assignment.
SMJTester.C: sort-merge test driver program.
You will also find in the project directory the implementation of the external
sort algorithm in the files sort.C and sort.h, and the interfaces
to the Scan and HeapFile classes in include/.
Sample output of a correct implementation is available in out.
* Also make sure that you edit the Makefile so that the MINIBASE variable matches your environment.
3 Design Overview
class sortMerge {
public:
    sortMerge(
        char *filename1,       // Name of heapfile for relation R.
        int len_in1,           // # of columns in R.
        AttrType in1[],        // Array containing field types of R.
        // Array containing size of columns in R.
        short t1_str_sizes[],
        int join_col_in1,      // The join column number of R.
        char *filename2,       // Name of heapfile for relation S.
        int len_in2,           // # of columns in S.
        AttrType in2[],        // Array containing field types of S.
        // Array containing size of columns in S.
        short t2_str_sizes[],
        int join_col_in2,      // The join column number of S.
        char *filename3,       // Name of heapfile for merged results.
        int amt_of_mem,        // No. of pages available for sorting.
        TupleOrder order,      // Sort order: Ascending or Descending.
        Status& s              // Status of constructor.
    );
    ~sortMerge();
};
The sortMerge constructor joins two relations R and S, represented by the
heapfiles filename1 and filename2, respectively, using the
sort-merge join algorithm. Note that the columns for relation R (S) are
numbered from 0 to len_in1 - 1 (len_in2 - 1). You are to
concatenate each matching pair of records and write it into the heapfile
filename3.
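As a rough illustration of the concatenation step, the sketch below joins two raw records into one buffer. It is a standalone example, not Minibase code: the function name concatRecords, the use of std::vector<char>, and the choice to place the R record's bytes before the S record's bytes are all assumptions made for illustration. Check the sample output to confirm the layout your implementation is expected to produce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Minibase): returns a new buffer holding
// the bytes of rec1 followed by the bytes of rec2.
// Assumption: the joined record is simply the R record followed by the S record.
std::vector<char> concatRecords(const char* rec1, int len1,
                                const char* rec2, int len2) {
    assert(len1 >= 0 && len2 >= 0);

    std::vector<char> joined;
    joined.reserve(static_cast<std::size_t>(len1) +
                   static_cast<std::size_t>(len2));
    joined.insert(joined.end(), rec1, rec1 + len1);  // bytes of the R record
    joined.insert(joined.end(), rec2, rec2 + len2);  // bytes of the S record
    return joined;
}
```

In an actual solution, the resulting buffer and its length would be handed to whatever record-insertion method the provided HeapFile interface in include/ declares.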
The error layer for the sortMerge class is JOINS.
You will need to use the following provided classes: Sort, HeapFile,
and Scan. You will call the Sort constructor to sort the input heapfiles
(which means your primary responsibility will be to implement the merging
phase of the algorithm). To compare the join columns of two tuples, you
will call the function tupleCmp (declared in sort.h). Once a scan
is opened on a heapfile, the scan cursor can be positioned at any record
within the heapfile by calling the Scan method position with an RID
argument. The next call to the Scan method getNext will proceed
from the new cursor position.
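The merging phase described above can be sketched on plain in-memory data. This is a minimal illustration under stated assumptions, not the required implementation: Row and mergeJoin are hypothetical names, std::vector stands in for the sorted heapfiles, integer comparison stands in for tupleCmp, and resetting the index j to mark plays the role of repositioning the scan on S with a saved RID. The sketch also assumes ascending order only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tuple: a join key plus some payload.
struct Row {
    int key;
    int payload;
};

// Merge phase of a sort-merge join over two inputs that are already sorted
// in ascending key order. Returns the payload pairs of all matching rows.
std::vector<std::pair<int, int>> mergeJoin(const std::vector<Row>& r,
                                           const std::vector<Row>& s) {
    auto byKey = [](const Row& a, const Row& b) { return a.key < b.key; };
    assert(std::is_sorted(r.begin(), r.end(), byKey));
    assert(std::is_sorted(s.begin(), s.end(), byKey));

    std::vector<std::pair<int, int>> out;
    std::size_t i = 0, j = 0;
    while (i < r.size() && j < s.size()) {
        if (r[i].key < s[j].key) {
            ++i;                       // R is behind: advance R
        } else if (r[i].key > s[j].key) {
            ++j;                       // S is behind: advance S
        } else {
            const int key = r[i].key;
            const std::size_t mark = j;  // start of the run of equal keys in S
            // Each R row with this key is paired with the whole S run.
            while (i < r.size() && r[i].key == key) {
                j = mark;              // rewind S to the start of the run
                while (j < s.size() && s[j].key == key) {
                    out.emplace_back(r[i].payload, s[j].payload);
                    ++j;
                }
                ++i;
            }
            // Here j is just past the S run and i is just past the R run.
        }
    }
    return out;
}
```

The rewind is what makes duplicate join values work: when both inputs contain several rows with the same key, every combination must be emitted, so the S side has to be re-read once per matching R row.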
What to Turn In
You are required to turn in your copy of all source files through the online secure site https://www.cs.ucr.edu. This includes all the files needed to make this phase and
run the test program. The TAs will combine your source files with a template directory,
type make, and run the program.
Please remember late submissions will not be accepted. Make sure to
start early!