GenBank parsing: course project for cs234 -- Qi Fu

I am a graduate student in the Department of Computer Science and Engineering at the University of California, Riverside.

Description:

My GenBank parser takes a text input file in GenBank format and uses the annotation to identify the start and end positions of all the genes in the DNA sequence. Based on the parameter x, the upstream sequence [-x,0] of each gene, measured relative to the gene start, is returned. Any overlaps with other genes are also reported. If the region [-x,0] overlaps another coding sequence, the longest possible non-coding sequence is returned instead.
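
The upstream-extraction step described above could look roughly like the Python sketch below. It is only an illustration, not the project's actual code: the names Gene and upstream_regions are hypothetical, gene coordinates are assumed to be already parsed from the annotation, coordinates are assumed to be 0-based with an exclusive end, and only forward-strand genes are handled.

```python
from typing import Dict, List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record: 0-based start (inclusive), end (exclusive)."""
    name: str
    start: int
    end: int


def upstream_regions(
    sequence: str, genes: List[Gene], x: int
) -> Dict[str, Tuple[str, List[str]]]:
    """Map each gene name to (upstream sequence, names of overlapping genes).

    The window is the x bases before the gene start, clipped at the start
    of the sequence. If other genes overlap the window, it is shortened to
    the non-coding stretch that directly precedes the gene start.
    """
    results: Dict[str, Tuple[str, List[str]]] = {}
    for gene in genes:
        window_start = max(0, gene.start - x)
        window_end = gene.start

        # Other genes whose interval intersects the upstream window.
        overlapping = [
            other for other in genes
            if other is not gene
            and other.start < window_end
            and other.end > window_start
        ]

        # Move the window start past the end of every overlapping gene,
        # but never beyond the start of the current gene.
        trimmed_start = max(
            [window_start] + [min(o.end, window_end) for o in overlapping]
        )

        results[gene.name] = (
            sequence[trimmed_start:window_end],
            [o.name for o in overlapping],
        )
    return results


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_sequence = "AAAAACCCCCCCGGGTTTTTTTTTTAAAAA"
    toy_genes = [Gene("geneA", 5, 12), Gene("geneB", 15, 25)]
    for name, (upstream, overlaps) in upstream_regions(toy_sequence, toy_genes, 6).items():
        print(name, upstream, overlaps)
```

If a neighbouring gene extends all the way to the start of the gene of interest, this sketch returns an empty upstream sequence while still reporting the overlap. Handling genes on the complementary strand would require mirroring the window to the other side of the gene and reverse-complementing the result.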