PuFFIN - A Parameter-free Method to Build Genome-wide Nucleosome Maps from Paired-end Sequencing Data
Anton Polishko*, Evelien M. Bunnik**, Karine Le Roch** and Stefano Lonardi*
*Department of Computer Science and Engineering, University of California, Riverside CA 92521
**Department of Cell Biology and Neuroscience, University of California, Riverside CA 92521
We introduce PuFFIN, a novel method for building genome-wide nucleosome maps that is specifically designed to take advantage of paired-end reads. The availability of paired-end reads enables our method to detect a higher number of nucleosomes. In contrast to other approaches, which require users to optimize several parameters according to their data (e.g., the maximum allowed nucleosome overlap or legal ranges for the fragment sizes), our method can accurately determine a genome-wide set of non-overlapping nucleosomes without any user-defined parameters. This feature makes PuFFIN significantly easier to use and prevents users from choosing "bad" parameters and obtaining suboptimal nucleosome maps. Here, for the first time, we frame the problem of determining genome-wide nucleosome locations in a multi-scale (or multi-resolution) framework. Our algorithm builds a set of nucleosome "landscape functions" at different resolution levels, in which each function represents the likelihood that a genomic location is occupied by a nucleosome. After a set of candidate nucleosomes is computed for each function, our method produces a consensus set that satisfies the non-overlapping constraints and maximizes the number of nucleosomes.
Results: We report comprehensive experimental results that compare PuFFIN with recently published tools (NSeq, NPS, NOrMAL and Template Filtering) on real datasets for S. cerevisiae and P. falciparum. The results show that our approach detects more non-overlapping nucleosomes than the other available tools.
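
To make the multi-scale idea described above more concrete, the following Python sketch illustrates one possible reading of it. It is not the PuFFIN implementation: the Gaussian smoothing, the scale values in `sigmas`, the fixed `half_width` of 73 bp (a nucleosome footprint of roughly 147 bp), and all function names are assumptions introduced for illustration only. Fragment centers are smoothed at several scales to form landscape functions, local maxima become candidate nucleosomes, and a standard earliest-end greedy selection (classic interval scheduling) picks a largest set of mutually non-overlapping candidates.

```python
import math


def gaussian_kernel(sigma):
    """Normalized, truncated Gaussian weights (radius of about 3 sigma)."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def landscape(centers, length, sigma):
    """Hypothetical landscape function: smoothed counts of fragment centers."""
    counts = [0.0] * length
    for c in centers:
        if 0 <= c < length:
            counts[c] += 1.0
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    smoothed = [0.0] * length
    for i in range(length):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < length:
                acc += w * counts[j]
        smoothed[i] = acc
    return smoothed


def local_maxima(values):
    """Positions with a positive value that exceed their left neighbor
    and are not below their right neighbor."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] > 0 and values[i] > values[i - 1] and values[i] >= values[i + 1]
    ]


def candidates(centers, length, sigmas, half_width):
    """Collect candidate intervals from the landscape at every scale."""
    found = set()
    for sigma in sigmas:
        for peak in local_maxima(landscape(centers, length, sigma)):
            found.add((peak - half_width, peak + half_width))
    return sorted(found)


def max_non_overlapping(intervals):
    """Earliest-end greedy: a maximum-size set of disjoint closed intervals."""
    chosen = []
    last_end = -math.inf
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start > last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


# Toy input: fragment centers clustered around positions 100 and 300.
toy_centers = [99, 100, 100, 101, 299, 300, 301]
toy_candidates = candidates(toy_centers, length=400, sigmas=[2, 8], half_width=73)
consensus = max_non_overlapping(toy_candidates)
```

The greedy step is used here because, when the only objective is the number of selected intervals, choosing candidates in order of their right endpoint is known to be optimal. A method that also weighs candidates by score or resolution level would need a different selection rule.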