Part 4: Finding SNPs that fall in suspected functional regions

Overview:
  1. Filter the input dataset (from Part 1) to keep only rows whose intervals overlap those in a library dataset of predicted regulatory regions.
  2. In a similar fashion, find rows in the same input dataset whose intervals overlap those in an ENCODE regulatory dataset (DNase clusters) obtained from UCSC.
  3. Run the PhyloP tool on the same input dataset to add a column of interspecies conservation scores. Then use the Histogram tool to help choose a suitable score threshold, and filter the SNPs on the score column to keep only those at highly conserved positions.
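
The logic behind these steps can be illustrated with a minimal Python sketch. This is not how the tools themselves are implemented. It assumes intervals are (chrom, start, end) tuples with 0-based, half-open coordinates (as in BED files), and that the conservation score has been added as a fourth field. The function names, the sample rows, and the threshold of 2.0 are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed layout: (chrom, start, end), 0-based half-open, as in BED.
Interval = Tuple[str, int, int]


def overlaps(a: Sequence, b: Sequence) -> bool:
    """True if two intervals are on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def keep_overlapping(rows: List[Sequence], regions: List[Interval]) -> List[Sequence]:
    """Steps 1 and 2: keep rows whose interval overlaps any region.

    A simple quadratic scan; fine for a sketch, too slow for genome-scale data.
    """
    return [row for row in rows if any(overlaps(row, r) for r in regions)]


def keep_conserved(rows: List[Sequence], threshold: float) -> List[Sequence]:
    """Step 3: keep rows whose score (assumed to be field 4) meets the threshold."""
    return [row for row in rows if row[3] >= threshold]


# Hypothetical SNP rows: chrom, start, end, conservation score.
snps = [
    ("chr1", 100, 101, 2.5),
    ("chr1", 500, 501, 0.1),
    ("chr2", 100, 101, 3.0),
]
# Hypothetical regulatory regions.
regions = [("chr1", 90, 110), ("chr2", 200, 300)]

in_regions = keep_overlapping(snps, regions)
conserved = keep_conserved(snps, threshold=2.0)  # threshold is illustrative only
print(in_regions)
print(conserved)
```

In practice the threshold would be chosen by inspecting the histogram of scores, as described in step 3, rather than fixed in advance.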