Homework Assignment #2


Look for genomic regions that might be regulatory sites, using Galaxy and the UCSC Browser. Start at Galaxy.

Step 1: Upload a custom track of intervals that were predicted to be regulatory regions. Note that the coordinates of these intervals are relative to an outdated assembly of the human genome, called hg17. To upload the track, click on "Get Data" (upper left), then on "Upload File". Then paste this URL into the window:
http://www.bx.psu.edu/~ross/share/PReMod_hg17.bed.txt
To tell Galaxy that this comes from build hg17 (not the current assembly), under "Genome" select hg17. Finally, click "Execute".
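
If you have not worked with interval files before, the following Python sketch shows how one line of a BED-style file can be read. The sample line is invented for illustration and is not taken from the PReMod file; the function name parse_bed_line is also just a name chosen here. BED coordinates are 0-based and the end position is exclusive, so the length of an interval is end minus start.

```python
def parse_bed_line(line):
    """Split one BED-style line into (chromosome, start, end)."""
    fields = line.split()  # BED fields are separated by tabs or spaces
    chrom = fields[0]
    start = int(fields[1])  # 0-based start position
    end = int(fields[2])    # end position, not included in the interval
    return chrom, start, end


# Invented example line (assumption: not real data from the assignment file).
sample = "chr7\t1000\t1500\texample_module"
interval = parse_bed_line(sample)
length = interval[2] - interval[1]
print(interval, length)
```

Any fields after the first three (such as a name or score) are ignored by this sketch.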

Step 2: Convert the annotation to the current human assembly, hg18, by selecting "Lift-Over" and converting the coordinates to hg18.

Step 3: Upload an annotation of all highly conserved regions in hg18 by making the following choices: Get Data -> UCSC Main table browser, then select group: Comparative Genomics; track: 28-Way Most Cons; region: genome; output: send output to Galaxy. Then click "get output" and "Send query to Galaxy". (It is a good idea to look at every set of data that you give Galaxy.)

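Step 4: Find which predicted regulatory regions are highly conserved among mammals by selecting Operate on Genomic Intervals -> Intersect the intervals of two queries. The two queries are the lifted-over predictions from Step 2 and the conserved regions from Step 3.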
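
To see what an intersection of two interval sets means, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Galaxy implements the operation: the intervals are invented, the function name intersect is hypothetical, coordinates are treated as 0-based with an exclusive end, and the sketch reports only the overlapping pieces (Galaxy's tool may offer other ways of reporting overlaps).

```python
from collections import defaultdict


def intersect(first, second):
    """Return the overlapping pieces of two lists of (chrom, start, end)."""
    # Group the second set by chromosome so only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in second:
        by_chrom[chrom].append((start, end))

    pieces = []
    for chrom, start, end in first:
        for other_start, other_end in by_chrom[chrom]:
            lo = max(start, other_start)
            hi = min(end, other_end)
            if lo < hi:  # a non-empty overlap exists
                pieces.append((chrom, lo, hi))
    return sorted(pieces)


# Invented intervals for illustration only (not real predictions or conservation data).
predicted = [("chr7", 100, 200), ("chr7", 500, 600), ("chr1", 10, 50)]
conserved = [("chr7", 150, 250), ("chr7", 590, 700), ("chr2", 10, 50)]
overlaps = intersect(predicted, conserved)
print(overlaps)
```

This compares every pair of intervals on the same chromosome, which is fine for a small example but slow for genome-wide data; that is one reason to let Galaxy do the real work.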

Step 5: View the intervals in the UCSC Browser by clicking on the result of the intersection operation in the history panel (far right) and then clicking on "Display at UCSC main".

Write (in a plain text file) a short paragraph describing the number and location of these conserved putative regulatory regions around the CFTR gene, and send it to webb@bx.psu.edu. I'll be available Wednesday morning to answer questions, at webb@bx.psu.edu or in 506B Wartik Lab.