Phenotype Association Tools in Galaxy

Part 2: Selecting known coding SNPs predicted to be damaging, then finding their genes and associated pathways

Overview:

Import a public library file containing pre-computed results from running PolyPhen-2 on the dbSNP database.
Join the input dataset with the PolyPhen-2 results row-by-row, based on interval overlap. This adds new columns to the input dataset, including the UniProt protein accession ID and the predicted effect of the SNP.
Filter the results to select rows containing the word "damaging". This matches both the "probably damaging" and "possibly damaging" PolyPhen-2 predictions.
Translate the UniProt IDs to HUGO gene symbols by joining with an identifier table imported from UCSC.
Run the CTD tool to extract curated pathway associations for these genes from the Comparative Toxicogenomics Database.
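
The join, filter, and ID-translation steps above can be sketched outside Galaxy in a few lines of Python. This is only an illustration of the logic, not how the Galaxy tools are implemented. The record layouts, the function names, and all sample values (rs_example_1, P00001, GENE_A, and so on) are hypothetical placeholders, and intervals are assumed to be half-open (start inclusive, end exclusive). The CTD lookup is left out because it queries an external database.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    """One row of the input dataset (hypothetical layout)."""
    chrom: str
    start: int
    end: int
    name: str


class Prediction(NamedTuple):
    """One row of the pre-computed PolyPhen-2 results (hypothetical layout)."""
    chrom: str
    start: int
    end: int
    uniprot_id: str
    effect: str


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome overlap."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def join_by_overlap(snps, predictions):
    """Pair each SNP with every prediction whose interval overlaps it."""
    return [(s, p) for s in snps for p in predictions if overlaps(s, p)]


def keep_damaging(joined):
    """Keep pairs whose predicted effect contains the word 'damaging'."""
    return [(s, p) for s, p in joined if "damaging" in p.effect]


def to_gene_symbols(joined, id_table):
    """Translate UniProt IDs to gene symbols, dropping unmapped IDs and duplicates."""
    symbols = []
    for _, p in joined:
        symbol = id_table.get(p.uniprot_id)
        if symbol is not None and symbol not in symbols:
            symbols.append(symbol)
    return symbols


# Made-up sample data, for illustration only.
snps = [
    Snp("chr1", 100, 101, "rs_example_1"),
    Snp("chr1", 500, 501, "rs_example_2"),
    Snp("chr2", 100, 101, "rs_example_3"),
]
predictions = [
    Prediction("chr1", 100, 101, "P00001", "probably damaging"),
    Prediction("chr1", 500, 501, "P00002", "benign"),
    Prediction("chr2", 100, 101, "P00003", "possibly damaging"),
    Prediction("chr2", 900, 901, "P00004", "probably damaging"),
]
id_table = {"P00001": "GENE_A", "P00002": "GENE_B", "P00003": "GENE_C"}

joined = join_by_overlap(snps, predictions)
damaging = keep_damaging(joined)
genes = to_gene_symbols(damaging, id_table)
print(genes)
```

The nested loop in join_by_overlap compares every SNP with every prediction, which is fine for a small example but would be slow on whole-genome data. A real interval join would typically sort or index the intervals first.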