Input for SIFT

PolyPhen-2 is not the only way to predict functional coding changes with Galaxy; the SIFT tool in the Phenotype Association section performs a similar assessment. However, unlike the PolyPhen-2 library dataset, which is pre-computed only for known SNPs in the dbSNP database, the SIFT tool can make predictions for all possible nucleotide substitutions in the human exome.

Each Galaxy tool typically has a description and other helpful information located below its input form, so to learn more about SIFT, open it and scroll down in the center panel. Here we find that the required format differs slightly from our pgSnp format (blue box): SIFT requires two alleles at each SNP position, but pgSnp format lists only a single nucleotide for homozygous genotypes (blue box in history panel).

The SIFT instructions suggest adding the reference nucleotide in such cases; this can be done within Galaxy, but it involves quite a few steps. For any complex task that you are likely to repeat, it is helpful to have a workflow, which is like an automatic recipe. Happily, there is a published workflow that converts pgSnp format to what SIFT needs, so we can do the conversion in one step.
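
To make the allele adjustment concrete, here is a minimal Python sketch of the idea, not the published workflow itself. It assumes the pgSnp allele field separates multiple alleles with "/" and that the reference nucleotide at the position has already been looked up by some other means. The function name and its arguments are hypothetical, and the exact column layout SIFT expects is described in the tool's help text rather than reproduced here.

```python
def two_allele_field(pgsnp_alleles, ref_base):
    """Return a two-allele string such as 'A/G' from a pgSnp allele field.

    pgsnp_alleles: allele field from a pgSnp line, e.g. 'G' or 'A/G'
                   (assumed to use '/' as the separator).
    ref_base:      reference nucleotide at that position (assumed to be
                   supplied by the caller; it is not part of pgSnp).
    """
    alleles = pgsnp_alleles.upper().split("/")
    if len(alleles) == 1:
        # Homozygous genotype: only one nucleotide is listed,
        # so pair it with the reference nucleotide.
        alleles = [ref_base.upper(), alleles[0]]
    return "/".join(alleles)


# Heterozygous genotypes pass through unchanged; homozygous ones gain the reference.
print(two_allele_field("A/G", "A"))
print(two_allele_field("G", "A"))
```

In this sketch a homozygous call that matches the reference would come out as a pair of identical alleles, so such positions would need to be filtered separately if they occur in the data.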

[screen shot]