Basics of using Galaxy

The core concept of Galaxy's working paradigm is the user history: a list of your datasets at various stages of analysis. Each history item includes not only the dataset resulting from a particular computation, but also metadata about it, such as the file format, the genome assembly, and (if applicable) the tool and parameters used to obtain it from earlier datasets. Each analysis therefore consists of a chain (or branched network) of steps, all documented in your history for reproducibility. Moreover, the sequence of tools and parameters can be extracted and saved as a workflow, independent of the actual data, and then rerun automatically on additional input datasets of the same type. You can keep a separate history for each project, and even share histories and workflows with other users.
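To make the idea of a history as a documented chain of steps more concrete, here is a minimal Python sketch. It is only an illustration of the concept, not Galaxy's actual data model or API: the class HistoryItem, the function extract_workflow, and the tool names "sort" and "head" are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HistoryItem:
    """One dataset in a history, plus the metadata recorded with it (hypothetical model)."""
    name: str
    file_format: str
    assembly: Optional[str] = None                 # genome build, if the data has genomic positions
    tool: Optional[str] = None                     # None for data that was uploaded, not computed
    parameters: dict = field(default_factory=dict)
    inputs: list = field(default_factory=list)     # names of the earlier items this one came from


def extract_workflow(history):
    """Return the tool steps of a history in order, without the data itself."""
    return [
        {"tool": item.tool, "parameters": dict(item.parameters)}
        for item in history
        if item.tool is not None
    ]


# A small history: one uploaded dataset followed by two computed ones.
history = [
    HistoryItem("genes", "bed", assembly="hg19"),
    HistoryItem("sorted genes", "bed", assembly="hg19",
                tool="sort", parameters={"column": 2}, inputs=["genes"]),
    HistoryItem("first 10", "bed", assembly="hg19",
                tool="head", parameters={"lines": 10}, inputs=["sorted genes"]),
]

workflow = extract_workflow(history)
print(workflow)
```

Because the extracted steps hold only tool names and parameters, the same list could in principle be applied to any other input of the same type, which is the point of saving a workflow separately from the data.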

The data analysis page is made up of three parts. The left panel lists the available software tools, organized by category; the center panel is where you select options and view results; and the right panel shows your history, including the datasets produced by running tools. For each dataset, the history panel displays its name, a short description, a peek at the first few lines (for text datasets), and links to additional options. It is good practice to create a new history for each analysis and keep it as a record of how the analysis was done, since it records not only which tools were run but also the options that were used.

For each dataset, Galaxy keeps track of the data format and, for datasets containing genomic positions, the genome assembly. If a tool takes multiple datasets as input, Galaxy will generally require that they be based on the same assembly (except for tools specifically designed to work across species or builds). It will also only let you select input datasets whose format is compatible with the tool. Therefore, if a dataset you want to use as input is in your history but does not appear as an available choice in the tool's options, check that its assembly (a.k.a. "database") and format are set correctly. There are tools for converting between formats when needed.
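The filtering described above can be pictured with a short Python sketch. This is an assumption-laden illustration of the rule, not how Galaxy implements it: the Dataset class, the selectable_inputs function, and the example format and assembly names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    """A history dataset with the two properties used for filtering (hypothetical model)."""
    name: str
    file_format: str
    assembly: Optional[str] = None   # a.k.a. "database"; None if not position-based


def selectable_inputs(history, accepted_formats, required_assembly=None):
    """Return the datasets a tool would offer as input choices."""
    choices = []
    for ds in history:
        # Skip datasets whose format the tool cannot read.
        if ds.file_format not in accepted_formats:
            continue
        # If another input already fixes the assembly, skip datasets on a different one.
        if required_assembly is not None and ds.assembly != required_assembly:
            continue
        choices.append(ds)
    return choices


history = [
    Dataset("peaks", "bed", "hg19"),
    Dataset("genes", "bed", "mm9"),
    Dataset("reads", "fastq"),
]

offered = selectable_inputs(history,
                            accepted_formats={"bed", "interval"},
                            required_assembly="hg19")
print([ds.name for ds in offered])
```

In this sketch, "genes" is left out because of its assembly and "reads" because of its format, which mirrors the two things to check when a dataset is missing from a tool's input choices.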

[screen shot]