Galaxy Help: Operations

Help: Operations

TABLE OF CONTENTS

Union
Intersection
Subtraction
Complement
Restrict region size
Restrict position
Proximity
Clusters
Merge overlapping regions
Join

Union

Collects all regions that appear in any of the given queries. There are two choices for how the union is performed:

Example 1: Return original regions

Query1:  -------       ----
Query2:      ------           -----

Result:  -------       ----
             ------           -----

Example 2: Return merged regions

Query1:  -------       ----
Query2:      ------           -----

Result:  ----------    ----   -----
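
The two union modes can be sketched in a few lines of Python. This is only an illustration, not Galaxy's own implementation: it assumes each region is a half-open (start, end) tuple on a single chromosome, and the function names union_original and union_merged are hypothetical.

```python
def union_original(q1, q2):
    """Return every region from both queries unchanged, sorted by start."""
    return sorted(q1 + q2)


def union_merged(q1, q2):
    """Return the union with overlapping regions fused into one."""
    merged = []
    for start, end in sorted(q1 + q2):
        # Assumption: only true overlaps are fused; regions that merely
        # touch end-to-start are kept separate.
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Coordinates read off the diagram above, one unit per dash.
query1 = [(0, 7), (14, 18)]
query2 = [(4, 10), (21, 26)]
print(union_original(query1, query2))
print(union_merged(query1, query2))
```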

Intersection

Finds regions that overlap between the given queries. There are three choices for how the intersection is performed:

Example 1: Return whole regions from query 1

Query1:  -------       ----
Query2:      ------           -----

Result:  -------

Example 2: Return whole regions from query 2

Query1:  -------       ----
Query2:      ------           -----

Result:      ------

Example 3: Return only overlapping segments

Query1:  -------       ----
Query2:      ------           -----

Result:      ---
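
As a rough sketch of the three choices, assuming regions are half-open (start, end) tuples on one chromosome (the helper names below are hypothetical, not part of Galaxy): returning whole regions from query 2 is the same as the first mode with the arguments swapped.

```python
def overlaps(a, b):
    """True if two half-open regions share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def intersect_whole(q1, q2):
    """Whole regions of q1 that overlap any region of q2."""
    return [a for a in q1 if any(overlaps(a, b) for b in q2)]


def intersect_segments(q1, q2):
    """Only the parts shared by a region of q1 and a region of q2."""
    shared = []
    for a in q1:
        for b in q2:
            start, end = max(a[0], b[0]), min(a[1], b[1])
            if start < end:
                shared.append((start, end))
    return sorted(shared)


# Coordinates read off the diagrams above, one unit per dash.
query1 = [(0, 7), (14, 18)]
query2 = [(4, 10), (21, 26)]
print(intersect_whole(query1, query2))     # whole regions from query 1
print(intersect_whole(query2, query1))     # whole regions from query 2
print(intersect_segments(query1, query2))  # overlapping segments only
```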

Subtraction

Removes from the first query the regions that overlap the second query. There are two choices for how the subtraction is performed:

Example 1: Remove whole regions from query 1, if they overlap query 2

Query1:  -------       ----
Query2:      ------           -----

Result:                ----

Example 2: Remove only the overlapping segments from query 1

Query1:  -------       ----
Query2:      ------           -----

Result:  ----          ----
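
A minimal sketch of both subtraction modes, again assuming half-open (start, end) tuples on a single chromosome; subtract_whole and subtract_segments are hypothetical names used only for illustration.

```python
def subtract_whole(q1, q2):
    """Drop every region of q1 that overlaps any region of q2."""
    return [a for a in q1
            if not any(a[0] < b[1] and b[0] < a[1] for b in q2)]


def subtract_segments(q1, q2):
    """Cut the parts covered by q2 out of each region of q1."""
    result = []
    for start, end in q1:
        pieces = [(start, end)]
        for b_start, b_end in q2:
            remaining = []
            for s, e in pieces:
                if b_end <= s or e <= b_start:
                    # No overlap with this q2 region: keep the piece as is.
                    remaining.append((s, e))
                    continue
                if s < b_start:
                    remaining.append((s, b_start))  # part left of the cut
                if b_end < e:
                    remaining.append((b_end, e))    # part right of the cut
            pieces = remaining
        result.extend(pieces)
    return result


# Coordinates read off the diagrams above, one unit per dash.
query1 = [(0, 7), (14, 18)]
query2 = [(4, 10), (21, 26)]
print(subtract_whole(query1, query2))
print(subtract_segments(query1, query2))
```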

Complement

Finds all regions (in the same chromosomes) that are not in the given query.

Example: Return regions which are not in query 1

Query1:    --------      ----

Result: <--        ------    --------->
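
The arrows in the diagram mean the result extends to the ends of the chromosome. One way to sketch this, assuming half-open (start, end) tuples and a chromosome length supplied by the caller (the chrom_length parameter and the function name are assumptions made for this example):

```python
def complement(regions, chrom_length):
    """Return the gaps between regions on a chromosome of the given length."""
    gaps = []
    cursor = 0  # first position not yet covered by any region seen so far
    for start, end in sorted(regions):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


print(complement([(2, 10), (16, 20)], chrom_length=30))
```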

Restrict region size

Filters the given query based on the size of each region.

Example: Return regions from query 1 which have size >=100 bp

Query1:  ----     --------    -------    --
Sizes:   80 bp     160 bp     140 bp     40 bp

Result:           --------    -------
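
In code, this filter is a single comparison per region. The sketch below assumes half-open (start, end) tuples, so a region's size is end minus start; restrict_size is a hypothetical name.

```python
def restrict_size(regions, min_size):
    """Keep only regions that are at least min_size bp long."""
    return [(start, end) for start, end in regions if end - start >= min_size]


# Four regions of 80, 160, 140 and 40 bp, as in the example above.
regions = [(0, 80), (180, 340), (420, 560), (640, 680)]
print(restrict_size(regions, min_size=100))
```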

Restrict position

Filters the given query to select only regions that lie in a given area. The area can be specified as an entire chromosome or contig (e.g. "chr11"), or just part of it (e.g. "chr11: 5,203,270 - 5,204,877"). In the latter case, the spaces and commas are optional.

Example: Return regions from query 1 which lie in the specified area A

Query1:  ----     --------    -------    --
Area A:         ------------------

Result:           --------    -------
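
The area syntax, with its optional spaces and commas, could be parsed along these lines. This is a sketch under stated assumptions, not Galaxy's actual parser: parse_area is a hypothetical name, and the numbers are returned exactly as written without converting between coordinate conventions.

```python
import re


def parse_area(text):
    """Parse 'chr11' or 'chr11: 5,203,270 - 5,204,877'.

    Returns (chrom, start, end); start and end are None when the
    area is a whole chromosome or contig.
    """
    cleaned = text.replace(",", "").replace(" ", "")
    match = re.fullmatch(r"([^:]+)(?::(\d+)-(\d+))?", cleaned)
    if match is None:
        raise ValueError(f"unrecognized area: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    return chrom, int(start), int(end)


print(parse_area("chr11"))
print(parse_area("chr11: 5,203,270 - 5,204,877"))
print(parse_area("chr11:5203270-5204877"))
```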

Proximity

Finds regions in the first query that either (1) lie within a specified distance from some region in the second query, or (2) lie farther than the specified distance from all regions in the second query.

Example 1: Return regions from query 1 that lie within 40 bp of any region in query 2 (each dash and star is 20 bp)

Query1:  ----     -------    -------    --
Query2:          ------              ---

Search:        **********          *******

Result:           -------    -------    --

Example 2: Return regions from query 1 that lie more than 40 bp from all regions in query 2 (each dash and star is 20 bp)

Query1:  ----     -------    -------    --
Query2:          ------              ---

Search:  <*****          **********       ********>

Result:  ----
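
The two proximity modes are complements of each other, which a short sketch makes explicit. It assumes half-open (start, end) tuples on one chromosome and measures distance as the number of bases between two regions (zero if they overlap or touch); the names gap and proximity are hypothetical.

```python
def gap(a, b):
    """Bases separating two half-open regions; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def proximity(q1, q2, distance, within=True):
    """Select regions of q1 by their distance to the regions of q2.

    within=True  keeps regions within `distance` bp of some q2 region.
    within=False keeps regions farther than `distance` bp from all of them.
    """
    selected = []
    for a in q1:
        near = any(gap(a, b) <= distance for b in q2)
        if near == within:
            selected.append(a)
    return selected


genes = [(100, 200), (1000, 1100)]
peaks = [(230, 260)]
print(proximity(genes, peaks, distance=40, within=True))
print(proximity(genes, peaks, distance=40, within=False))
```

Because every region of the first query is either near or far, the two calls always split that query into two disjoint lists.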

Clusters

Finds groups of regions from a single query that contain N or more regions within an M bp area. There are two choices for how the clustering is performed:

Example 1: Find clusters of at least 2 regions within 5 bp, and return the individual regions (here each dash and star is 1 bp, to show the sliding search window)

Query1:  ----     -------    -------    --
                     -----

Search:  *****
          *****
              ...
                 *****
                      ...
                         *****
                                ...
                                     *****

Result:           -------    -------
                     -----

Example 2: Find clusters of at least 2 regions within 5 bp, and return the cluster areas (here each dash and star is 1 bp, to show the sliding search window)

Query1:  ----     -------    -------    --
                     -----

Search:  *****
          *****
              ...
                 *****
                      ...
                         *****
                                ...
                                     *****

Result:           ------------------
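
The sliding search window in the diagrams can be imitated directly with a brute-force loop. The sketch below rests on several assumptions that the text does not spell out: regions are half-open (start, end) tuples on one chromosome, a window "contains" a region when it overlaps it by at least one base, and cluster areas are the merged spans of the regions found together. The name find_clusters and its parameters are hypothetical.

```python
def find_clusters(regions, min_regions, window, return_areas=False):
    """Slide a window of `window` bp and report where it hits enough regions."""
    regions = sorted(regions)
    if not regions:
        return []
    members = set()  # regions seen in at least one qualifying window
    spans = []       # extent of the regions hit by each qualifying window
    first = min(s for s, _ in regions) - window + 1
    last = max(e for _, e in regions)
    for w in range(first, last):
        hits = [r for r in regions if r[0] < w + window and w < r[1]]
        if len(hits) >= min_regions:
            members.update(hits)
            spans.append((min(s for s, _ in hits), max(e for _, e in hits)))
    if not return_areas:
        return sorted(members)
    # Fuse overlapping spans into cluster areas.
    areas = []
    for start, end in sorted(spans):
        if areas and start <= areas[-1][1]:
            areas[-1] = (areas[-1][0], max(areas[-1][1], end))
        else:
            areas.append((start, end))
    return areas


# Coordinates read off the diagrams above, one unit per dash.
query1 = [(0, 4), (9, 16), (12, 17), (20, 27), (31, 33)]
print(find_clusters(query1, min_regions=2, window=5))
print(find_clusters(query1, min_regions=2, window=5, return_areas=True))
```

The loop visits every window position, so it is only practical for small inputs; it is meant to mirror the diagram rather than to be efficient.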

Merge overlapping regions

Overlapping regions within a single query are consolidated into fewer, larger regions. (This is just like the "merged regions" option for Union.)

Example: Merge overlapping regions in query 1

Query1:  -------       ----
             ------           -----

Result:  ----------    ----   -----

Join

This operation is similar to what the relational algebra of database systems calls a "natural join". Two queries are matched up by region, and the result is a wider table with the data columns from the first query followed by the data columns from the second one. Only rows that match are included in the result.

There are two choices for how the matching is performed: (1) the regions must overlap by at least the specified number of bp, or (2) the endpoints must match exactly.
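
Both matching rules can be sketched with a nested loop. This is an illustration only: it assumes each row is a tuple beginning with (chromosome, start, end) followed by its data columns, with half-open coordinates, and the function name and parameters are hypothetical.

```python
def join(rows1, rows2, min_overlap=1, exact=False):
    """Pair up rows whose regions match and concatenate their columns."""
    joined = []
    for r1 in rows1:
        for r2 in rows2:
            if r1[0] != r2[0]:
                continue  # different chromosomes never match
            if exact:
                matched = r1[1:3] == r2[1:3]
            else:
                overlap = min(r1[2], r2[2]) - max(r1[1], r2[1])
                matched = overlap >= min_overlap
            if matched:
                # Columns of the first query, then columns of the second.
                joined.append(r1 + r2)
    return joined


genes = [("chr1", 100, 200, "geneA"), ("chr1", 500, 600, "geneB")]
peaks = [("chr1", 150, 250, "peak1"), ("chr2", 500, 600, "peak2")]
print(join(genes, peaks, min_overlap=50))
```

Rows without a partner in the other query are simply left out, as described above.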