Research

Inferring the mechanisms of mutagenesis via whole-genome comparisons
    Our laboratory uses both computational and experimental methods to investigate mutations, i.e., changes in DNA sequence. We examine how the rates and patterns of different types of mutations vary among regions of vertebrate genomes. Such analyses allow us to make inferences about the molecular mechanisms leading to mutations. Mutations are the cause of human genetic diseases and the source of genetic variation in natural populations. Therefore, elucidating mutation mechanisms is critical for medicine and for our understanding of evolution.
    In mammals, the number of germline cell divisions is higher in males, who produce billions of sperm, than in females, who produce only hundreds of eggs. This difference provides an opportunity to test whether mutations result from errors in DNA copying (replication) during cell divisions. If this hypothesis is true, we expect higher mutation rates in males than in females (male mutation bias) and, importantly, higher mutation rates in older than in younger males (paternal age effect); our results therefore have implications for human health and genetic counseling. We demonstrated that not only substitutions of some DNA bases by others, but also insertions and deletions of DNA bases, occur at higher rates in males than in females and thus result from errors in replication (Makova et al. 2004; Kvikstad et al. 2007). We also discovered that human mutation hotspots (CpG sites) show only weak male bias, confirming that mutations at these sites are caused by spontaneous processes unrelated to DNA replication (Taylor et al. 2006). Finally, we observed that the magnitude of male bias correlates with generation time and can be confounded by intrachromosomal variation in mutation rates (Goetting-Minesky & Makova 2006).
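    To make the logic of male mutation bias concrete, the sketch below shows one standard way to quantify it from whole-genome comparisons; it is not necessarily the exact procedure used in the studies cited above. It relies on a relationship usually attributed to Miyata and colleagues: because the X chromosome spends two thirds of its time in females, autosomes half, and the Y chromosome none, the ratio of substitution rates between chromosome classes depends on alpha, the male-to-female mutation rate ratio. The function names are hypothetical, and the formulas assume equal generation times in the two sexes and no other differences (such as selection or regional rate variation) between chromosome classes.

```python
def expected_x_to_autosome(alpha: float) -> float:
    """Expected X/autosome rate ratio for a given male-to-female ratio alpha.

    X:        (2 * mu_f + mu_m) / 3
    autosome: (mu_f + mu_m) / 2
    ratio:    (2/3) * (2 + alpha) / (1 + alpha)
    """
    return (2.0 / 3.0) * (2.0 + alpha) / (1.0 + alpha)


def alpha_from_x_to_autosome(ratio: float) -> float:
    """Invert the X/autosome relationship to estimate alpha."""
    # The model only allows ratios strictly between 2/3 (alpha -> infinity)
    # and 4/3 (alpha = 0); anything else cannot be explained by alpha alone.
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("X/A ratio outside the range allowed by the model")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def alpha_from_y_to_x(ratio: float) -> float:
    """Invert Y/X = 3 * alpha / (2 + alpha) to estimate alpha."""
    # Y/X ranges from 0 (alpha = 0) up to, but not including, 3.
    if not (0.0 <= ratio < 3.0):
        raise ValueError("Y/X ratio outside the range allowed by the model")
    return 2.0 * ratio / (3.0 - ratio)


if __name__ == "__main__":
    # Illustrative value only, not a result from the cited papers.
    print(alpha_from_x_to_autosome(0.8))
```

    Under this model, an X/autosome ratio close to 1 implies little or no male bias, while ratios approaching 2/3 imply strong male bias; this narrow range is why regional variation in mutation rates can easily confound such estimates.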


    Our current studies of mutations reach beyond the male mutation bias phenomenon and are uncovering mechanisms that shape genome evolution as a whole. For instance, we recently developed a computational model that predicts mutation rates at microsatellites (repeats of short DNA motifs; Kelkar et al. 2008). This research is significant for human health, because microsatellites are implicated in cancer and numerous neurological disorders. Additionally, our models will guide the choice of microsatellites for forensic applications and conservation genetics.
Mutability is per locus per generation. The bands around the curves indicate the 2.5th and 97.5th percentiles of empirical distributions obtained through a resampling procedure. Only points with at least 30 microsatellites are plotted (Kelkar et al. 2008).
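    The percentile bands described in this caption can be illustrated with a generic resampling sketch. The example below assumes a nonparametric bootstrap of the mean mutability within one group of microsatellites; the statistic, the number of resamples, and the function names are assumptions made for illustration, and the procedure in Kelkar et al. (2008) may differ. Only the 2.5th and 97.5th percentiles and the minimum of 30 microsatellites per point are taken from the caption.

```python
import random


def percentile(sorted_values, q):
    """Percentile (q in 0..100) of a sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("no values supplied")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def bootstrap_band(values, n_resamples=2000, min_count=30, seed=0):
    """Return (2.5th, 97.5th) percentiles of bootstrap means, or None.

    Groups with fewer than `min_count` loci are skipped, mirroring the
    caption's rule that only points with at least 30 microsatellites
    are plotted.
    """
    if len(values) < min_count:
        return None
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample the loci with replacement and record the mean each time.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_resamples))
    return percentile(means, 2.5), percentile(means, 97.5)


if __name__ == "__main__":
    # Hypothetical per-locus mutability values for one group of loci.
    mutabilities = [i / 100 for i in range(40)]
    print(bootstrap_band(mutabilities))
```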


Evolution of gene expression
    While there is a wealth of information about the evolution of protein-coding genes, a detailed understanding of how gene expression changes over the course of evolution is currently lacking. Our work on X chromosome inactivation (XCI) and on the divergence of duplicate genes is providing novel insights into gene expression evolution.
    In mammalian females, the majority of genes on one of the two X chromosomes are not expressed (they undergo XCI), which equalizes gene dosage with males. However, some genes in females are expressed from both X chromosomes, i.e., they escape XCI. We used a bioinformatic approach to contrast the local genomic environments of genes undergoing vs. escaping XCI (Carrel et al. 2006). We discovered that L1 repetitive elements are strongly enriched in the vicinity of genes expressed from only one X chromosome, supporting a role for these repeats in XCI. Additionally, we built a model that, based on genomic environment, accurately predicts whether each X chromosome gene undergoes or escapes XCI (Carrel et al. 2006).
The Distribution of Correctly and Incorrectly Classified Genes along the X Chromosome. Dark green indicates correctly classified genes; light green indicates misclassified genes. X inactivation expression patterns for genes included in this study: yellow indicates inactivated genes, and blue indicates escape genes. Not all genes were analyzed at all distances because sequences that included adjacent genes with different inactivation patterns were excluded from analysis. These gene distances remain uncolored (Carrel et al. 2006).


    Duplicate genes provide another fruitful framework for studying the evolution of gene expression. Using microarray data, we showed that, for duplicate genes, expression divergence and sequence divergence are correlated (Makova & Li 2003). More recently, we investigated duplicate genes in coexpression networks and found that, first, such genes evolve extremely rapidly, both losing shared coexpression partners and acquiring new ones; and, second, duplicates and singletons play similar roles in maintaining network robustness (Chung et al. 2006).
Degree distribution of the studied network (T ≥ 7 and R ≥ 0.7; Chung et al. 2006).