Info on data files supporting Hardison et al. (2003) Genome Research 13:13-26

Various measures of sequence similarity and genomic properties were computed in 1 Mb non-overlapping windows (file alnAncSub_GP_1Md4redSort.txt) and in 5 Mb windows with a 1 Mb slide, so that adjacent windows overlap by 4 Mb (file alnAncSub_GP_5M(1)d4redSort.txt).

This description was written in 2008, when the data were posted to our website in response to a request. It is accurate to the best recollection of Ross Hardison; please let us know if you find issues or inconsistencies.

The data are for the June 2002 human genome assembly, as aligned to the mouse assembly of February 2002 using blastZ, with processing by axtBest. See the Methods section of the paper for more detail.

The columns in each file are:

chr                        chromosome name (number or X)
start                      start position
end                        end position
sequencedBases             number of nucleotides sequenced in the window
noGapAlignedBases          number of nucleotides in the window that are included in gap-free alignments (includes mismatches)
matchingBases              number of nucleotides in the window that match with mouse
nonLSrepBases              number of nucleotides that are not in lineage-specific repeats (hence are inferred to be ancestral)
nonLSrepNoGapAlignedBases  number of nucleotides that are not in lineage-specific repeats and are in gap-free alignments
nonLSrepMatchingBases      number of nucleotides that are not in lineage-specific repeats and match with mouse
%id                        percent identity of aligned nucleotides in the window
aln_tot                    fraction of nucleotides in the window that align with mouse
aln_anc                    fraction of ancestral (i.e. not lineage-specific) nucleotides in the window that align with mouse
fLSrep                     fraction of nucleotides that are in lineage-specific repeats
Lsrep                      number of nucleotides that are in lineage-specific repeats
fLSrep_noAlu               fraction of nucleotides that are in lineage-specific repeats, excluding Alu repeats
LSRep_noAlu                number of nucleotides that are in lineage-specific repeats, excluding Alu repeats
NA_anc                     fraction of a window inferred to be ancestral that does NOT align with mouse (i.e. 1 - aln_anc)
t_4d                       substitutions per 4-fold degenerate site, using the REV model to estimate substitution rates
count_4d                   number of 4-fold degenerate sites in each window
t_AR                       substitutions per AR (ancestral repeat) site, using the REV model to estimate substitution rates
count_AR                   number of AR (ancestral repeat) sites in each window
rcb                        recombination rate estimated from Kong et al. (2002) data
GC                         percent (G+C) for each window
SNP                        count of SNPs in each window
LS_L1                      fraction of window occupied by lineage-specific L1 repeats
All_L1                     fraction of window occupied by all L1 repeats
LS_LTR                     fraction of window occupied by lineage-specific LTR repeats
All_LTR                    fraction of window occupied by all LTR repeats
dGC                        difference in GC content between human and mouse
CDs                        fraction of window occupied by coding sequences
CpG                        fraction of window occupied by CpG dinucleotides
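
The column definitions above state that NA_anc is 1 - aln_anc, which gives a simple consistency check to run after loading either file. The sketch below is only an illustration under assumptions not confirmed by this description: that the files are tab-delimited and that the first row is a header using the column names listed above. The function names (read_windows, na_anc_mismatches), the tolerance, and the sample values are hypothetical and are not taken from the data files.

```python
import csv
import io


def read_windows(handle, delimiter="\t"):
    """Read a window table into a list of dicts keyed by column name.

    Assumes a header row and a tab delimiter; adjust if the files differ.
    """
    return list(csv.DictReader(handle, delimiter=delimiter))


def na_anc_mismatches(rows, tol=1e-3):
    """Return the rows where NA_anc differs from 1 - aln_anc by more than tol."""
    bad = []
    for row in rows:
        try:
            aln_anc = float(row["aln_anc"])
            na_anc = float(row["NA_anc"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if abs(na_anc - (1.0 - aln_anc)) > tol:
            bad.append(row)
    return bad


# Made-up values for illustration only; these are not rows from the real files.
sample = (
    "chr\tstart\tend\taln_anc\tNA_anc\n"
    "1\t0\t1000000\t0.40\t0.60\n"
    "1\t1000000\t2000000\t0.50\t0.45\n"
)
rows = read_windows(io.StringIO(sample))
mismatches = na_anc_mismatches(rows)
print(len(rows), "windows read;", len(mismatches), "inconsistent")
```

To apply the same check to a real file, open it with open(path, newline="") and pass the handle to read_windows. A loose tolerance is used because the stored fractions may have been rounded.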