Table 2.
The breakdown of the 90,612 Methyl-seq regions into specific annotated genomic elements and binding locations
CpG island coordinates were obtained from the UCSC Genome Browser, where they were calculated for hg18 using the definition of Gardiner-Garden and Frommer (1987): islands have a GC content of 50% or greater, a length greater than 200 bp, and a ratio greater than 0.6 of the observed number of CG dinucleotides to the number expected on the basis of the numbers of Gs and Cs in the segment. Coordinates for promoters and other genic elements were calculated using the transcription start sites and gene annotations from the Known Genes file compiled by the UCSC Genome Browser. Coordinates for strong CpG islands or high-CpG promoters (HCPs), weak CpG islands or intermediate-CpG promoters (ICPs), and sequences with no local enrichment of CpGs or low-CpG promoters (LCPs) were from Weber et al. (2007). Coordinates for regions of 7× regulatory potential, or regions (average score >0.5) conserved in human, chimpanzee, macaque, mouse, rat, dog, and cow, were obtained from King et al. (2005). Histone ChIP-chip binding data were from Pan et al. (2007); “bivalent” regions were those that overlapped with both H3K4me3 and H3K27me3 peaks. Methyl-seq regions overlapping with H3K4me3 peaks only were called H3K4me3; regions overlapping with H3K27me3 peaks only were called H3K27me3; and all unassigned regions were defined as “neither.”
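
The CpG island criteria and the histone category assignment described above can be sketched in a few lines of Python. This is an illustration only, not the code used by the UCSC Genome Browser or by the authors. The function names are hypothetical, the expected CpG count is taken as (number of Cs × number of Gs) / segment length, which is the commonly used form of the Gardiner-Garden and Frommer ratio, and the overlap flags are assumed to have been computed beforehand from the peak coordinates.

```python
def cpg_island_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # observed CG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Assumption: expected CpG count = (#C * #G) / length
    expected = (c * g) / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return n, gc_fraction, ratio


def meets_island_criteria(seq):
    """Apply the three thresholds quoted in the table note."""
    n, gc_fraction, ratio = cpg_island_stats(seq)
    return n > 200 and gc_fraction >= 0.5 and ratio > 0.6


def histone_category(overlaps_h3k4me3, overlaps_h3k27me3):
    """Assign a region to one of the four histone categories in the table note.

    The two boolean flags are hypothetical inputs: whether the region
    overlaps at least one H3K4me3 peak and at least one H3K27me3 peak.
    """
    if overlaps_h3k4me3 and overlaps_h3k27me3:
        return "bivalent"
    if overlaps_h3k4me3:
        return "H3K4me3"
    if overlaps_h3k27me3:
        return "H3K27me3"
    return "neither"
```

For example, `histone_category(True, False)` yields `"H3K4me3"`, and a 300 bp run of alternating C and G passes `meets_island_criteria`, whereas a 300 bp run of alternating A and T does not.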