Table 1.
bHLH Genes Grouped into 44 Phylogenetically-Defined Families
Family name | Bootstrap support | No. of worm genes | No. of fly genes | No. of mouse genes | Plants-yeast? | High-order group |
---|---|---|---|---|---|---|
Achaete-Scute | intermediate<sup>1</sup> | 4 | 4 | 2 | no | A |
Neurogenin | high | 1 | 1 | 3 | no | A |
NeuroD | intermediate | 1 | 0 to 2<sup>2</sup> | 4 | no | A |
Twist | high | 1 | 1 | 2 | no | A |
MyoD | high | 1 | 1 | 4 | no | A |
E12/E47 | high | 1 | 1 | 4 | no | A |
Atonal | high | 1 | 3 | 2 | no | A |
Mist | high | 0 or 1<sup>3</sup> | 1 | 1 | no | A |
Beta3 | intermediate | 1 or 2<sup>3</sup> | 1 | 2 | no | A |
MyoR | high | 1 | 1 | 2 | no | A |
Mesp | low | 0 | 1 | 2 | no | A |
Paraxis | high | 0 | 1 | 2 | no | A |
Hand | intermediate | 1 | 1 | 2 | no | A |
PTF1 | intermediate | 1 | 3 | 1 | no | A |
SCL | high | 0 | 1 | 4 | no | A |
NSCL | high | 1 | 1 | 2 | no | A |
SRC | high | 0 | 0 | 2 | no | B |
AHR | intermediate | 2 | 2 | 1 | no | C |
Sim | high | 0 or 1<sup>4</sup> | 1 | 2 | no | C |
Trh | high | 0 or 1<sup>4</sup> | 1 | 1 | no | C |
PAS1 | high | 0 or 1<sup>4</sup> | 0 | 1 | no | C |
HIF | high | 0 or 1<sup>4</sup> | 1 | 3 | no | C |
Max | high | 2 | 1 | 1 | no | B |
USF | low<sup>5</sup> | 1 | 1 | 2 | yeast | B |
MITF | high | 1 | 0 | 4 | no | B |
AP4 | high | 1 | 1 | 1 | no | B |
TF4 | high | 0 | 2 | 1 | no | B |
Clock | high | 0 | 1 to 3<sup>6</sup> | 2 | no | C |
ARNT | high | 1 | 1 | 2 | no | C |
Bmal | high | 0 | 1 | 2 | no | C |
Myc7e | intermediate | 0 | 0 | 0 | plants only | B |
R | high | 0 | 0 | 0 | plants only | B |
SREBP | low<sup>7</sup> | 1 | 1 | 2 | yeast | B |
Emc | high | 0 | 1 | 4 | no | D |
Myc | low<sup>8</sup> | 0 | 1 | 4 | no | B |
Mad | high | 0 | 0 | 3 | no | B |
COE | **<sup>9</sup> | 1 | 1 | 5 | no | F |
Gridlock | low<sup>10</sup> | 0 | 2 | 2 | no | E |
Hairy | high | 0 | 3 | 1 | no | E |
E (spl) | low<sup>11</sup> | 2 | 8 | 5 | no | E |
Pho4 | high | 0 | 0 | 0 | yeast only | B |
Sat1 | high | 0 | 0 | 0 | plants only | B |
GBOX | high | 0 | 0 | 0 | plants only | B |
RTG3P | intermediate | 0 | 0 | 0 | yeast only | B |
orphans | none | 6 | 0 to 4<sup>12</sup> | 0 | yes | 0 |
Families have been named according to the name (or its common abbreviation) of the first discovered or best known member of the family. Bootstrap support has been classified as high (>75%), intermediate (50%–75%) or low (<50%). The number of members per family in worm, fly, and mouse is reported. Each family has been tentatively assigned to a high-order group using the classification of Atchley and Fitch (1997) (see text for details). Fly and worm genes that cannot be assigned to any families are categorized as “orphan” genes.
<sup>1</sup> Low (∼30%) when C. elegans sequences are included, one of which (C2812.8) is the most divergent but is consistently found linked to the Achaete-Scute family.
<sup>2</sup> delilah and CG11450, two Drosophila genes that group together (bootstrap value 52%) to the exclusion of any other gene, are often found associated with the NeuroD family; their inclusion in this family is, however, not supported by bootstrap resampling (see text and Fig. 3).
<sup>3</sup> Beta3 and Mist are closely related families; one C. elegans sequence (DY33) is almost equally related to both families.
<sup>4</sup> The Hif, Sim, Trh, and PAS1 families form a strongly supported monophyletic group (bootstrap value 95%) that is collectively linked to a single C. elegans gene, F38A6.3 (bootstrap value 65%). Because the Hif, Sim, and Trh families contain both fly and mouse genes, F38A6.3 is unlikely to be the single worm ortholog of all these families. It is more likely that F38A6.3 is the ortholog of one of these families but has been displaced to the root of these families as a result of the long branch attraction phenomenon (see text).
<sup>5</sup> Low bootstrap support (∼40%) when yeast genes are included; high support (75%) when only animal genes are considered.
<sup>6</sup> One Drosophila gene, known as Dm clock, is included in this family with high bootstrap support (91%); two other closely related genes, Rst(1) JH and CG6211, are found associated with this family with a lower bootstrap value (45%) and as an outgroup to the other clock genes from both vertebrates and fly. Rst(1) JH is involved in metamorphosis and thus might represent a divergent clock gene specific to Drosophila.
<sup>7</sup> Low bootstrap support (∼45%) when yeast and worm genes are included; high support (77%) when only vertebrate and fly genes are considered.
<sup>8</sup> Low bootstrap support (∼30%) when the fly gene (named diminutive, dm) is included; high support (99%) when only vertebrate, echinoderm, and retrovirus genes are considered. Functional analysis of dm supports its orthology to the Myc family (Schreiber-Agus et al. 1997).
<sup>9</sup> The COE genes were not included in our analysis because their HLH could not be aligned with the others. Support for the monophyly of this family comes from the analysis of the HLH and the COE domain within this family.
<sup>10</sup> High bootstrap support (100%) when only one of the two fly genes is considered (Dm Hey); lower support (42%) for the inclusion of a second fly gene (Dm Sticky ch1) in the family, as an outgroup to the Hey and gridlock genes from both vertebrates and fly.
<sup>11</sup> The genes of the Gridlock, Hairy, and Enhancer of split families form a well-supported monophyletic group (group E; see Fig. 2). Two clear families (the Hairy and Gridlock families) with high bootstrap support emerge from this group. All of the remaining sequences have been grouped in a single family (named Enhancer of split), which has no real phylogenetic support. A phylogenetic tree of the group can be found on the authors' Web site.
<sup>12</sup> The total number of orphan genes depends on whether Rst(1) JH and CG6211 are included in the Clock family and whether delilah and CG11450 are included in the NeuroD family.
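
The bootstrap classes used in the table legend (high, >75%; intermediate, 50%–75%; low, <50%) can be written as a simple threshold rule. The following is only an illustrative sketch and not part of the original analysis: the function name `classify_bootstrap` is hypothetical, and the handling of the exact boundary values (50% and 75% both counted as intermediate) is an assumption taken from a literal reading of the legend.

```python
def classify_bootstrap(percent: float) -> str:
    """Map a bootstrap percentage to the support class named in the legend.

    Assumption: 50% and 75% both fall in the "intermediate" class,
    following the legend's wording (high is strictly above 75%,
    low is strictly below 50%).
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("bootstrap support must be between 0 and 100")
    if percent > 75.0:
        return "high"
    if percent >= 50.0:
        return "intermediate"
    return "low"


# Bootstrap values quoted in the table footnotes, used here as examples.
for value in (95, 65, 52, 45, 30):
    print(value, classify_bootstrap(value))
```

Applied to values quoted in the footnotes, this rule places the 95% support for the Hif, Sim, Trh, and PAS1 group in the high class, the 65% link to F38A6.3 in the intermediate class, and the 45% association of Rst(1) JH and CG6211 with the Clock family in the low class.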