Virus Evolution. 2020 Feb 17;6(1):veaa010. doi: 10.1093/ve/veaa010

Linking influenza virus evolution within and between human hosts

Katherine S Xue, Jesse D Bloom
PMCID: PMC7025719  PMID: 32082616

Abstract

Influenza viruses rapidly diversify within individual human infections. Several recent studies have deep-sequenced clinical influenza infections to identify viral variation within hosts, but it remains unclear how within-host mutations fare at the between-host scale. Here, we compare the genetic variation of H3N2 influenza within and between hosts to link viral evolutionary dynamics across scales. Synonymous sites evolve at similar rates at both scales, indicating that global evolution at these putatively neutral sites results from the accumulation of within-host variation. However, nonsynonymous mutations are depleted between hosts compared to within hosts, suggesting that selection purges many of the protein-altering changes that arise within hosts. The exception is at antigenic sites, where selection detectably favors nonsynonymous mutations at the global scale, but not within hosts. These results suggest that selection against deleterious mutations and selection for antigenic change are the main forces that act on within-host variants of influenza virus as they transmit and circulate between hosts.

Keywords: influenza virus, McDonald–Kreitman test, within-host evolution, antigenic drift, deep sequencing

1. Introduction

As influenza viruses replicate within infected hosts, they quickly mutate into genetically diverse populations. A small proportion of within-host variants transmit between individuals (McCrone et al. 2018; McCrone and Lauring 2018), and some transmitted variants continue to spread from person to person to circulate globally, and even reach fixation (Mideo, Alizon, and Day 2008; Alizon, Luciani, and Regoes 2011; Belshaw, Sanjuán, and Pybus 2011; Lumby, Nene, and Illingworth 2018; Xue et al. 2018) (Fig. 1). Influenza virus’s genetic variation within hosts therefore provides the material for its rapid global evolution.

Figure 1.

Influenza virus evolves at within- and between-host (or global) evolutionary scales. (A) Viral mutations that arise within hosts can transmit from person to person and eventually contribute to global viral evolution. (B) Key parameters affecting influenza’s evolution within hosts. Estimates of infection duration, viral load, and symptom severity are summarized from meta-analyses of volunteer challenge studies (Carrat et al. 2008) and mathematical models of viral infection (Baccam et al. 2006; Beauchemin and Handel 2011). Viral mutation rates are summarized from studies by Suarez-Lopez and Ortin (1994), Nobusawa and Sato (2006), Sanjuán et al. (2010), and Bloom (2014). Transmission bottleneck size has recently been estimated from deep sequencing of viral samples from household transmission pairs (McCrone et al. 2018).

Several recent studies have used deep sequencing to identify within-host mutations in hundreds of clinical influenza infections (Dinis et al. 2016; Debbink et al. 2017; McCrone et al. 2018). However, it remains unclear what role these within-host mutations play in the global evolution of influenza virus. The same within-host mutations are only rarely observed in different individuals, and mutations that reach detectable frequencies at the global scale are not notably more common than other mutations within hosts (Debbink et al. 2017; McCrone et al. 2018). Specifically, although influenza virus displays rapid antigenic evolution on the global scale, antigenic variants are present at low frequencies within hosts (Dinis et al. 2016; McCrone et al. 2018). New quantitative approaches are therefore needed to understand how within-host viral variation is transformed into global evolution.

Here, we compare the genetic variation of H3N2 influenza virus within and between human hosts to infer the fates of within-host mutations in the global viral population. Individual within-host mutations are challenging to track on a global scale, but by comparing large numbers of genetic variants identified within and between hosts, we can infer the evolutionary forces that act on different classes of mutations. We analyze within-host viral variation in 308 acute influenza infections from three deep-sequencing datasets (Dinis et al. 2016; Debbink et al. 2017; McCrone et al. 2018), and we calculate rates of evolution within and between hosts to determine how selection and genetic drift shape viral evolution. Synonymous mutations accumulate at similar rates within and between hosts, but nonsynonymous mutations are less prevalent between hosts than they are within hosts across most of the influenza genome. These observations suggest that many nonsynonymous mutations that reach detectable frequencies within hosts are later purged as these variants circulate between hosts. In antigenic sites, however, nonsynonymous mutations accumulate more rapidly on a global scale than they do within hosts, suggesting that antigenic selection primarily takes place between hosts. Our results show that influenza populations within hosts are dominated by transient, deleterious mutations, which are later eliminated as they transmit and circulate between hosts. Selection against these deleterious mutations and selection for antigenic change are the main forces that act on within-host variants of influenza virus as they transmit and circulate between hosts.

2. Results

2.1 Rates of influenza virus evolution within hosts

Evolutionary rates provide a simple, quantitative framework for comparing viral variation across scales. Under neutral evolution, we expect within-host and global evolutionary rates to be identical, and deviations from these neutral expectations shed light on how selection acts across evolutionary scales. For instance, HIV and hepatitis C virus both evolve more rapidly within than between hosts, probably because viruses acquire adaptations to specific hosts that often revert after transmission (Herbeck et al. 2006; Lemey, Rambaut, and Pybus 2006; Pybus and Rambaut 2009; Gray et al. 2011; Lythgoe and Fraser 2012; Alizon and Fraser 2013; Zanini et al. 2015; Raghwani et al. 2018).

We sought to calculate the within-host and global evolutionary rates of influenza virus. Influenza’s global evolutionary rate is easily estimated from phylogenies of patient consensus sequences, which represent transmitted strains (Rambaut et al. 2008), but evolutionary rates during acute infections are more challenging to calculate. Several studies have estimated within-host evolutionary rates for chronic viruses like HIV and hepatitis C virus by sequencing longitudinal viral samples (Lemey, Rambaut, and Pybus 2006; Gray et al. 2011; Alizon and Fraser 2013; Raghwani et al. 2018). However, longitudinal samples are difficult to collect for viruses like influenza that cause acute infections, and acute infections also provide limited time for genetic diversity to accumulate. Errors that arise in library preparation and sequencing are often present at similar frequencies to genuine within-host genetic variation in acute viral infections, making it challenging to accurately estimate within-host evolutionary parameters (McCrone and Lauring 2016; Illingworth et al. 2017; Zhao and Illingworth 2019).

To overcome these challenges, we developed a method to estimate rates of within-host evolution from deep sequencing of acute viral infections. In brief, we identified within-host mutations that were present in at least 0.5 per cent of sequencing reads (Fig. 2, Section 4). We calculated the total genetic divergence in each viral sample by summing the frequencies of within-host mutations. We then normalized divergence to the number of available sites in the genome and time since the infection started (Supplementary Fig. S1) to estimate a rate of evolution per site, per day (Fig. 3, Section 4).
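The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the `Variant` record, the function name `within_host_rate`, and the example numbers are hypothetical, and the time since the infection started is taken as a given input.

```python
from dataclasses import dataclass

MIN_FREQ = 0.005  # 0.5 per cent frequency threshold stated in the text


@dataclass
class Variant:
    site: int         # genome position (hypothetical field)
    frequency: float  # fraction of reads carrying the nonconsensus base


def within_host_rate(variants, n_sites, days_since_infection, min_freq=MIN_FREQ):
    """Divergence per site per day for one viral sample."""
    if n_sites <= 0 or days_since_infection <= 0:
        raise ValueError("n_sites and days_since_infection must be positive")
    # Total divergence is the sum of frequencies of mutations above the threshold.
    divergence = sum(v.frequency for v in variants if v.frequency >= min_freq)
    return divergence / n_sites / days_since_infection


# Made-up example: the 0.1 per cent variant falls below the threshold and is ignored.
sample = [Variant(10, 0.02), Variant(250, 0.006), Variant(731, 0.001)]
rate = within_host_rate(sample, n_sites=1000, days_since_infection=2)
```

Averaging such per-sample rates across patients would then give one rate per gene and mutation class.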

Figure 2.

Within-host mutation calling criteria. (A) Within-host mutations were called when a nonconsensus base reached a frequency of 0.5 per cent at a site with at least 400× sequencing coverage. We analyzed only viral samples in which >80 per cent of the sites in each sequenced gene had ≥400× coverage. (B) The number of mutations called in each viral sample declines as the minimum mutation-frequency threshold increases. Outlier samples with high within-host variation were included in this plot but excluded from subsequent analyses (Supplementary Fig. S3). (C) Most within-host mutations called above a frequency of 0.5 per cent are present at <5 per cent frequency. (D) Within-host mutations at higher frequencies are more likely to be located at the third codon position. Sequencing errors, which occur at low frequencies, are expected to be evenly distributed across all three codon positions. In contrast, high-frequency mutations are disproportionately located at the third codon position, where mutations are often synonymous, suggesting that mutations at higher frequencies are more likely to represent true viral mutations that have experienced purifying selection. The dashed line indicates the 0.5 per cent mutation-frequency threshold used in this study.

Figure 3.

Within-host and global evolutionary rates reveal selective pressures acting on influenza virus across evolutionary scales. (A) Within-host evolutionary rates were calculated by normalizing the total divergence of each within-host viral population to the number of sites and the time elapsed since the infection began. (B) Global evolutionary rates were calculated using a molecular-clock method by performing a linear regression of divergence per site versus time. (C) Within-host and global evolutionary rates of H3N2 influenza virus at synonymous (S), nonsynonymous (NS), and stop-codon (Stop) sites. Rates of synonymous evolution are broadly similar at the within-host and global scale, but nonsynonymous mutations accumulate more rapidly within hosts than globally. Within-host rates are shown as the mean and standard error of all patient infections sequenced in the three datasets analyzed after removing outlier samples (see Section 4) (Dinis et al. 2016; Debbink et al. 2017; McCrone et al. 2018). Global rates are shown as the mean and standard error calculated through linear regression (the standard errors are smaller than the point sizes). Global rates of stop-codon evolution are zero because no stop codons are observed in patient consensus sequences.

We tested the sensitivity of this method to common technical considerations. Estimates of evolutionary rates can be influenced by the frequency threshold used to identify within-host variation (McCrone and Lauring 2016; Gallet et al. 2017; Grubaugh et al. 2019). We focused on within-host variants present in at least 0.5 per cent of sequencing reads at a site (Fig. 2A). At lower mutation-frequency thresholds, more mutations are identified in each viral sample (Fig. 2B), but many of these putative mutations result from errors in library preparation and sequencing. Higher mutation-frequency thresholds limit the influence of false-positive mutations but can also exclude true within-host variation, most of which is present at low frequencies (Fig. 2C). Our preferred mutation-frequency threshold of 0.5 per cent is relatively permissive but lies above the 0.1 per cent level at which true variants begin to outnumber sequencing errors, as judged by the proportion of putative mutations present at each codon position (Fig. 2D) (Dyrdak et al. 2019). Nevertheless, we also calculated within-host evolutionary rates at different mutation-frequency thresholds and found that the relative evolutionary rates of different genes and site classes remained consistent (Supplementary Fig. S2).

In calculating within-host evolutionary rates, we sought to limit the influence of outlier samples with unusually high within-host variation (Fig. 2B and Supplementary Fig. S3). This high variation can arise for biological reasons like coinfection with two distinct viral strains, or for technical reasons like poor sample quality or low viral load (McCrone and Lauring 2016; McCrone et al. 2018). In either case, the excess variation artificially inflates estimates of how quickly viruses evolve within patient infections. To minimize the influence of outlier samples, we ranked all samples based on the number of within-host mutations and excluded the top 10 per cent of samples from each study. We also excluded samples that did not have at least 400× sequencing coverage in at least 80 per cent of the sites in each gene.
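The two sample filters described above can be illustrated with a short Python sketch. The function names, the input layouts, and the rounding rule for "top 10 per cent" (rounding up) are assumptions made for illustration; the text does not specify them.

```python
import math


def passes_coverage(coverage_by_gene, min_depth=400, min_fraction=0.8):
    """True if every gene has >= min_depth coverage at >= min_fraction of its sites.

    coverage_by_gene maps a gene name to a list of per-site read depths
    (a hypothetical layout).
    """
    for depths in coverage_by_gene.values():
        if not depths:
            return False
        covered = sum(1 for d in depths if d >= min_depth)
        if covered < min_fraction * len(depths):
            return False
    return True


def drop_top_outliers(mutation_counts, top_fraction=0.10):
    """Return sample IDs after removing the most variable samples of one study.

    mutation_counts maps a sample ID to its number of within-host mutations.
    Rounding the number of excluded samples up is an assumption.
    """
    ranked = sorted(mutation_counts, key=mutation_counts.get)
    n_drop = math.ceil(top_fraction * len(ranked))
    return ranked[:len(ranked) - n_drop]


# Made-up study with ten samples; the sample with the most mutations is dropped.
counts = {f"s{i}": i for i in range(10)}
kept = drop_top_outliers(counts)
```

Applying `drop_top_outliers` separately to each study mirrors the per-study exclusion described in the text.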

We calculated rates of within-host influenza evolution from deep-sequencing data in three published studies that together represented 308 acute H3N2 influenza infections (Fig. 3 and Supplementary Fig. S3) (Dinis et al. 2016; Debbink et al. 2017; McCrone et al. 2018). We estimated evolutionary rates separately for synonymous, nonsynonymous, and stop-codon (nonsense) mutations in each patient and each viral gene, and we averaged the rates estimated for each patient to calculate a single rate of evolution for each gene and mutation type. Nonsynonymous mutations are more common than synonymous mutations due to the structure of the genetic code, so we normalized evolutionary rates to the number of possible sites for each mutation type (Section 4). We limited our analysis to the six longest of the eight influenza virus genes for two reasons. First, there is less sampling noise in the estimates of evolutionary rates for longer genes because there are more sites. Second, the two shortest influenza genes have alternatively spliced and partially overlapping reading frames that complicate annotation of mutations as nonsynonymous or synonymous.
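One common way to count the sites available to each mutation type is to enumerate the nine possible single-nucleotide changes of every codon and assign each change one third of a site. The Python sketch below follows that convention and assumes that all nucleotide changes are equally likely; the procedure actually used in the study is described in Section 4 and may differ. The function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify(codon, pos, new_base):
    """Label a single-nucleotide change as 'S', 'NS', or 'Stop'."""
    mutant = codon[:pos] + new_base + codon[pos + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
    if new_aa == old_aa:
        return "S"
    return "Stop" if new_aa == "*" else "NS"


def count_sites(cds):
    """Count synonymous, nonsynonymous, and stop-codon sites in a coding sequence."""
    sites = {"S": 0.0, "NS": 0.0, "Stop": 0.0}
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3].upper()
        # Skip ambiguous codons and existing stop codons.
        if CODON_TABLE.get(codon, "*") == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    sites[classify(codon, pos, base)] += 1 / 3
    return sites


# Two codons (Met, Trp): no synonymous sites; two of the nine changes to TGG create stops.
sites = count_sites("ATGTGG")
```

Dividing the summed mutation frequencies of each class by its own site count, rather than by gene length, puts the three classes on a comparable per-site scale.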

To assess sources of variation in our estimates, we calculated evolutionary rates separately for each deep-sequencing dataset (Supplementary Fig. S4). We also used a second method to estimate evolutionary rates, performing linear regression of viral divergence by the time since the infection started (Supplementary Fig. S5). In both cases, the resulting evolutionary rates supported the qualitative conclusions described below.

To compare rates of evolution in acute and chronic influenza infections, we also calculated rates of influenza evolution in four chronic H3N2 infections that lasted multiple months (Xue et al. 2017) (Supplementary Fig. S6). We find roughly similar rates of evolution in acute and chronic infections, except in hemagglutinin (HA) and neuraminidase (NA), which appear to evolve especially rapidly in these chronic infections due to selection for antigenic variation and antiviral resistance respectively. However, the small number of chronic infections constrains our interpretation.

2.2 Purifying selection acts weakly within hosts to eliminate deleterious variants

What kinds of mutations reach detectable frequencies within hosts? We compared how quickly synonymous, nonsynonymous, and stop-codon (nonsense) mutations accumulate within hosts. Differences in how quickly these three classes of mutations accumulate can reveal how selection acts within hosts. Most synonymous mutations are nearly neutral; many nonsynonymous mutations are deleterious; and premature stop codons are lethal. In the absence of selection, all three types of mutations accumulate at the same rate. However, purifying selection purges deleterious mutations, and positive selection increases the frequency of any beneficial mutations.

We find that deleterious mutations are purged detectably but incompletely within infected individuals. Synonymous mutations accumulate about twice as quickly as nonsynonymous mutations within hosts, with some variation across genes (Fig. 3C). Stop-codon mutations accumulate even more slowly within hosts than nonsynonymous mutations, but they are not completely purged. The depletion of nonsynonymous and stop-codon mutations relative to synonymous mutations demonstrates that viral populations within hosts experience purifying selection, as previous studies have observed (Dinis et al. 2016; McCrone et al. 2018; Moncla et al. 2020). However, the presence of stop-codon mutations at frequencies as high as 0.5 per cent within hosts suggests that purifying selection remains incomplete, allowing some strongly deleterious mutations to persist long enough to reach detectable frequencies (McCrone et al. 2018). These results show that purifying selection acts detectably but weakly within hosts to reduce the frequencies of deleterious mutations.

2.3 Synonymous mutations accumulate neutrally between hosts, but nonsynonymous mutations experience purifying and positive selection on a global scale

How do mutations that arise within hosts fare as they circulate between hosts? We compared rates of evolution within hosts and globally to determine how selection acts across different scales of viral evolution. Neutral mutations should have similar rates of evolution within and between hosts. Purifying selection that acts between hosts to eliminate deleterious mutations will decrease global evolutionary rates compared to rates within hosts. Conversely, positive selection that acts on a global scale will increase rates of evolution between hosts relative to their within-host expectations for beneficial classes of mutations.

To calculate the global evolutionary rates of influenza virus, we analyzed H3N2 influenza sequences in the Global Initiative on Sharing All Influenza Data (GISAID) EpiFlu database, where each sequence represents the consensus sequence of one patient infection (Bogner et al. 2006). We used a molecular-clock method to estimate global evolutionary rates at synonymous and nonsynonymous sites in each gene (Fig. 3B and Supplementary Fig. S7).
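A molecular-clock estimate of this kind amounts to fitting a straight line to divergence per site against sampling time and reading off the slope. The following Python sketch uses ordinary least squares on made-up numbers; the function name `clock_rate`, the choice of years as the time unit, and the example values are assumptions for illustration only.

```python
def clock_rate(times, divergences):
    """Least-squares slope and intercept of divergence per site versus time."""
    n = len(times)
    if n < 2 or n != len(divergences):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_d = sum(divergences) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, divergences))
    slope = sxy / sxx  # substitutions per site per unit time
    intercept = mean_d - slope * mean_t
    return slope, intercept


# Made-up example: divergence from a reference strain grows by 0.002 per site per year.
years = [2010, 2012, 2014, 2016]
divergence = [0.002 * (y - 2010) for y in years]
slope, intercept = clock_rate(years, divergence)
```

To compare with within-host rates expressed per day, a per-year slope would need to be divided by the number of days in a year.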

Synonymous mutations accumulate at similar rates within hosts and globally (Fig. 3C). This concordance of evolutionary rates suggests that synonymous evolution in the global influenza population represents the simple accumulation of within-host variation through multiple patient infections, as expected for mutations that are nearly neutral (Kimura 1968). Measurements of viral variation within hosts are noisy and incomplete due to technical limitations of deep sequencing (McCrone and Lauring 2016), but this similarity in evolutionary rates between scales indicates that studies of within-host variation still capture representative slices of between-host evolution.

In contrast, nonsynonymous mutations accumulate more slowly globally than within hosts in all six viral genes analyzed. In addition, while stop-codon mutations accumulate at appreciable rates within hosts, they never fix during global evolution. These discrepancies in evolutionary rates within and between hosts suggest that many nonsynonymous mutations that reach detectable frequencies within hosts are later purged by purifying selection as viruses transmit and circulate between hosts.

An important exception to this trend occurs in the antigenic regions of the HA gene, which evolve more rapidly on a global scale than non-antigenic regions of HA (Fig. 4) (Caton et al. 1982; Fitch et al. 1997; Wolf et al. 2006; Koel et al. 2013). Within hosts, synonymous and nonsynonymous mutations accumulate at similar rates in HA antigenic regions as in the rest of the HA gene, suggesting that selection does not detectably favor antigenic mutations within hosts. However, nonsynonymous mutations in antigenic regions of HA accumulate more rapidly globally than they do within hosts. This observation suggests that antigenic selection at the between-host scale is responsible for amplifying the frequency of nonsynonymous antigenic mutations that arise within hosts.

Figure 4.

Antigenic and non-antigenic regions of hemagglutinin show similar patterns of variation within hosts, but antigenic regions evolve faster at the global scale. Within-host and global evolutionary rates were calculated as described in Fig. 3 for a set of previously defined antigenic sites (Wolf et al. 2006). Within-host rates are shown as the mean and standard error of all patient infections sequenced in the three datasets analyzed after removing outlier samples (see Section 4) (Dinis et al. 2016; Debbink et al. 2017; McCrone et al. 2018). Global rates are shown as the mean and standard error calculated through linear regression (in most cases, the standard errors are smaller than the point sizes). Global rates of stop-codon evolution are zero because no stop codons are observed in patient consensus sequences.

2.4 Transient deleterious variants within hosts are purged as viruses transmit and circulate between hosts

The previous section shows that nonsynonymous mutations are less common among mutations that fix globally than they are within hosts. This result suggests that purifying selection eliminates many nonsynonymous within-host variants before they fix in the global population of influenza viruses. However, it remains unclear from this analysis how quickly deleterious, nonsynonymous mutations that are generated within hosts are eliminated as viruses transmit and circulate between hosts.

We tracked the fraction of nonsynonymous mutations at a range of mutation frequencies within and between hosts to determine which classes of within-host mutations persist at the between-host scale (Fig. 5A). This analysis draws on the same logic that underlies the classical McDonald–Kreitman test for positive selection (McDonald and Kreitman 1991; Rand and Kann 1996; Bhatt, Katzourakis, and Pybus 2010; Bhatt, Holmes, and Pybus 2011; Garud et al. 2019). We expect purifying selection to purge deleterious nonsynonymous mutations from the viral population before they reach high mutation frequencies. In the absence of purifying selection, about 75 per cent of de novo mutations are expected to be nonsynonymous. However, purifying selection reduces the fraction of nonsynonymous mutations by removing deleterious mutations after they arise and before they fix. By examining when nonsynonymous mutations are depleted from circulating viral strains, we can identify where in the evolutionary process most purifying selection takes place (Fig. 5A).
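The figure of about 75 per cent can be checked by enumerating every single-nucleotide change to the 61 sense codons of the standard genetic code. The Python sketch below does this under the simplifying assumptions that all codons are equally frequent and all nucleotide changes equally likely, which ignores codon usage and the transition bias of real influenza mutations. The function name is illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def mutation_spectrum():
    """Count synonymous, nonsynonymous, and stop-creating changes over all sense codons."""
    counts = {"S": 0, "NS": 0, "Stop": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only mutations that start from a sense codon
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                new_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if new_aa == aa:
                    counts["S"] += 1
                elif new_aa == "*":
                    counts["Stop"] += 1
                else:
                    counts["NS"] += 1
    return counts


counts = mutation_spectrum()
# Fraction of amino-acid-changing mutations among synonymous plus nonsynonymous changes.
fraction_ns = counts["NS"] / (counts["S"] + counts["NS"])
```

Of the 549 possible changes, 134 are synonymous, 392 are nonsynonymous, and 23 create stop codons, so `fraction_ns` is roughly 0.75 under these assumptions.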

Figure 5.

Within-host viral populations harbor many transient deleterious mutations that are purged from the global influenza population. (A) The fraction of within-host and global mutations that are nonsynonymous reveals selective pressures that act across different evolutionary scales. In the absence of selection, about 75 per cent of de novo mutations are expected to be nonsynonymous. However, only about 20–50 per cent of the mutations that fix globally are nonsynonymous, suggesting that purifying selection acts within hosts and on globally circulating mutations to remove deleterious nonsynonymous variants. These plots illustrate two scenarios for how purifying selection acts on within-host and global viral populations. (B) The fraction of within-host and global mutations that are nonsynonymous. Within-host viral populations harbor many nonsynonymous mutations, though fewer than would be expected in the absence of selection (dashed line). Purifying selection acts strongly at transmission and in the early stages of global evolution to lower the fraction of nonsynonymous mutations. Dashed lines indicate the ∼75 per cent of de novo mutations that are expected to be nonsynonymous. The fraction of nonsynonymous mutations within hosts is shown as the mean and 95 per cent confidence interval of 100 bootstrap samples of the within-host viral populations. The fraction of nonsynonymous mutations in the global influenza population is shown as the mean, minimum, and maximum of the values calculated separately for the 2015–6, 2016–7, and 2017–8 H3N2 seasons.

Within hosts, about 55–70 per cent of mutations were nonsynonymous in each influenza gene (Fig. 5B). Because there are fewer nonsynonymous mutations within hosts than expected under de novo mutation alone, this result indicates that purifying selection acts detectably but weakly within hosts, supporting the conclusions of our analyses of evolutionary rates. Across a range of mutation frequencies within hosts, the fraction of nonsynonymous mutations remained relatively consistent, indicating that all mutations within hosts have experienced similar levels of purifying selection regardless of their mutation frequency.
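The bootstrap summary reported for the within-host fractions can be sketched as follows. This Python example resamples whole viral samples with replacement and reports the mean and a percentile-based 95 per cent interval; the percentile method, the input layout, and the function names are assumptions, since the text states only that 100 bootstrap samples were used.

```python
import random


def fraction_ns(samples):
    """Pooled fraction of nonsynonymous mutations, or None if there are no mutations."""
    ns = sum(s["NS"] for s in samples)
    total = ns + sum(s["S"] for s in samples)
    return ns / total if total else None


def bootstrap_fraction_ns(samples, n_boot=100, seed=0):
    """Mean and percentile 95 per cent interval over bootstrap resamples of samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        resampled = rng.choices(samples, k=len(samples))
        estimate = fraction_ns(resampled)
        if estimate is not None:
            estimates.append(estimate)
    if not estimates:
        raise ValueError("no mutations in any resample")
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    mean = sum(estimates) / len(estimates)
    return mean, lower, upper


# Made-up counts of nonsynonymous (NS) and synonymous (S) mutations per sample.
patients = [{"NS": 6, "S": 4}, {"NS": 3, "S": 2}, {"NS": 9, "S": 6}]
mean, lower, upper = bootstrap_fraction_ns(patients)
```

Resampling at the level of patients rather than individual mutations keeps mutations from the same infection together, which seems the more natural reading of "bootstrap samples of the within-host viral populations".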

In contrast, the fraction of nonsynonymous mutations drops sharply among viruses that circulate between hosts. More than 50 per cent of mutations within hosts are nonsynonymous, but only about 25 per cent of low-frequency between-host mutations are nonsynonymous in the polymerase and nucleoprotein genes (PB2, PB1, PA, and NP). The fraction of nonsynonymous mutations is even lower among between-host mutations that reach higher frequencies, and eventually, fewer than 20 per cent of the mutations that fix in these genes are nonsynonymous. Previous work has shown that rare variants in global viral populations are dominated by transient deleterious mutations (Pybus et al. 2007), so this decline in the proportion of nonsynonymous mutations at high frequencies likely occurs as purifying selection purges deleterious variation. Nevertheless, most deleterious nonsynonymous mutations are eliminated as these variants begin to transmit between hosts, before these mutations reach frequencies high enough to be detected by current global surveillance.

More nonsynonymous mutations are present between hosts in the viral surface proteins HA and NA. Nonsynonymous mutations are especially enriched between hosts in antigenic sites of HA, which experience strong positive selection on the global scale (Fig. 6) (Murphy, Kasel, and Chanock 1972; Caton et al. 1982; Fitch et al. 1997; Rambaut et al. 2008; Bhatt, Holmes, and Pybus 2011; Koel et al. 2013; Monto et al. 2015). In antigenic sites of HA, about 50 per cent of rare global mutations are nonsynonymous. The fraction of nonsynonymous mutations increases among more common mutations, and more than 75 per cent of mutations that eventually fix in the global population are nonsynonymous, as previously observed (Bhatt, Holmes, and Pybus 2011; Strelkowa and Lässig 2012). We hypothesize that purifying selection drives the initial decline in nonsynonymous, antigenic mutations among rare, circulating strains of influenza virus by purging deleterious mutations. Later, positive selection for antigenic variation likely increases the proportion of nonsynonymous mutations among commonly circulating strains. However, our analyses cannot distinguish the specific contributions of purifying and positive selection in shaping influenza’s global variation.

Figure 6.

Mutations in antigenic sites of HA experience increased positive selection as viruses transmit and circulate between hosts. The fractions of within-host and global mutations that are nonsynonymous were calculated as described in Fig. 5 for a set of previously defined antigenic sites (Wolf et al. 2006). Dashed lines indicate the ∼75 per cent of de novo mutations that are expected to be nonsynonymous. The fraction of nonsynonymous mutations within hosts is shown as the mean and 95 per cent confidence interval of 100 bootstrap samples of the within-host viral populations. The fraction of nonsynonymous mutations in the global influenza population is shown as the mean, minimum, and maximum of the values calculated separately for the 2015–6, 2016–7, and 2017–8 H3N2 seasons.

Our analyses show that transient deleterious mutations make up a large proportion of influenza’s genetic variation within hosts. Deleterious variation is common at the between-host scale as well (Pybus et al. 2007; Bhatt, Holmes, and Pybus 2011) and can slow global rates of influenza virus evolution (Illingworth and Mustonen 2012; Koelle and Rasmussen 2015), but our findings show that deleterious mutations are substantially more prevalent within hosts. These results also show that purifying selection acts strongly to remove deleterious variation as viruses begin to transmit and circulate between hosts.

3. Discussion

In this study, we quantitatively compared influenza virus’s genetic diversity within and between human hosts. Our results show that although viral diversity may seem low within acute human influenza infections, it is sufficient to explain the virus’s rapid global evolution. Synonymous mutations accumulate at similar rates within and between hosts, as expected for neutral genetic changes. Most nonsynonymous mutations that reach detectable frequencies within hosts are eventually purged between hosts, suggesting that within-host influenza diversity consists mostly of transient, deleterious mutations that are eliminated during transmission and global circulation. Antigenic sites of HA are the only region of the influenza genome where mutations are consistently favored between hosts beyond their frequencies within hosts.

Our findings show how purifying and positive selection act on viral populations at different evolutionary scales. Influenza populations within hosts accumulate genetic variation as they expand during the first few days of an infection. However, most of this variation, which is deleterious, is purged as viruses circulate between hosts. Recent estimates suggest that influenza’s transmission bottleneck consists of one to two viral genomes (McCrone et al. 2018; Xue and Bloom 2019), and further study is required to determine whether and how this narrow transmission bottleneck helps eliminate deleterious variants.

Our observations of a substantial excess of deleterious viral mutations within hosts extend prior work showing that globally circulating strains of influenza carry a deleterious mutational load that influences the dynamics of adaptation (Pybus et al. 2007; Illingworth and Mustonen 2012; Strelkowa and Lässig 2012; Koelle and Rasmussen 2015). The nearly neutral theory of molecular evolution predicts that these transient, deleterious mutations will persist and sometimes even reach fixation when effective population sizes are small (Ohta 1992). We demonstrate that the proportion of deleterious mutations is higher within hosts than it is globally. We suggest that purifying selection acts inefficiently within hosts because within-host populations have small effective population sizes due to frequent transmission bottlenecks (McCrone et al. 2018). Coinfection and genetic complementation within hosts can also reduce the efficiency of selection by masking the effects of deleterious variation, as shown by the accumulation of defective interfering particles in vitro (Davis, Hiti, and Nayak 1980; Frensing et al. 2013; Brooke 2017). In addition, between-host viral populations experience selection for transmissibility, which imposes greater selective constraint than replication within hosts. This selection for transmissibility likely plays an important role in the depletion of nonsynonymous mutations at the between-host scale (Lumby, Nene, and Illingworth 2018; McCrone and Lauring 2018).

This finding of a high deleterious viral mutational load within human influenza infections agrees with analyses of other RNA viruses that cause acute infections. Studies of dengue (Holmes 2003) and Lassa virus (Andersen et al. 2015) have likewise identified an excess of nonsynonymous mutations within hosts compared to global viral populations, suggesting that many RNA viruses with rapid mutation rates and frequent transmission bottlenecks may accumulate transient, deleterious variation within hosts.

Our finding that nonsynonymous rates of evolution are faster at within- than between-host timescales extends previous work suggesting that evolutionary rates are time dependent and decline over long timescales (Ho et al. 2011; Duchêne, Holmes, and Ho 2014), with transition–transversion ratios in particular declining over time (Duchêne, Ho, and Holmes 2015). In keeping with previous work, we suggest that incomplete purifying selection explains why nonsynonymous sites evolve more rapidly within hosts, even though synonymous sites evolve at similar rates at both scales (Ho et al. 2011; Duchêne, Holmes, and Ho 2014). As viruses circulate over longer timescales between hosts, there are more opportunities for purifying selection to remove deleterious, nonsynonymous mutations.

Our work also clarifies how antigenic selection shapes influenza evolution within and between hosts. Antigenic mutations are not detectably enriched within hosts relative to other nonsynonymous mutations, suggesting that positive selection primarily acts between rather than within hosts to favor the mutations that drive influenza’s global antigenic evolution. Most antigenic selection may result from antibodies that prevent the initiation of new infections, and antigenic variants may be most strongly favored at transmission, where they are more likely to found new infections (Petrova and Russell 2018; Han, Maurer-Stroh, and Russell 2019). However, positive selection within hosts that acts on antigenic mutations below our limit of detection may still increase the chance that antigenic variants transmit successfully and found new infections.

This finding that antigenic selection is limited in acute influenza virus infections contrasts with the rapid antigenic evolution that has been observed in chronic influenza virus infections, where multiple viral lineages carrying distinct antigenic mutations arise and compete within a single patient (McMinn et al. 1999; Xue et al. 2017). These differences in antigenic evolution between acute and chronic infections may reflect the fact that acute infections provide limited time for selection to have detectable effects on antigenic variation. Further study is required to identify the exact immune responses and mechanisms that drive influenza’s antigenic evolution.

Deep sequencing makes it possible to examine viral evolution at high resolution within natural human infections, but it has remained unclear how within-host genetic variation is transformed into the macroscopic evolutionary dynamics that occur at a global scale. Our work places influenza’s within-host genetic diversity in the context of its between-host evolution and provides a general framework for linking viral evolutionary dynamics across scales.

4. Materials and Methods

4.1 Variant calling and annotation

Here, we summarize our general variant-calling pipeline, with study-specific modifications described in more detail below. We used cutadapt version 1.8.3 to trim Nextera adapters, remove bases at the ends of reads with a Q-score below 24, and filter out reads whose remaining length was shorter than 20 bases (Martin 2011). To determine the subtype of each sample, we used bowtie2 version 2.2.3 on the --very-sensitive-local setting to map 1,000 reads from each sample to reference genomes for each subtype: A/Victoria/361/2011 (H3N2), A/California/04/2009 (pdmH1N1), and A/Boston/12/2007 (seasonal H1N1) (Langmead and Salzberg 2012). We classified the subtype of each sample based on which reference genome resulted in the highest mapping rate. For this study, we only analyzed samples of H3N2 influenza, which were the most common subtype sequenced in the datasets we analyzed.

For each sample, we mapped sequencing reads against the subtype reference genome using bowtie2 on the --very-sensitive-local setting (Langmead and Salzberg 2012). We tallied the counts of each nucleotide at each genome position and inferred the sample consensus sequence using custom scripts. We then remapped all reads from each sample against the sample consensus sequence using the --very-sensitive-local setting of bowtie2, and we removed duplicate reads using picard version 1.43 from the GATK suite (McKenna et al. 2010). We tallied the counts of each nucleotide at each genome position using custom scripts, discarding reads with a mapping score below 20 and bases with a Q-score below 20. We annotated mutations as synonymous or nonsynonymous based on their effect in the background of the sample consensus sequence.
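As an illustration of the annotation step, the following minimal Python sketch classifies a single-nucleotide change in the background of a consensus coding sequence. It is not the custom script used in this study; the function name `annotate_mutation`, the 0-based coordinate convention, and the separate "nonsense" label are assumptions made for the example.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code keyed by codon; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def annotate_mutation(consensus_cds, position, alt_base):
    """Classify a single-nucleotide change relative to the consensus.

    consensus_cds: in-frame coding sequence (string of A/C/G/T).
    position: 0-based index of the mutated nucleotide (assumed convention).
    alt_base: the minority base observed at that position.
    """
    codon_start = position - position % 3
    offset = position - codon_start
    ref_codon = consensus_cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"


# Third-position change AAA -> AAG in the second codon of a toy sequence.
print(annotate_mutation("ATGAAATGG", 5, "G"))
```

Because the effect is evaluated against the sample consensus rather than a fixed reference, the same nucleotide change can receive different labels in different samples.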

We defined sites of putative within-host variation as positions in the genome with sequencing coverage of at least 400 reads at which a minority base exceeded a frequency of 0.5 per cent. In some cases, we also assessed the effects of using variant-frequency thresholds of 1 per cent and 2 per cent (Supplementary Fig. S2).
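The thresholds above can be expressed as a short Python sketch. The input layout (one dictionary of base counts per genome position) and the function name `call_variants` are hypothetical, and treating the 0.5 per cent cutoff as a strict inequality is an assumption based on the word "exceeded".

```python
MIN_COVERAGE = 400     # reads, from the text
MIN_FREQUENCY = 0.005  # 0.5 per cent, from the text


def call_variants(counts_by_site, min_coverage=MIN_COVERAGE, min_frequency=MIN_FREQUENCY):
    """Return (site, base, frequency) for each minority base above threshold.

    counts_by_site: list of dicts mapping base -> read count, one per position
    (a hypothetical layout chosen for this example).
    """
    variants = []
    for site, counts in enumerate(counts_by_site):
        coverage = sum(counts.values())
        if coverage < min_coverage:
            continue  # too few reads to call variants at this site
        consensus_base = max(counts, key=counts.get)
        for base, count in counts.items():
            if base == consensus_base:
                continue
            frequency = count / coverage
            if frequency > min_frequency:
                variants.append((site, base, frequency))
    return variants


example = [{"A": 990, "G": 10}, {"A": 300, "G": 50}, {"C": 998, "T": 2}]
print(call_variants(example))
```

Raising `min_frequency` to 0.01 or 0.02 reproduces the alternative thresholds mentioned in the text.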

4.1.1 Study-specific modifications

We obtained sequencing data from the Dinis et al. (2016) study in the form of a single FASTQ file per sample containing first and second members of read pairs. We reconstructed read pairs for each sample using read-pair information in the FASTQ headers, and then we analyzed the reconstructed pairs as described above.

4.2 Sample exclusions

We first identified outlier samples with large amounts of within-host variation (Supplementary Fig. S3), since this high variation could be generated through coinfection or through poor sample quality and low viral load. For each study, we identified the top 10 per cent of samples based on the number of within-host variants above a frequency threshold of 0.5 per cent, and we excluded these samples from subsequent analyses. We also excluded samples that did not meet our minimum genome-coverage requirement that >80 per cent of the sites in each sequenced gene have ≥400× coverage, as well as plasmid controls.
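A minimal Python sketch of these two exclusion rules follows. The function names and input layouts are hypothetical, and rounding the 10 per cent cutoff down to a whole number of samples is an assumption; the text does not say how ties or fractional counts were handled.

```python
import math


def flag_outliers(variant_counts, top_fraction=0.10):
    """Return the sample IDs in the top `top_fraction` by number of variants.

    variant_counts: dict mapping sample ID -> number of within-host variants
    for a single study (hypothetical layout).
    """
    n_exclude = math.floor(len(variant_counts) * top_fraction)  # assumed rounding
    ranked = sorted(variant_counts, key=variant_counts.get, reverse=True)
    return set(ranked[:n_exclude])


def passes_coverage(coverage_by_gene, min_depth=400, min_fraction=0.80):
    """True if more than 80 per cent of sites in every gene have >= 400x coverage."""
    for depths in coverage_by_gene.values():
        covered = sum(1 for depth in depths if depth >= min_depth)
        if covered / len(depths) <= min_fraction:
            return False
    return True


counts = {f"sample{i}": i for i in range(20)}
print(sorted(flag_outliers(counts)))
print(passes_coverage({"HA": [500] * 9 + [100]}))
```

The outlier rule is applied per study, so `flag_outliers` would be called once for each dataset rather than on the pooled samples.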

The McCrone et al. (2018) study sequenced forty-three longitudinal pairs of samples. Each pair was collected from the same subject during a single illness, and the samples in a pair were collected 0–6 days apart. The first sample in each pair was typically collected by the study subject in a home setting, and the second sample was collected in a clinical setting. We expect samples in a longitudinal pair to have correlated patterns of viral diversity, so we excluded the earlier sample in each longitudinal pair.

After excluding these samples, we analyzed the remaining 308 of the original 411 samples of acute H3N2 influenza in these three datasets.

4.3 Within-host rates of evolution

To estimate within-host rates of evolution in acute influenza infections, we calculated the following quantity for each sequenced sample:

$$r = \frac{\sum_i f_i}{n\,t}$$

Here, the rate of evolution r is calculated by first summing the frequencies f of within-host mutations present at a frequency of at least 0.5 per cent, then dividing by the number of available sites n and the time t since the infection began. If multiple mutations are present at a single site, then we sum the frequencies of all of them. We calculate this quantity separately for nonsynonymous, synonymous, and nonsense (stop-codon) mutations in each influenza gene.
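The calculation described above can be sketched in a few lines of Python. The function name `evolutionary_rate` and the example numbers are illustrative assumptions, not values from this study.

```python
def evolutionary_rate(frequencies, n_sites, days_since_infection, min_frequency=0.005):
    """Sum variant frequencies at or above the threshold, then divide by
    the number of available sites and the time since infection began."""
    total = sum(f for f in frequencies if f >= min_frequency)
    return total / (n_sites * days_since_infection)


# Hypothetical sample: three variants above threshold and one below it,
# 1,000 available sites, sampled 4 days after infection began.
rate = evolutionary_rate([0.01, 0.02, 0.007, 0.001], n_sites=1000, days_since_infection=4)
print(rate)  # mutations per site per day
```

In this toy case the retained frequencies sum to 0.037, so the rate is 0.037 / (1000 × 4) mutations per site per day.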

To expand, we calculated the average divergence of each viral population from its consensus sequence by summing the frequencies of synonymous, nonsynonymous, and stop-codon (nonsense) mutations within hosts in each influenza gene (Fig. 3A). Random mutations to the genome are more likely to be nonsynonymous than synonymous because of the structure of the genetic code, so we normalized sample divergence by the number of sites available for each type of mutation. We counted the proportion of sites available for synonymous, nonsynonymous, and nonsense mutations in the A/Victoria/361/2011 (H3N2) reference genome using a modified version of the Nei and Gojobori method that tallies nonsense mutations separately from nonsynonymous mutations (Nei and Gojobori 1986). In this estimate, we assumed that transitions are about three times as common as transversions based on previous studies (Sanjuán et al. 2010; Bloom 2014; Pauly, Procario, and Lauring 2017). Using these methods, we calculated that 72 per cent of mutations would be nonsynonymous, 25 per cent synonymous, and 3 per cent nonsense. To calculate the number of sites available for each type of mutation, we multiplied the length of the gene by the proportion of available sites. We then divided the viral divergence in each gene by the number of available sites to obtain a per-site viral divergence. We excluded the overlapping M1/M2 and NS1/NEP pairs of genes from all subsequent analyses because their out-of-frame overlap makes it possible for a single mutation to have multiple effects in the two genes.
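The site-counting idea can be illustrated with the Python sketch below, which enumerates every single-nucleotide change in a coding sequence and tallies synonymous, nonsynonymous, and nonsense outcomes separately. It is only an approximation of the modified Nei and Gojobori procedure: the function name `site_proportions` is hypothetical, and weighting each transition three times as heavily as each individual transversion is one assumed reading of "about three times as common".

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def site_proportions(cds, transition_weight=3.0):
    """Weighted share of single-nucleotide changes in each effect class."""
    totals = {"synonymous": 0.0, "nonsynonymous": 0.0, "nonsense": 0.0}
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[codon]
        for offset, ref_base in enumerate(codon):
            for alt_base in BASES:
                if alt_base == ref_base:
                    continue
                alt_aa = CODON_TABLE[codon[:offset] + alt_base + codon[offset + 1:]]
                # Assumed weighting: transitions count 3x, each transversion 1x.
                weight = transition_weight if (ref_base, alt_base) in TRANSITIONS else 1.0
                if alt_aa == ref_aa:
                    totals["synonymous"] += weight
                elif alt_aa == "*":
                    totals["nonsense"] += weight
                else:
                    totals["nonsynonymous"] += weight
    grand_total = sum(totals.values())
    return {effect: value / grand_total for effect, value in totals.items()}


proportions = site_proportions("ATGTGGAAA")
available_sites = {k: 9 * v for k, v in proportions.items()}  # gene length x proportion
print(proportions, available_sites)
```

Multiplying the gene length by each proportion, as in the last lines, gives the number of available sites used to normalize divergence.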

To calculate rates of viral divergence per day, we normalized per-site viral divergence by the timing of sample collection. Of the 308 H3N2 samples that we analyzed, 251 had metadata describing the number of days post-symptom-onset after which the sample was collected. For the remaining samples sequenced by Dinis et al. (2016), we made use of aggregate data on the timing of sample collection relative to symptom onset for all H3N2 samples in the study, as we did not have access to data on the timing of sample collection for individual samples. The average timing of sample collection varied across studies (Supplementary Fig. S1). Symptom onset typically occurs 2–3 days after infection begins (Baccam et al. 2006; Carrat et al. 2008; Beauchemin and Handel 2011), so we added 2 days to the number of days post-symptom-onset to obtain the number of days postinfection (DPI). We calculated rates of viral divergence by dividing each sample’s per-site divergence by its DPI. We then calculated the mean and standard error of the viral divergence rates across all samples to obtain the values in Fig. 3C.
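The per-sample normalization and the summary statistics can be sketched as follows. The function names are hypothetical, and computing the standard error as the sample standard deviation divided by the square root of the number of samples is an assumption, since the text does not define it.

```python
import statistics

INCUBATION_DAYS = 2  # days added to days post-symptom-onset, from the text


def divergence_rate(per_site_divergence, days_post_symptom_onset):
    """Divide per-site divergence by days postinfection (DPI)."""
    dpi = days_post_symptom_onset + INCUBATION_DAYS
    return per_site_divergence / dpi


def mean_and_standard_error(rates):
    """Mean and standard error (assumed: sample stdev / sqrt(n)) of the rates."""
    mean = statistics.fmean(rates)
    standard_error = statistics.stdev(rates) / len(rates) ** 0.5
    return mean, standard_error


# Hypothetical per-site divergences paired with days post-symptom-onset.
samples = [(6e-5, 1), (8e-5, 2), (5e-5, 3)]
rates = [divergence_rate(divergence, days) for divergence, days in samples]
print(mean_and_standard_error(rates))
```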

For comparison, we calculated rates of evolution in acute patient infections through both the ‘point’ method described above, which averages rates of evolution calculated from each patient infection, and linear regression (Supplementary Fig. S5). We performed linear regression of per-site viral divergence by the sample DPI. Because we did not have access to sample-specific data on the timing of sample collection for the Dinis et al. (2016) dataset, we excluded the Dinis et al. (2016) samples from our estimate of evolutionary rates through linear regression.

To calculate within-host rates of evolution in antigenic sites of HA, we used the antigenic sites defined by Wolf et al. (2006), and we tallied the synonymous, nonsynonymous, and nonsense mutations at those codons.

4.4 Between-host evolutionary rates

We downloaded all full-length H3N2 influenza coding sequences in the GISAID EpiFlu database collected from 1 January 1999 to 31 December 2017 (Bogner et al. 2006). We randomly subsampled sequences for each gene to a maximum of 50 per year, we aligned these sequences to the coding regions of reference strains A/Moscow/10/1999 (H3N2) and A/Brisbane/10/2007 (H3N2) using the default settings of mafft version 7.407 (Katoh and Standley 2013), and we trimmed noncoding regions from each sequence. We calculated the synonymous and nonsynonymous distance between each sequence and both the Moscow/1999 and Brisbane/2007 reference sequences using custom scripts. We excluded outlier sequences whose distance substantially exceeded the distances of other sequences in that year, since these outlier sequences may have been misannotated. As in our calculation of within-host rates of evolution, we excluded the overlapping M1/M2 and NS1/NEP pairs of genes because their out-of-frame overlap makes it possible for a single mutation to have multiple effects in the two genes.

We used a molecular-clock approach to calculate global evolutionary rates. We performed linear regression of the distances from a reference sequence by the timing of sample collection to estimate the rate of between-host evolution (Fig. 3B and Supplementary Fig. S7). Molecular-clock methods can underestimate evolutionary rates over long periods of time as multiple mutations begin to occur at the same sites. To limit the effect of multiple-hit mutations on our analyses, we analyzed sequences collected over shorter intervals for more rapidly evolving genes. For the HA and NA genes, which evolve rapidly (Rambaut et al. 2008), we analyzed sequences from 2007 to 2017 relative to the Brisbane/2007 reference sequence. For the other genes, we analyzed a larger sequence set collected from 1999 to 2017 relative to the Moscow/1999 reference sequence. To calculate between-host rates of evolution in antigenic sites of HA, we used the antigenic sites defined by Wolf et al. (2006), and we tallied the synonymous and nonsynonymous mutations at those coding sites.
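The regression step of the molecular-clock approach can be illustrated with an ordinary least-squares fit in plain Python. This is a sketch under the assumption that a simple unweighted fit is adequate; the function name and the toy distances are hypothetical and do not come from the GISAID data.

```python
def least_squares_fit(times, distances):
    """Ordinary least-squares slope and intercept of distance against time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    return slope, intercept


# Hypothetical per-site distances from a 2007 reference, by collection year.
years = [2007, 2008, 2009, 2010]
distances = [0.0, 0.002, 0.004, 0.006]
slope, intercept = least_squares_fit(years, distances)
print(slope)  # substitutions per site per year
```

The slope is the estimated evolutionary rate; restricting the time window, as done for HA and NA, limits the downward bias that multiple hits at the same site introduce into the distances.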

4.5 Chronic evolutionary rates

We previously deep-sequenced longitudinal samples of influenza from four chronic H3N2 influenza infections (Xue et al. 2017). We downloaded deep-sequencing data for these chronic infections from BioProject PRJNA364676. We mapped reads and identified within-host variants above a frequency of 0.5 per cent as described above, except that we calculated within-host variant frequencies and annotated mutation effects relative to the consensus sequence of the influenza population at the first sequenced timepoint for each patient. After calling within-host variants, we calculated per-site viral divergence at each sequenced timepoint as described above, and we performed linear regression of per-site viral divergence by the time elapsed since the infection began for each patient. To obtain the aggregate evolutionary rates shown in Supplementary Fig. S6A, we calculated the mean and standard error of the evolutionary rates estimated for each patient.

4.6 Fraction of mutations that are nonsynonymous within and between hosts

To calculate the fraction of mutations that are nonsynonymous within hosts, we tallied the total number of synonymous and nonsynonymous mutations in each frequency bin for each sample. Stop-codon mutations were counted as nonsynonymous mutations in this analysis. We then summed the variants in each frequency bin across samples. We calculated confidence intervals by performing 100 bootstrap resamplings of the viral samples and plotted the 95 per cent confidence interval from these bootstrap replicates (Fig. 5B).
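The bootstrap over samples can be sketched as below for a single frequency bin. The function names, the fixed random seed, and the percentile method for reading off the interval are assumptions made for this example.

```python
import random


def fraction_nonsynonymous(samples):
    """samples: list of (n_synonymous, n_nonsynonymous) tallies, one per sample."""
    synonymous = sum(s for s, _ in samples)
    nonsynonymous = sum(n for _, n in samples)
    return nonsynonymous / (synonymous + nonsynonymous)


def bootstrap_interval(samples, n_replicates=100, level=0.95, seed=0):
    """Percentile interval from resampling the viral samples with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        resample = rng.choices(samples, k=len(samples))
        estimates.append(fraction_nonsynonymous(resample))
    estimates.sort()
    lower_index = int((1 - level) / 2 * n_replicates)
    upper_index = int((1 + level) / 2 * n_replicates) - 1
    return estimates[lower_index], estimates[upper_index]


# Hypothetical tallies for one frequency bin across five samples.
tallies = [(3, 9), (2, 5), (4, 10), (1, 6), (5, 8)]
print(fraction_nonsynonymous(tallies), bootstrap_interval(tallies))
```

Resampling whole samples, rather than individual variants, keeps variants from the same infection together in each replicate.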

To calculate the fraction of mutations that are nonsynonymous between hosts, we must be able to determine when the same mutation has arisen multiple times in the global influenza population. We downloaded all full-length H3N2 influenza coding sequences in the GISAID EpiFlu database collected from 1 July 2015 to 30 June 2018, together representing the 2015–6, 2016–7, and 2017–8 Northern Hemisphere flu seasons (Bogner et al. 2006). We grouped all strains into seasons beginning on 1 July and ending on 30 June and analyzed each season separately. Passaged sequences can generate false signals of positive selection due to lab-adaptation mutations (McWhite, Meyer, and Wilke 2016), so we retained only unpassaged sequences for downstream analyses. We aligned sequences from each season and each gene to the coding regions of reference strain A/Victoria/361/2011 (H3N2) using the default settings of mafft version 7.407 (Katoh and Standley 2013), and we trimmed noncoding regions from each sequence. We built a maximum-likelihood phylogeny for each gene and season using RAxML version 8.2.3 and the GTRCAT model (Stamatakis 2014). We rooted the resulting phylogenies with A/Victoria/361/2011 as the outgroup, and we reconstructed ancestral states at each node using TreeTime (Sagulenko, Puller, and Neher 2018). We wrote custom scripts that used the Bio.Phylo Biopython package (Talevich et al. 2012) to traverse each tree, identify the mutations at each node, assess whether the mutations had synonymous or nonsynonymous effects, and count the number of descendants from that node. For each mutation at a node, we also identified cases in which a second mutation occurred at the same codon in the descendant sequences, and we subtracted the descendant sequences carrying the second mutation from the total number of descendants of the original node. We calculated the frequencies of between-host variants by dividing the number of descendants by the total number of sequences in that season. Shaded intervals in Fig. 5B display the range of the proportion of nonsynonymous variants in the three seasons analyzed.
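The descendant-counting logic can be illustrated with a plain-Python stand-in for a phylogeny. This sketch does not use Bio.Phylo; the `Node` class, its `mutated_codons` attribute, and the function names are hypothetical, and mutations are reduced to the codon they affect.

```python
class Node:
    """Toy tree node; mutated_codons are codons changed on the branch leading here."""

    def __init__(self, mutated_codons=(), children=()):
        self.mutated_codons = set(mutated_codons)
        self.children = list(children)


def count_tips(node):
    if not node.children:
        return 1
    return sum(count_tips(child) for child in node.children)


def tips_retaining(node, codon):
    """Tips below `node`, skipping subtrees with a second mutation at `codon`."""
    if not node.children:
        return 1
    return sum(tips_retaining(child, codon)
               for child in node.children
               if codon not in child.mutated_codons)


def variant_frequencies(root):
    """Return (codon, frequency) for every mutation event in the tree."""
    n_sequences = count_tips(root)
    results = []
    stack = [root]
    while stack:
        node = stack.pop()
        for codon in node.mutated_codons:
            results.append((codon, tips_retaining(node, codon) / n_sequences))
        stack.extend(node.children)
    return results


# Codon 5 mutates on one branch, then again in a nested clade of two tips.
second_hit = Node(mutated_codons=[5], children=[Node(), Node()])
first_hit = Node(mutated_codons=[5], children=[Node(), Node(), second_hit])
root = Node(children=[first_hit, Node()])
print(sorted(variant_frequencies(root)))
```

In the toy tree, the first mutation at codon 5 has four descendant tips, but two of them carry a second change at the same codon and are subtracted before the frequency is computed.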

Data availability

Raw sequencing data are publicly available from the NCBI SRA database for BioProjects PRJNA577940 (Dinis et al. 2016), PRJNA344659 (Debbink et al. 2017), PRJNA412631 (McCrone et al. 2018), and PRJNA364676 (Xue et al. 2017). The computer code that performs the analysis is available at https://github.com/ksxue/within-vs-between-hosts-influenza. Sequences downloaded from the GISAID EpiFlu database are not available due to data-sharing restrictions, but acknowledgement tables for the sequences we analyzed are available at https://github.com/ksxue/within-vs-between-hosts-influenza/tree/master/data/GISAID/acknowledgements/H3N2.

Supplementary data

Supplementary data are available at Virus Evolution online.

Acknowledgements

We thank Tom Friedrich, Jorge Dinis, Kelsey Florek, Katarina Braun, Ed Belongia, and Jenny King for sharing the sequencing data and sample metadata for their earlier study, as well as Louise Moncla and Trevor Bedford for their helpful suggestions regarding these analyses. We also thank Allie Greaney, Louise Moncla, and Seungsoo Kim for their comments on the manuscript.

Funding

K.S.X. was supported by the Hertz Foundation Myhrvold Family Fellowship. J.D.B. was supported by grant R01AI127893 from the NIAID and as an Investigator of the Howard Hughes Medical Institute.

Conflict of interest: None declared.

References

  1. Alizon S., Fraser C. (2013) ‘Within-Host and Between-Host Evolutionary Rates across the HIV-1 Genome’, Retrovirology, 10: 49.
  2. Alizon S., Luciani F., Regoes R. R. (2011) ‘Epidemiological and Clinical Consequences of Within-Host Evolution’, Trends in Microbiology, 19: 24–32.
  3. Andersen K. G. et al. (2015) ‘Clinical Sequencing Uncovers Origins and Evolution of Lassa Virus’, Cell, 162: 738–50.
  4. Baccam P. et al. (2006) ‘Kinetics of Influenza A Virus Infection in Humans’, Journal of Virology, 80: 7590–9.
  5. Beauchemin C. A., Handel A. (2011) ‘A Review of Mathematical Models of Influenza A Infections within a Host or Cell Culture: Lessons Learned and Challenges Ahead’, BMC Public Health, 11: S7.
  6. Belshaw R., Sanjuán R., Pybus O. G. (2011) ‘Viral Mutation and Substitution: Units and Levels’, Current Opinion in Virology, 1: 430–5.
  7. Bhatt S., Holmes E. C., Pybus O. G. (2011) ‘The Genomic Rate of Molecular Adaptation of the Human Influenza A Virus’, Molecular Biology and Evolution, 28: 2443–51.
  8. Bhatt S., Katzourakis A., Pybus O. G. (2010) ‘Detecting Natural Selection in RNA Virus Populations Using Sequence Summary Statistics’, Infection, Genetics and Evolution, 10: 421–30.
  9. Bloom J. D. (2014) ‘An Experimentally Determined Evolutionary Model Dramatically Improves Phylogenetic Fit’, Molecular Biology and Evolution, 31: 1956–78.
  10. Bogner P. et al. (2006) ‘A Global Initiative on Sharing Avian Flu Data’, Nature, 442: 981.
  11. Brooke C. B. (2017) ‘Population Diversity and Collective Interactions during Influenza Virus Replication and Evolution’, Journal of Virology, 91: JVI.01164-17.
  12. Carrat F. et al. (2008) ‘Time Lines of Infection and Disease in Human Influenza: A Review of Volunteer Challenge Studies’, American Journal of Epidemiology, 167: 775–85.
  13. Caton A. J. et al. (1982) ‘The Antigenic Structure of the Influenza Virus A/PR/8/34 Hemagglutinin (H1 Subtype)’, Cell, 31: 417–27.
  14. Davis A. R., Hiti A. L., Nayak D. P. (1980) ‘Influenza Defective Interfering Viral RNA Is Formed by Internal Deletion of Genomic RNA’, Proceedings of the National Academy of Sciences of the United States of America, 77: 215–9.
  15. Debbink K. et al. (2017) ‘Vaccination Has Minimal Impact on the Intrahost Diversity of H3N2 Influenza Viruses’, PLOS Pathogens, 13: e1006194.
  16. Dinis J. M. et al. (2016) ‘Deep Sequencing Reveals Potential Antigenic Variants at Low Frequencies in Influenza A Virus-Infected Humans’, Journal of Virology, 90: 3355–65.
  17. Duchêne S., Ho S. Y., Holmes E. C. (2015) ‘Declining Transition/Transversion Ratios through Time Reveal Limitations to the Accuracy of Nucleotide Substitution Models’, BMC Evolutionary Biology, 15: 36.
  18. Duchêne S., Holmes E. C., Ho S. Y. W. (2014) ‘Analyses of Evolutionary Dynamics in Viruses Are Hindered by a Time-Dependent Bias in Rate Estimates’, Proceedings of the Royal Society B: Biological Sciences, 281: 20140732.
  19. Dyrdak R. et al. (2019) ‘Intra- and Interpatient Evolution of Enterovirus D68 Analyzed by Whole-Genome Deep Sequencing’, Virus Evolution, 5: vez007.
  20. Fitch W. M. et al. (1997) ‘Long Term Trends in the Evolution of H(3) HA1 Human Influenza Type A’, Proceedings of the National Academy of Sciences of the United States of America, 94: 7712–8.
  21. Frensing T. et al. (2013) ‘Continuous Influenza Virus Production in Cell Culture Shows a Periodic Accumulation of Defective Interfering Particles’, PLoS One, 8: e72288.
  22. Gallet R. et al. (2017) ‘The Number of Target Molecules of the Amplification Step Limits Accuracy and Sensitivity in Ultra Deep Sequencing Viral Population Studies’, Journal of Virology, 91: JVI.00561-17.
  23. Garud N. R. et al. (2019) ‘Evolutionary Dynamics of Bacteria in the Gut Microbiome within and across Hosts’, PLoS Biology, 17: e3000102.
  24. Gray R. R. et al. (2011) ‘The Mode and Tempo of Hepatitis C Virus Evolution within and among Hosts’, BMC Evolutionary Biology, 11: 131.
  25. Grubaugh N. D. et al. (2019) ‘An Amplicon-Based Sequencing Framework for Accurately Measuring Intrahost Virus Diversity Using PrimalSeq and iVar’, Genome Biology, 20: 8.
  26. Han A. X., Maurer-Stroh S., Russell C. A. (2019) ‘Individual Immune Selection Pressure Has Limited Impact on Seasonal Influenza Virus Evolution’, Nature Ecology & Evolution, 3: 302–11.
  27. Herbeck J. T. et al. (2006) ‘Human Immunodeficiency Virus Type 1 Env Evolves toward Ancestral States upon Transmission to a New Host’, Journal of Virology, 80: 1637–44.
  28. Ho S. Y. W. et al. (2011) ‘Time-Dependent Rates of Molecular Evolution’, Molecular Ecology, 20: 3087–101.
  29. Holmes E. C. (2003) ‘Patterns of Intra- and Interhost Nonsynonymous Variation Reveal Strong Purifying Selection in Dengue Virus’, Journal of Virology, 77: 11296–8.
  30. Illingworth C. J. R., Mustonen V. (2012) ‘Components of Selection in the Evolution of the Influenza Virus: Linkage Effects Beat Inherent Selection’, PLoS Pathogens, 8: e1003091.
  31. Illingworth C. J. R. et al. (2017) ‘On the Effective Depth of Viral Sequence Data’, Virus Evolution, 3: vex030.
  32. Katoh K., Standley D. M. (2013) ‘MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability’, Molecular Biology and Evolution, 30: 772–80.
  33. Kimura M. (1968) ‘Evolutionary Rate at the Molecular Level’, Nature, 217: 624–6.
  34. Koel B. F. et al. (2013) ‘Substitutions near the Receptor Binding Site Determine Major Antigenic Change During Influenza Virus Evolution’, Science, 342: 976–9.
  35. Koelle K., Rasmussen D. A. (2015) ‘The Effects of a Deleterious Mutation Load on Patterns of Influenza A/H3N2’s Antigenic Evolution in Humans’, eLife, 4: e07361.
  36. Langmead B., Salzberg S. L. (2012) ‘Fast Gapped-Read Alignment with Bowtie 2’, Nature Methods, 9: 357–9.
  37. Lemey P., Rambaut A., Pybus O. G. (2006) ‘HIV Evolutionary Dynamics within and among Hosts’, AIDS Reviews, 8: 125–40.
  38. Lumby C. K., Nene N. R., Illingworth C. J. R. (2018) ‘A Novel Framework for Inferring Parameters of Transmission from Viral Sequence Data’, PLoS Genetics, 14: e1007718.
  39. Lythgoe K. A., Fraser C. (2012) ‘New Insights into the Evolutionary Rate of HIV-1 at the Within-Host and Epidemiological Levels’, Proceedings of the Royal Society B: Biological Sciences, 279: 3367–75.
  40. McCrone J. T., Lauring A. S. (2016) ‘Measurements of Intrahost Viral Diversity Are Extremely Sensitive to Systematic Errors in Variant Calling’, Journal of Virology, 90: 6884–95.
  41. McCrone J. T., Lauring A. S. (2018) ‘Genetic Bottlenecks in Intraspecies Virus Transmission’, Current Opinion in Virology, 28: 20–5.
  42. McCrone J. T. et al. (2018) ‘Stochastic Processes Constrain the within and between Host Evolution of Influenza Virus’, eLife, 7: e35962.
  43. McDonald J. H., Kreitman M. (1991) ‘Adaptive Protein Evolution at the Adh Locus in Drosophila’, Nature, 351: 652–4.
  44. McKenna A. et al. (2010) ‘The Genome Analysis Toolkit: A MapReduce Framework for Analyzing Next-Generation DNA Sequencing Data’, Genome Research, 20: 1297–303.
  45. McMinn P. et al. (1999) ‘Antigenic Drift of Influenza A (H3N2) Virus in a Persistently Infected Immunocompromised Host Is Similar to That Occurring in the Community’, Clinical Infectious Diseases, 29: 456–8.
  46. McWhite C. D., Meyer A. G., Wilke C. O. (2016) ‘Sequence Amplification via Cell Passaging Creates Spurious Signals of Positive Adaptation in Influenza Virus H3N2 Hemagglutinin’, Virus Evolution, 2: vew026.
  47. Martin M. (2011) ‘Cutadapt Removes Adapter Sequences from High-Throughput Sequencing Reads’, EMBnet.Journal, 17: 10.
  48. Mideo N., Alizon S., Day T. (2008) ‘Linking Within- and Between-Host Dynamics in the Evolutionary Epidemiology of Infectious Diseases’, Trends in Ecology & Evolution, 23: 511–7.
  49. Moncla L. H. et al. (2020) ‘Quantifying within-Host Diversity of H5N1 Influenza Viruses in Humans and Poultry in Cambodia’, PLOS Pathogens, 16: e1008191.
  50. Monto A. S. et al. (2015) ‘Antibody to Influenza Virus Neuraminidase: An Independent Correlate of Protection’, Journal of Infectious Diseases, 212: 1191–9.
  51. Murphy B. R., Kasel J. A., Chanock R. M. (1972) ‘Association of Serum Anti-Neuraminidase Antibody with Resistance to Influenza in Man’, New England Journal of Medicine, 286: 1329–32.
  52. Nei M., Gojobori T. (1986) ‘Simple Methods for Estimating the Numbers of Synonymous and Nonsynonymous Nucleotide Substitutions’, Molecular Biology and Evolution, 3: 418–26.
  53. Nobusawa E., Sato K. (2006) ‘Comparison of the Mutation Rates of Human Influenza A and B Viruses’, Journal of Virology, 80: 3675–8.
  54. Ohta T. (1992) ‘The Nearly Neutral Theory of Molecular Evolution’, Annual Review of Ecology and Systematics, 23: 263–86.
  55. Pauly M. D., Procario M. C., Lauring A. S. (2017) ‘A Novel Twelve Class Fluctuation Test Reveals Higher than Expected Mutation Rates for Influenza A Viruses’, eLife, 6: e26437.
  56. Petrova V. N., Russell C. A. (2018) ‘The Evolution of Seasonal Influenza Viruses’, Nature Reviews Microbiology, 16: 47–60.
  57. Pybus O. G., Rambaut A. (2009) ‘Evolutionary Analysis of the Dynamics of Viral Infectious Disease’, Nature Reviews Genetics, 10: 540–50.
  58. Pybus O. G. et al. (2007) ‘Phylogenetic Evidence for Deleterious Mutation Load in RNA Viruses and Its Contribution to Viral Evolution’, Molecular Biology and Evolution, 24: 845–52.
  59. Raghwani J. et al. (2018) ‘Evolution of HIV-1 within Untreated Individuals and at the Population Scale in Uganda’, PLoS Pathogens, 14: e1007167.
  60. Rambaut A. et al. (2008) ‘The Genomic and Epidemiological Dynamics of Human Influenza A Virus’, Nature, 453: 615–9.
  61. Rand D. M., Kann L. M. (1996) ‘Excess Amino Acid Polymorphism in Mitochondrial DNA: Contrasts among Genes from Drosophila, Mice, and Humans’, Molecular Biology and Evolution, 13: 735–48.
  62. Sagulenko P., Puller V., Neher R. A. (2018) ‘TreeTime: Maximum-Likelihood Phylodynamic Analysis’, Virus Evolution, 4: vex042.
  63. Sanjuán R. et al. (2010) ‘Viral Mutation Rates’, Journal of Virology, 84: 9733–48.
  64. Stamatakis A. (2014) ‘RAxML Version 8: A Tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies’, Bioinformatics, 30: 1312–3.
  65. Strelkowa N., Lässig M. (2012) ‘Clonal Interference in the Evolution of Influenza’, Genetics, 192: 671–82.
  66. Suarez-Lopez P., Ortin J. (1994) ‘An Estimation of the Nucleotide Substitution Rate at Defined Positions in the Influenza Virus Haemagglutinin Gene’, Journal of General Virology, 75: 389–93.
  67. Talevich E. et al. (2012) ‘Bio.Phylo: A Unified Toolkit for Processing, Analyzing and Visualizing Phylogenetic Trees in Biopython’, BMC Bioinformatics, 13: 209.
  68. Wolf Y. I. et al. (2006) ‘Long Intervals of Stasis Punctuated by Bursts of Positive Selection in the Seasonal Evolution of Influenza A Virus’, Biology Direct, 1: 34.
  69. Xue K. S., Bloom J. D. (2019) ‘Reconciling Disparate Estimates of Viral Genetic Diversity during Human Influenza Infections’, Nature Genetics, 51: 1298–301.
  70. Xue K. S. et al. (2017) ‘Parallel Evolution of Influenza across Multiple Spatiotemporal Scales’, eLife, 6: e26875.
  71. Xue K. S. et al. (2018) ‘Within-Host Evolution of Human Influenza Virus’, Trends in Microbiology, 26: 781–93.
  72. Zanini F. et al. (2015) ‘Population Genomics of Intrapatient HIV-1 Evolution’, eLife, 4: e11282.
  73. Zhao L., Illingworth C. J. R. (2019) ‘Measurements of Intrahost Viral Diversity Require an Unbiased Diversity Metric’, Virus Evolution, 5: vey041.


Articles from Virus Evolution are provided here courtesy of Oxford University Press