Natural selection, genetic fitness and the role of math–part two

I’ve been thinking some more about this issue–the idea that selection should tend to favor those genotypes with the smallest temporal variations in fitness, for a given mean fitness value (above 1.00). It’s taken some time to work through this and get a grip on what’s going on and some additional points have emerged.

The first point is that, although I surely don’t know the entire history, the idea appears to be strictly theoretical, derived mathematically from modeling. At least, that’s how it appears from the several descriptions that I’ve read, including Orr’s, and this one. These all discuss mathematics–geometric and arithmetic means, absolute and relative fitness, etc.–and make no mention of any empirical origins.

The reason should be evident from Orr’s experimental description, in which he sets up ultra-simplified conditions where the several other important factors that can alter genotype frequencies over generations are held constant. The point is that in a real-world experimental test you would also have to control for these things, either experimentally or statistically, and that would not be easy. It’s hard to see why anybody would go to such trouble if the theory weren’t there to suggest the possibility in the first place. There is much more to say on the issue of empirical evidence. Given that it’s an accepted idea, and that testing it as the generalization it claims to be is difficult, the theoretical foundation had better be very solid. Well, I can readily conceive of two strictly theoretical reasons why the idea might well be suspect. For time’s sake, I’ll focus on just one of those here.

The underlying basis of the argument is that, if a growth rate (interest rate, absolute fitness, whatever) is perfectly constant over time, the product of the series gives the total change at the final time point, but if it is made non-constant, by varying it around that rate, then the final value–and thus the geometric mean–will decline. The larger the variance around that rate, the greater the decline. For example, suppose quantity A increases by 2% per unit time interval, that is, F = 1.020. Measuring time in generations (g) here, after g = 35 generations, A(35) = A(0) × F^g = A(0) × 1.020^35 ≈ 2.0 × A(0); A is doubled in 35 generations. The geometric (and arithmetic) mean over the 35 generations is 1.020, because all the per-generation rates are identical. Now cause F to instead vary around 1.02 by setting it as the mean of a normal distribution with some arbitrarily chosen standard deviation, say 0.2. The geometric mean of the series will then drop (on average, asymptotically) to just below 1.0 (~ 0.9993). Since the geometric mean is what matters, genotype A will then not increase at all–it will instead stay about the same.

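# Evenly spaced quantiles of a normal distribution with mean 1.02 and sd 0.2
pstep = 0.00001; probs = seq(pstep, 1-pstep, pstep)
q = qnorm(p=probs, mean=1.02, sd=0.2)
# Geometric mean = exponential of the mean of the logs
gm = exp(mean(log(q))); gm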
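To see how the decline scales with the spread, the same calculation can be wrapped in a function and run over several standard deviations. This is only a sketch: the helper name gm_normal, its argument names and the particular sd values are illustrative assumptions, and it relies on the same evenly spaced quantile grid rather than random draws.

```r
# Geometric mean of fitness values spread normally around an arithmetic mean.
# gm_normal, mu and sigma are hypothetical names chosen for this sketch.
gm_normal <- function(mu, sigma, pstep = 0.00001) {
  probs <- seq(pstep, 1 - pstep, pstep)      # evenly spaced probabilities
  q <- qnorm(probs, mean = mu, sd = sigma)   # matching normal quantiles
  stopifnot(all(q > 0))                      # log() needs positive fitness values
  exp(mean(log(q)))                          # geometric mean
}

# Same arithmetic mean (1.02), increasing spread (sd values assumed for illustration)
sds <- c(0.05, 0.10, 0.20)
gms <- sapply(sds, function(s) gm_normal(mu = 1.02, sigma = s))

print(data.frame(sd = sds, geometric_mean = round(gms, 4)))
```

The sd = 0.2 row should reproduce the ~0.9993 figure quoted above, with the smaller sd values landing between that and 1.02.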

This is a very informative result. Using and extending it, now imagine an idealized population with two genotypes, A and B, in a temporally unvarying selection environment, with equal starting frequencies, A = B = 0.50. Since the environment doesn’t vary, there is no selection on either, that is F.A = F.B = 1.0 and they will thus maintain equal relative frequencies over time. Now impose a varying selection environment where sometimes conditions favor survival of A, other times B. We would then repeat the above exercise, except that now the mean of the distribution we construct is 1.000, not 1.020. The resulting geometric mean fitness of each genotype is now 0.9788 (just replace 1.02 with 1.00 in the above code).

So what’s going to happen? Extinction, that’s what. After 35 generations, each will be down to 0.9788^35 = 0.473 of its starting value, on average, and on the way to zero. The generalization is that any population having genotypes of ~ equal arithmetic mean (absolute) fitness and normally distributed values around that mean, will have all genotypes driven to extinction, and at a rate proportional to the magnitude of the variance. If instead one genotype has an arithmetic mean fitness above a threshold value determined by its mean and variance, while all others are below it, then the former will be driven to fixation and the latter to extinction. These results are not tenable–this is decidedly not what we see in nature. We instead see lots of genetic variation, including vast amounts maintained over vast expanses of time. I grant that this is a fairly rough and crude test of the idea, but not an unreasonable one. Note that this also points up the potentially serious problem caused by using relative, instead of absolute, fitness, but I won’t get into that now.
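The 35-generation figure, and the threshold mean at which a genotype neither grows nor declines, can both be computed with the same quantile-grid approach. The following is a rough sketch assuming normally distributed absolute fitness with sd 0.2; the helper name gm_normal and the search interval passed to uniroot are assumptions made for illustration.

```r
# Geometric mean of normally distributed fitness values (hypothetical helper)
gm_normal <- function(mu, sigma, pstep = 0.00001) {
  probs <- seq(pstep, 1 - pstep, pstep)
  q <- qnorm(probs, mean = mu, sd = sigma)
  stopifnot(all(q > 0))                      # log() needs positive fitness values
  exp(mean(log(q)))
}

# Arithmetic mean 1.00, sd 0.2: per-generation geometric mean and 35-generation outcome
gm_100 <- gm_normal(mu = 1.00, sigma = 0.2)
remaining_35 <- gm_100^35

# Arithmetic mean at which the geometric mean equals 1 (log of it equals 0).
# The interval c(1.0, 1.1) is an assumed bracket for the root.
threshold <- uniroot(function(m) log(gm_normal(m, 0.2)),
                     interval = c(1.0, 1.1), tol = 1e-8)$root

cat("geometric mean at arithmetic mean 1.00:", round(gm_100, 4), "\n")
cat("fraction remaining after 35 generations:", round(remaining_35, 3), "\n")
cat("threshold arithmetic mean for sd 0.2:", round(threshold, 4), "\n")
```

With sd 0.2 the threshold comes out slightly above 1.02, which is consistent with the geometric mean of ~0.9993 found for an arithmetic mean of exactly 1.02.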

Extinction of course happens in nature all the time, but what we observe in nature is the result of successful selection–populations and species that survived. We know, without question, that environments vary–wildly, any and all aspects thereof, at all scales, often. And we also know without question that selection certainly can and does favor the most fit genotypes in those environments. Those processes are all operating but we don’t observe a world in which alleles are either eliminated or fixed. The above examples cannot be accurate mathematical descriptions of a surviving species’ variation in fitness over time–something’s wrong.

The “something wrong” is the designation of normally distributed variation, or more exactly, symmetrically distributed variation. To keep a geometric mean from departing from its no-variance value, one must skew the distribution around the mean value, such that values above it are inverses, relative to the mean, of those below it (a value of mean × r above is matched by a value of mean / r below)–that is the straightforward way to create a stable geometric mean while varying the individual values. [EDIT: more accurately, the product of all the values, those above and those below the mean, must equal the mean raised to the power of the number of values, but the values will be skewed in any case.] Mathematically, the way to do so is to work with the logarithms of the original values–the log of the geometric mean is designated as the mean of normally distributed logarithms of the individual values, of whatever size variance one wants. Exponentiation of the sum of the logarithms will equal the product of the fitness series.
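The log-scale construction just described can be sketched in a few lines: the normal distribution is placed on the logarithms of fitness rather than on fitness itself. The variable names, the target geometric mean of 1.00 and the log-scale sd of 0.2 are all assumptions chosen for illustration.

```r
pstep <- 0.00001
probs <- seq(pstep, 1 - pstep, pstep)

target_gm <- 1.00   # desired geometric mean fitness (assumed value)
log_sd    <- 0.2    # spread of the log-fitness values (assumed value)

# Normally distributed logarithms centred on log(target_gm)
log_w <- qnorm(probs, mean = log(target_gm), sd = log_sd)
w <- exp(log_w)     # fitness values: always positive, skewed to the right

gm <- exp(mean(log_w))  # geometric mean, which stays at target_gm
am <- mean(w)           # arithmetic mean, which sits above the geometric mean

# Exponentiating the sum of the logs reproduces the product of a fitness series
series <- w[c(1000, 25000, 50000, 75000, 90000)]
same_product <- isTRUE(all.equal(prod(series), exp(sum(log(series)))))

cat("geometric mean:", round(gm, 4), "\n")
cat("arithmetic mean:", round(am, 4), "\n")
cat("product equals exp(sum of logs):", same_product, "\n")
```

Changing log_sd alters the arithmetic mean and the skew but leaves the geometric mean at target_gm, and setting target_gm to 1.02 instead of 1.00 works the same way.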

Hopefully, what I’m driving at is emerging. If the variance structure must obey this mathematical necessity to preserve a genotype’s mean fitness at 1.00, while still allowing the individual series values to vary…then why should we not expect the same to hold true when the mean geometric fitness is not equal to 1.00? I would argue that that’s exactly what we should expect, and that Gillespie’s original arguments–and Orr’s, and others’ summaries thereof–are not particularly defensible theoretical expectations of what is likely to be happening in nature. Specifically, the idea that the variance in fitness around an arithmetic mean should necessarily arise from symmetrically (normally) distributed values, is questionable.

As alluded to above, there is (at least) a second theoretical argument as well, but I don’t have time to get into it now (nor for this one for that matter). Suffice it to say that it involves simultaneous temporal changes in total population size and selective environments. All this without even broaching the entire hornet’s nest of empirically testing the idea, a topic reviewed five years ago by Simons. For starters, it’s not clear to me just how conservative “bet hedging” could ever be distinguished from the effects of phenotypic plasticity.


Simons, A.M. (2011) Modes of response to environmental change and the elusive empirical evidence for bet hedging. doi:10.1098/rspb.2011.0176

Other references are linked to in the previous post.


6 thoughts on “Natural selection, genetic fitness and the role of math–part two”

  1. Hi Jim,
    I’ve been having trouble getting here on one computer but decided to work it out so I could respond to the last couple posts. Utterly fascinating for me, you have worked through some things I have been thinking about for a long time. While slowly digesting the math, I had a few empirical topics in mind as I was reading. You finished with this:

    “For starters, it’s not clear to me just how conservative “bet hedging” could ever be distinguished from the effects of phenotypic plasticity.”

    I was thinking of Pinus contorta, with considerable phenotype plasticity but perhaps irrelevant genotype plasticity. The advantages of this are clear. As long as there are environmental conditions somewhere for which the essentially fixed genome has high fitness, and that habitat can be exploited, the genome will persist. Pinus contorta is a colonizer of marginal habitats to which it is pre-adapted.
    Other plant groups seem to be doing the opposite. Both Streptanthus and the tribe that used to be known as Hesperolinon (now rolled into Linum under cladistics) have considerable in-species genotypic and phenotypic variation, and multiple closely related species that have considerable phenotypic variation. This seems to be an extremely risky approach in which habitats that used to be suitable no longer work because the genome has evolved away from fitness. These plants seem to have little ability to compete. They basically never compete with each other so besides perhaps fecundity, in-group selection is nil. These plants are colonizers of marginal habitats to which they are not pre-adapted.

    If I am connecting the dots correctly, Goldschmidt considered symmetric variation around the mean to be noise. I think he was right and I think you showed it. Does that explain Pinus contorta genotypic variation that does not seem to produce any effective phenotypic variation despite environmental conditions throughout the range that entice adaptation?

    It is indeed a real messy real world out there.

    • Matt! Glad to hear from you, thought maybe you’d met up with a sasquatch on less-than-ideal terms or something.

      Hang on, be a bit before I can respond.

    • I’m not sure about the specifics of genetic variability in lodgepole pine but I would bet that there’s a lot of it, given the enormous range of the species and the fact that conifers, and trees generally, have high amounts. For example, there are great variations in cone serotiny, both range-wide and just within specific locales in the Rocky Mountains, and the trait is pretty clearly genetically determined. But there’s undoubtedly some serious phenotypic plasticity as well (itself genetically determined)–just look at the rapid transition to a krummholz (shrubby) growth form over short vertical distances near treeline (many conifers exhibit this). We can safely assume that there’s little reproductive isolation over those short distances and thus, that the growth form is highly plastic for any given genotype. [There was recently a paper arguing for the evolutionary importance of selection on plastic phenotypes but damned if I can find it now.]

      What Goldschmidt thought about genetic variation generally I’m not sure. I do see that conservative bet hedging (CBH), and plasticity, fall in the same class in that, eventually, there’s no selection on genetic variation going on. So, would CBH advocates then likely be partial to arguments of plasticity? If not, how would they explain the survival of a genetically un-varying population/species? Just by some vanilla phenotype that’s neither highly nor poorly adapted, across environments, but “good enough”? I much doubt that as an explanation holding water generally. You have to invoke plasticity at some point, and we know trainloads from molecular and cell biology that there are multiple avenues for such responses at the level of gene and protein regulation.

      However, that doesn’t mean I discount the case for selection on genotypic variability. Not sure what Goldschmidt might have meant by “noise”–true that sometimes it will exist but be irrelevant to selection, due to there being no particular advantage for any genotype (which is what I assume he would mean), but other times this will decidedly not be so. You can still have high variance in fitness of different genotypes, each of which gets selected for, then against, back and forth, temporally. The only real problem is that you do not want genotypes with extremely low fitness–that can drive that genotype (and possibly the population as a whole) to extinction. But whether it does, and how long it takes if it does, will depend entirely on the specifics of the fitness distribution–specifically, whether the low values are offset by high fitness values such that the product of all possible values is 1.0 or above. Note that that’s different than what Orr and others are arguing–they’re saying it’s strictly a matter of the variance in fitness. Well, it isn’t–that’s clearly a wrong mathematical conclusion.

      Not quite sure if I’m addressing your exact point though. My main point with the plasticity comment was that I really don’t know how you’d ever definitively test the idea as a general law operating across taxa. Maybe one can set up some bacterial system, and particular gene, and demonstrate that you can produce what Orr describes, but so what? That’s a far cry from demonstrating anything ubiquitous and important.

    • The Goldschmidt stuff I was remembering ended up being a bit far afield from this discussion.

      I really like your response to me above. You’ve obviously spent some time on this. You wrote:

      “You can still have high variance in fitness of different genotypes, each of which gets selected for, then against, back and forth, temporally.”

      That’s what happens with Pinus contorta, IMO. Cone serotiny is a perfect example of a variable trait that will never result in any directional change in the basic genome.
      I’ve now read the papers you linked and also the exchange at Dynamic Ecology. Unfortunately, the basic math that is “well known by everyone in the field” is all new to me so very slow going. Back 20+ years ago I developed my own intuitive ideas about what I called (in engineering lingo) “maintaining a stable (genetic) baseline.” I did discover that my ideas mapped closely to the concept of stabilizing selection, and now “conservative bet hedging” seems to be part of that as well. So I think the concept is valid, but the simplifications that have to be made about fitness, selection, and adaptation in order to exploit simple mathematical representations are problematical to me. Level of selection also seems important to me. I would argue that individual phenotypes of P. contorta do not compete against each other in any meaningful sense, but do face competition from other trees. With stabilizing selection at the species level, the level of selection is the species. With highly variable annuals that exhibit negligible stabilizing selection, the level of selection is all the way down at the lowest reproducing unit, the meristem. I’m trying to understand how all that can be simplified down to a fitness trajectory along a mean with superimposed variance.
      Just rambling here…

    • If the Orr example is proposed as a demonstration of useful mathematical theory, then I’m sticking strongly with the empiricists, that’s for sure. The math is confused and its supposed conclusions are un-testable except in the most extremely simplified, unrepresentative conditions. Damn waste of time is what it is.
