I am the ride

I awoke and someone spoke–they asked me in a whisper
If all my dreams and visions had been answered
I don’t know what to say–I never even pray
I just feel the pulse of universal dancers
They’ll waltz me till I die and never tell me why–
I’ve never stopped to ask them where we’re going
But the holy, the profane, they’re all helplessly insane
Wishful, hopeful, never really knowing

They asked if I believe, and do the angels really breathe?
Or is it all a comforting invention?
It’s just like gravity I said–it’s not a product of my head
It doesn’t speak but nonetheless commands attention
I don’t care what it means, or who decorates the scenes
The problem is more with my sense of pride
It keeps me thinking me, instead of what it means to be
But I’m not a passenger, I am the ride
I’m not a passenger, I am the ride

I Am The Ride, Chris Smither

Nobody knows

Nobody knows about what’s going on
With the wood and the steel, the flesh and the bone
The river keeps flowing and the grass still grows
And the spirit keeps going, but nobody knows

Poets they come and the poets they go
Politicians and preachers–they all claim to know
Words that are written and the melodies played
As the years turn their pages, they all start to fade

The ocean still moves with the moon in the sky
The grass still grows on the hillside
Got to believe in believin’
Got to believe in a dream
Freedom is ever deceiving
Never turning out to be what it seems

It’s amazing how fast our lives go by
Like a flash of lightning, like the blink of an eye
We all fall in love as we fall into life
We look for the truth on the edge of the night
Heavens turn ’round and the river still flows
How the spirit keeps going, nobody knows

Nobody Knows, Gregg Allman, Allman Brothers
(Chords here)

What’s his name again?

Yeah, it happens when the money comes:
The wild and poor get pushed aside
It happens when the money comes

Buyers come from out of state
They raise the rent and you can’t buy
Buyers come from out of state and raise the rent

“Buy low, sell high, you get rich!”
You still die
Money talks and people jump
Ask “How high?”
Low-life Donald…what’s-his-name?

And who cares?
I don’t want to know what his wife
Does or doesn’t wear
It’s a shame the people at work
Want to hear about this kind of jerk

I walk where the bottles break
And the blacktop comes on back for more
I walk where the bottles break
And the blacktop comes on back

I live where the neighbors yell
And their music comes up through the floor
I live where the neighbors yell
And their music wakes me up

Where the bottles break, John Gorka, 1991

SABR-toothed

Well they’ve been running around on the flat expanses of the early Holocene lake bed with impressively large machines, whacking down and gathering the soybeans and corn. This puts dirt clods on the roads that cause one on a road bike at dusk to weave and swear, but I digress. The Farmer’s Almanac indicates that it must therefore be about World Series time, which in turn is just about guaranteed to initiate various comments regarding the role of luck, good or bad, in deciding important baseball game outcomes.

There are several important things to be blurted out on this important topic, and with the Series at its climax and the leaves a-fallin’, now’s the time, the time is now.

It was Bill James, the baseball “sabermetric” grandpa and chief guru, who came up with the basic idea some time ago, though not, I think, with the questionable terminology applied to it, which I believe came later from certain disciples who knelt at his feet.

The basic idea starts off well enough but from there goes into a kind of low-key downhill slide, not unlike the truck that you didn’t bother setting the park brake for because you thought the street grade was flat but found out otherwise a few feet down the sidewalk. At which point you also discover that the bumper height of said truck does not necessarily match that of a Mercedes.

The concept applies not just to baseball but to anything involving integer scores. The basic idea is as follows (see here). Your team plays 162 baseball games, 25 soccer matches or whatever, and of course you keep score of each. You then compute the fraction S^x/(S^x + A^x), where, using the baseball case, S = runs scored, A = runs allowed and x = an exponent that varies depending on the data used (i.e. the teams and years used). You do this for each team in the league and also compute each team’s winning percentage (WP = W/G, where W = number of wins and G = games played in the season(s)). A nonlinear regression/optimization returns the optimal value of x, given the data. The resulting fraction is known as the “pythagorean expectation” of winning percentage, which claims to tell us how many games a given team “should” have won and lost over that time, given its total runs scored and allowed.
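
For concreteness, here’s a minimal R sketch of that computation. The team totals and winning percentages below are invented for illustration, not real data:

# Minimal sketch of the pythagorean expectation; S, A and WP are made up.
set.seed(1)
n.teams = 30; G = 162
S = round(rnorm(n.teams, mean=700, sd=60))         # total runs scored per team (hypothetical)
A = round(rnorm(n.teams, mean=700, sd=60))         # total runs allowed per team (hypothetical)
pythag = function(S, A, x) S^x / (S^x + A^x)       # the "pythagorean expectation"
WP = pythag(S, A, 1.8) + rnorm(n.teams, 0, 0.02)   # fake winning percentages, built around x = 1.8

# nonlinear least squares to recover the optimal exponent x from the data
fit = nls(WP ~ pythag(S, A, x), start=list(x=2))
coef(fit)                                          # fitted exponent, ~1.8 for these fake data
round(pythag(S, A, coef(fit)), 3)[1:5]             # first five teams' "expected" winning pct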

Note first that the value of x depends on the data used: the relationship is entirely empirically derived, and exponents ranging from (at least) 1.8 to 2.0 have resulted. There is no statistical theory here whatsoever, and in no description of “the pythag” have I ever seen any mention of such. This is a shame because (1) there can and should be, and (2) it seems likely that most “sabermetricians” don’t have any idea how or why. Maybe not all, but I haven’t seen any discuss the matter. Specifically, this is a classic case for application of Poisson-derived expectations.

However, the lack of theory is one point here, but not really the main one. More at issue are the highly questionable interpretations of the causes of observed deviations from pythag expectations, which is where the rolling truck smashes out the grill and lights of the Mercedes.

You should base an analysis like this on the Poisson distribution for at least two very strong reasons. First, interpretations of the pythag always involve random chance. That is, the underlying view is that departures of a given team’s won-loss record from pythag expectation are always attributed to the action of randomness–random chance. Great, if you want to go down that road, that’s exactly what the Poisson distribution is designed to address. Second, it will give you additional information regarding the role of chance that you cannot get from “the pythag”.

Indeed, the Poisson gives the expected distribution of integer-valued data around a known mean, under the assumption that random deviations from that mean are solely the result of sampling error, which in turn results from the complete randomness of the objects (the analogue of Complete Spatial Randomness, CSR), relative to the mean value and the size of the sampling frame. In our context, the sampling frame is a single game and the objects of analysis are the runs scored, and allowed, in each game. The point is that the Poisson is inherently designed to test just exactly what the SABR-toothers are wanting to test. But they don’t use it–they instead opt for the fully ad-hoc pythag estimator (or slight variations thereof). Always.

So, you’ve got a team’s total runs scored and allowed over its season. You divide each by the number of games played to get the mean of each. That’s all you need–the Poisson is a single-parameter distribution, the variance being equal to the mean. Now you use that computer in front of you for what it’s really ideal at–doing a whole bunch of calculations really fast–to simply draw from the runs scored, and runs allowed, distributions, randomly, say 100,000 times or whatever, to estimate your team’s expected won-loss record under a fully random scoring process. But you can also do more–you can test whether either the runs scored or allowed distribution fits the Poisson very well, using a chi-square goodness-of-fit test. And that’s important because it tells you, basically, whether or not they are homogeneous random processes–processes in which the data generating mechanism is unchanging through the season. In sports terms: it tells you the degree to which the team’s performance over the year, offensive and defensive, came from the same basic conditions (i.e. unchanging team performance quality/ability).
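
Here’s a minimal sketch of what I mean, in R. The per-game means (4.6 runs scored, 4.3 allowed per game) are made up for illustration, and the “observed” per-game scores are themselves simulated stand-ins for a real team’s game log:

# Sketch: expected won-loss record and Poisson goodness-of-fit, for one hypothetical team.
set.seed(2)
G = 162
runs.scored  = rpois(G, 4.6)            # stand-in for a team's actual per-game runs scored
runs.allowed = rpois(G, 4.3)            # stand-in for per-game runs allowed
mn.S = mean(runs.scored); mn.A = mean(runs.allowed)

# expected winning percentage under fully random (Poisson) scoring; tied draws simply omitted
n.sim = 1e5
sim.S = rpois(n.sim, mn.S); sim.A = rpois(n.sim, mn.A)
(exp.WP = sum(sim.S > sim.A) / sum(sim.S != sim.A))

# chi-square goodness-of-fit of the observed runs-scored distribution against the Poisson
k = 0:max(runs.scored)
obs.counts = as.vector(table(factor(runs.scored, levels=k)))
exp.probs = dpois(k, mn.S); exp.probs = exp.probs/sum(exp.probs)   # renormalize over 0..max
chisq.test(obs.counts, p=exp.probs, simulate.p.value=TRUE)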

The biggest issue remains however–interpretation. I don’t know how it all got started, but somewhere, somebody decided that a positive departure from “the pythag” (more wins than expected) equated to “good luck” and negative departures to “bad luck”. Luck being the operative word here. Actually I do know the origin–it’s a straightforward conclusion from attributing all deviations from expectation to “chance”. The problem is that many of these deviations are not in fact due to chance, and if you analyze the data using the Poisson as described above, you will have evidence of when that is, and is not, the case.

For example, a team that wins more close games than it “should”, games won by say just one or two runs, while getting badly smoked in a small subset of other games, will appear to benefit from “good luck”, according to the pythag approach. But using the Poisson approach, you can identify whether or not a team’s basic quality likely changed at various times during the season. Furthermore, you can also examine whether the joint distribution of events (runs scored, runs allowed) follows random expectation, given their individual distributions. If it does not, then you know that some non-random process is going on. For example, a team that wins (or loses) more than its expected share of close games most likely has some ability to win (or lose) close games–something about the way the team plays explains it, not random chance. There are many particular explanations, in terms of team skill and strategy, that can account for such results, and more specific data on a team’s players’ performance can lend evidence to the various possibilities.
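
Continuing the sketch above, one crude way to check the close-game claim is to compare the hypothetical team’s record in games decided by one or two runs against what independent Poisson scoring predicts:

# Uses runs.scored, runs.allowed, sim.S and sim.A from the previous sketch.
margin = runs.scored - runs.allowed
close = abs(margin) %in% 1:2                        # games decided by 1 or 2 runs
obs.close.wins = sum(margin > 0 & close)

sim.margin = sim.S - sim.A
p.close.win = sum(sim.margin > 0 & abs(sim.margin) %in% 1:2) /
              sum(abs(sim.margin) %in% 1:2)         # win prob. in close games if scoring were random
binom.test(obs.close.wins, sum(close), p=p.close.win)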

So, the whole “luck” explanation that certain elements of the sabermetric crowd are quite fond of and have accepted as the Gospel of James, may be quite suspect at best, or outright wrong. I should add however that if the Indians win the series, it’s skill all the way while if the Cubs win it’ll most likely be due to luck.

The same thing that I want today…

Well I’m sailing away my own true love
I’m sailing away in the morning
Is there somethin’ I can send you from across the sea
From the place that I’ll be landin’

No there’s nothin’ you can send me, my own true love
There’s nothin’ I’m wishing to be ownin’
Just carry yourself back to me unspoiled
From across that lonesome ocean

Oh but I just thought you might like somethin’ fine
Made of silver, or of golden
From the mountains of Madrid
Or from the coast of Barcelona

Well if I had the stars of the darkest night
And the diamonds from the deepest ocean
I’d forsake them all for your sweet kiss
For it’s all I’m wishin’ to be ownin’

And I might be gone a long, long time
And it’s only that I’m askin’
Is there somethin’ I can send you to remember me by
To make your time more easy a-passin’

How can, how can you ask me again?
It only a-brings me sorrow
The same thing that I want today
I’ll want again tomorrow

Oh and I got a letter on a lonesome day
It was from her ship a-sailin’
Sayin’ I don’t know when I’ll be coming back again–
It depends on how I’m feelin’

Well if you my love must think that a-way
I’m sure your mind is a-roamin’
I’m sure your thoughts are not ’bout me
But with the country where you’re goin’

So take heed, take heed of the western wind
Take heed of stormy weather
And yes, there’s something you can send back to me:
Spanish boots of Spanish leather

Boots of Spanish Leather, Bob Dylan

Natural selection, genetic fitness and the role of math–part two

I’ve been thinking some more about this issue–the idea that selection should tend to favor those genotypes with the smallest temporal variations in fitness, for a given mean fitness value (above 1.00). It’s taken some time to work through this and get a grip on what’s going on and some additional points have emerged.

The first point is that although I surely don’t know the entire history, the idea appears to be strictly mathematically derived, from modeling: theoretical. At least, that’s how it appears from the several descriptions that I’ve read, including Orr’s, and this one. These all discuss mathematics–geometric and arithmetic means, absolute and relative fitness, etc., making no mention of any empirical origins.

The reason should be evident from Orr’s experimental description, in which he sets up ultra-simplified conditions in which the several other important factors that can alter genotype frequencies over generations are made unvarying. The point is that in a real-world experimental test you would also have to control for these things, either experimentally or statistically, and that would not be easy. It’s hard to see why anybody would go to such trouble if the theory weren’t there to suggest the possibility in the first place. There is much more to say on the issue of empirical evidence. Given that it’s an accepted idea, and that testing it as the generalization it claims to be is difficult, the theoretical foundation had better be very solid. Well, I can readily conceive of two strictly theoretically-based reasons why the idea might well be suspect. For time’s sake, I’ll focus on just one of those here.

The underlying basis of the argument is that, if a growth rate (interest rate, absolute fitness, whatever) is perfectly constant over time, the product of the series gives the total change at the final time point, but if it is made non-constant, by varying it around that rate, then the final value–and thus the geometric mean–will decline. The larger the variance around the rate, the greater the decline. For example, suppose a 2% increase of quantity A(0) per unit time interval, that is, F = 1.020. Measuring time in generations (g) here, after g = 35 generations, A(35) = A(0)·F^g = A(0)·1.020^35 ≈ 2·A(0); A doubles in 35 generations. The geometric (and arithmetic) mean over the 35 generations is 1.020, because all the per-generation rates are identical. Now cause F to instead vary around 1.02 by setting it as the mean of a normal distribution with some arbitrarily chosen standard deviation, say 0.2. The geometric mean of the series will then drop (on average, asymptotically) to just below 1.0 (~ 0.9993). Since the geometric mean is what matters, genotype A will then not increase at all–it will instead stay about the same.

pstep = 0.00001; probs = seq(pstep, 1-pstep, pstep)   # fine grid of cumulative probabilities
q = qnorm(p=probs, mean=1.02, sd=0.2)                 # corresponding fitness values: normal, mean 1.02, sd 0.2
gm = exp(mean(log(q))); gm                            # geometric mean of the series, ~0.9993

This is a very informative result. Using and extending it, now imagine an idealized population with two genotypes, A and B, in a temporally unvarying selection environment, with equal starting frequencies, A = B = 0.50. Since the environment doesn’t vary, there is no selection on either, that is F.A = F.B = 1.0 and they will thus maintain equal relative frequencies over time. Now impose a varying selection environment where sometimes conditions favor survival of A, other times B. We would then repeat the above exercise, except that now the mean of the distribution we construct is 1.000, not 1.020. The resulting geometric mean fitness of each genotype is now 0.9788 (just replace 1.02 with 1.00 in the above code).
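
For the record, here is that same calculation with the mean set to 1.00 (using the probs vector from the code above), plus the implied 35-generation decline:

q0 = qnorm(p=probs, mean=1.00, sd=0.2)    # same variance, but arithmetic mean fitness = 1.00
(gm0 = exp(mean(log(q0))))                # geometric mean fitness, ~0.9788
gm0^35                                    # fraction of starting abundance after 35 generations, ~0.47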

So what’s going to happen? Extinction, that’s what. After 35 generations, each will be down to 0.9788^35 = 0.473 of its starting value, on average, and on the way to zero. The generalization is that any population having genotypes of ~ equal arithmetic mean (absolute) fitness and normally distributed values around that mean will have all genotypes driven to extinction, and at a rate proportional to the magnitude of the variance. If instead one genotype has an arithmetic mean fitness above a threshold value determined by its mean and variance (i.e. high enough that its geometric mean fitness exceeds 1.00), while all others are below it, then the former will be driven to fixation and the latter to extinction. These results are not tenable–this is decidedly not what we see in nature. We instead see lots of genetic variation, including vast amounts maintained over vast expanses of time. I grant that this is a fairly rough and crude test of the idea, but not an unreasonable one. Note that this also points up the potentially serious problem caused by using relative, instead of absolute, fitness, but I won’t get into that now.

Extinction of course happens in nature all the time, but what we observe in nature is the result of successful selection–populations and species that survived. We know, without question, that environments vary–wildly, in any and all aspects, at all scales, often. And we also know without question that selection certainly can and does sort out the more and less fit genotypes in those environments. Those processes are all operating, yet we don’t observe a world in which alleles are simply either eliminated or fixed. The above examples therefore cannot be accurate mathematical descriptions of a surviving species’ variation in fitness over time–something’s wrong.

The “something wrong” is the designation of normally distributed variation, or more exactly, symmetrically distributed variation. To keep a geometric mean from departing from its no-variance value (call it m), one must skew the distribution around that value, such that the values pair off multiplicatively: each value m·r above it must be matched by a value m/r below it, so that every pair’s product is m²–that is the only way to create a stable geometric mean while varying the individual values. Mathematically, the way to do so is to work with the logarithms of the original values–the log of the geometric mean is designated as the mean of normally distributed logarithms of the individual values, of whatever size variance one wants. Exponentiation of the sum of the logarithms will equal the product of the fitness series.
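
A quick R illustration of that construction, again reusing the probs vector from above (the target value 1.02 and the log-scale standard deviation 0.2 are arbitrary choices):

target = 1.02
log.F = qnorm(p=probs, mean=log(target), sd=0.2)   # normally distributed *logs* of fitness
F.vals = exp(log.F)                                # the fitness values themselves: skewed (lognormal)
exp(mean(log(F.vals)))                             # geometric mean stays at 1.02, whatever the sd
mean(F.vals)                                       # arithmetic mean is pulled above 1.02 (~1.04 here)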

Hopefully, what I’m driving at is emerging. If the variance structure must obey this mathematical necessity to preserve a genotype’s geometric mean fitness at 1.00, while still allowing the individual series values to vary…then why should we not expect the same to hold true when the geometric mean fitness is not equal to 1.00? I would argue that that’s exactly what we should expect, and that Gillespie’s original arguments–and Orr’s, and others’ summaries thereof–are not particularly defensible theoretical expectations of what is likely to be happening in nature. Specifically, the idea that the variance in fitness around an arithmetic mean should necessarily arise from symmetrically (normally) distributed values is questionable.

As alluded to above, there is (at least) a second theoretical argument as well, but I don’t have time to get into it now (nor for this one for that matter). Suffice it to say that it involves simultaneous temporal changes in total population size and selective environments. All this without even broaching the entire hornet’s nest of empirically testing the idea, a topic reviewed five years ago by Simons. For starters, it’s not clear to me just how conservative “bet hedging” could ever be distinguished from the effects of phenotypic plasticity.

References

Simons, A.M. (2011) Modes of response to environmental change and the elusive empirical evidence for bet hedging. Proceedings of the Royal Society B. doi:10.1098/rspb.2011.0176

Other references are linked to in the previous post.

On natural selection, genetic fitness and the role of math

I really am not quite sure what to make of this one.

Last week at the blog Dynamic Ecology it was argued that natural selection behaves like a “risk-averse” money investor. That is, assuming that fitness varies over time (due to e.g. changing environmental variables or other selective factors), natural selection favors situations in which the mean fitness is maximized while the variance is minimized. The idea is explained in this short paper by Orr (2007), whose goal was to explain previous findings (Gillespie, 1973) intuitively. This presumes that knowledge of investor behavior is commonplace, but for my money, an examination of the math details and assumptions is what’s really needed.

This conclusion seems entirely problematic to me.

Continue reading

The memo from above

Late last week a useful memo came down from the powers that be here at The Institute, one that I thought might prove informative regarding the inner workings of a powerful think tank, which The Institute most certainly is, in spades.

To: Personnel engaged in primarily predictive and related prognosticatory research
From: The PTB
Date: September 30, 2016

We wish, as always, to express our appreciation for the excellent, ongoing work that continues to move The Institute steadily forward, at roughly the cutting edge of science, or at least at the cutting edge of rough science. Accordingly, we take this opportunity to remind everyone of the basic tenets that have guided our various predictive activities in the past:

(1) Future events and event trajectories, notwithstanding our best efforts, continue to display an aggravating uncertainty, and it is remarkable just how easily this fact avoids taking up residence in our conscious minds.

(2) The future occupies a fairly large, and apparently non-diminishing, portion of the temporal spectrum.

(3) Given the above, it is incumbent upon us all to keep in mind the following:
(a) Phrasing article titles with undue certainty, given the actual knowledge of system behavior, while understandable from a science culture perspective, may be counter-productive in a larger context. Fortunately, many non-scientists tend to seize upon such titles and, lacking proper restraint, make them even worse, often proclaiming future event x to be a virtual certainty. Without the ability to re-direct attention to these exaggerations, often originating from the press and various activist groups, undue attention to our own excesses, for which we have no readily available excuse, could become noticeably more uncomfortable. This possibility is not in the best interest of either science or The Institute.

(b) Science doesn’t actually “prove” anything, proof being a rather archaic and overly harsh concept–a “bar too high” if you like. Rather, science is in the business of “suggesting” that certain things “may” happen somewhere “down the road”. Science, when you boil it right down to nails, is really nothing but a massive pile of suggestions of what might happen. The pile is the thing really and our goal is to contribute to it. Popper is entitled to his opinion but frankly, The Institute is not so arrogant as to assume the right of making judgments on this, that or the other members of said scientific pile.

(c) It is hoped that the relation of points (a) and (b) above do not require elaboration.

Sincerely,
The PTB

This is an excellent reminder and I have, personally, tacked this memo to the wall in front of my workstation, with intent to glance at it every now and then before tacking something else over top of it.

Recognition For Review

I just found out that the second annual Peer Review Week is well underway. There are several online articles on the topic, perhaps best found via Twitter searches using #RecognizeReview or #PeerRevWk16, or via links at the link above.

This year’s theme is Recognition For Review, and in that context it’s perfect timing, relative to a peer review post that I already had in mind. I don’t think there’s any question that the peer review process as a whole has very major problems, ones which greatly weaken the clarity, efficiency and reliability of the scientific process. These problems originate largely in the design of the review process, which in turn affects review execution. However, this reality doesn’t preclude the fact that thousands of people perform excellent review work, daily. And they’re not getting much credit for it either.

Some attention then, to one of the most interesting, important–and puzzling–reviews I’ve ever seen. Occasionally a paper comes out which is worth paying intense attention to, for reasons that go beyond just its technical content, and this is surely one in my opinion. The review and paper in question are publicly available at Atmospheric Chemistry and Physics (ACP). This was a long, involved review on a long, involved paper. If you have limited time to devote to this, go read Peter Thorne’s ICARUS article, a summary of his overall review experience.

The journal is one of a set of European Geosciences Union (EGU) journals that have gone to a completely open review process. The commenting process is online and open to anyone, although two or more official reviewers are also designated by the editor; these reviewers (unlike the volunteer commenters) may remain anonymous if they choose. For this open process alone the EGU deserves major recognition and gratitude, as it is arguably the single biggest step that can be taken to improve the peer review process. Everything has to be open.

There is a lot to say on this and I’ll start with the puzzling aspect of it. The article in question’s lead author is James Hansen, arguably still the most famous climate scientist in the world. Several of the reviews show that the article’s main claims are quite contentious, relative to the evidence and analysis presented, as summarized most completely by Thorne’s two reviews, the second of which–a phenomenal piece of review work–also summarizes Hansen et al’s responses (and non-responses) to the numerous reviewer comments, a job which presumably should really have fallen to the editor.

I’ve not yet worked all the way through everything, but you can’t read it and not wonder about some things. The authors didn’t have to submit their paper to an open review journal. So why did they? Did they assume the claims of the paper were largely non-contentious and it would thus slide smoothly through review? But given the clearly important claims, why not then submit to a highly prominent journal like Science or Nature for maximum attention and effect? Maybe they did, had it rejected and this was the second or third submission–I don’t know.

A second issue, one of several that did not sit at all well with Thorne, was the fact that Hansen et al. notified members of the press before submission, some of whom Thorne points out then treated it as if it were in fact a new peer reviewed paper, which it surely was not. When confronted on this point, Hansen was completely unapologetic, saying he would do the same thing again if given the chance, and giving as his reason the great importance of the findings to the world at large, future generations in particular. What? That response pretty well answers the question regarding his confidence in the main conclusions of the paper, and is disturbing in more than one way.

Thorne was also not at all pleased with Hansen’s flippant and/or non-responses to some of the review comments, and he took him severely to task for his general attitude, especially given the major weaknesses of the paper. The most important of the latter was the fact that there was no actual model connection between the proposed processes driving rapid ice sheet melt and the amount of fresh water flowing into the oceans to drive the rapid sea level rise that is the main claim of the paper. Rather, that flow was prescribed independently of the ice melt processes, in what amounted to a set of “what if” scenarios more or less independent of the model’s ice melt dynamics. More importantly, this crucial fact was not made clear and prominent: it had to be dug out by careful reading, and moreover, Hansen essentially denied that this was in fact the case.

There are major lessons here regarding the conduct of peer review, how scientists should behave (senior scientists in particular), and scientific methodology. Unfortunately, I have no more time to give this right now–and I would give it a LOT more if I did. This is thus largely a “make aware” post. The paper and its review comprise a case study in many respects, and require a significant commitment. I personally have not seen a more important paper review in a very long time, if ever. Peter Thorne, some of the other volunteer reviewers, and ACP deserve recognition for this work.

Please do not fire off any uninformed comments. Thanks.

Too many memories

I remember this town, with a girl by my side
And a love seldom found, in this day and time
And it gets melancholy, every now and again
When you let your mind go, and it drifts way back when
Now life plays its tricks, some cruel but fair
And even a fool can’t pretend they don’t care

When there’s too many memories for one heart to hold
Once a future so bright now seems so distant and cold
And the shadows grow long and your eyes look so old
When there’s too many memories for one heart to hold

There are those moments, and they just never fade
Like the look in her eyes and the way the light played
God moved in that moment, and the angels all cried
And they gave you a memory that you’ll have ’til you die
Now the lesson you learned, and you don’t dare forget
What makes you grow old is replacing hope with regret

And there’s too many memories for one heart to hold
Once a future so bright, now seems so distant and cold
And the shadows grow long, and your eyes look so old
When there’s too many memories for one heart to hold

The late Stephen Bruton, Too Many Memories
(Thanks to Mike Flynn for playing the Tom Rush cover of this last night on his great show, The Folk Sampler)

“Why the Americans Are More Addicted to Practical Than to Theoretical Science”

Those who cultivate the sciences among a democratic people are always afraid of losing their way in visionary speculation. They mistrust systems; they adhere closely to facts and the study of facts with their own senses. As they do not easily defer to the mere name of any fellow-man, they are never inclined to rest upon any man’s authority; but, on the contrary, they are unremitting in their efforts to point out the weaker points of their neighbor’s opinions. Scientific precedents have very little weight with them; they are never long detained by the subtlety of the schools, nor ready to accept big words for sterling coin; they penetrate, as far as they can, into the principal parts of the subject which engages them, and they expound them in the vernacular tongue. Scientific pursuits then follow a freer and a safer course, but a less lofty one.

Tocqueville Vol 2 frontis

The mind may, as it appears to me, divide science into three parts. The first comprises the most theoretical principles, and those more abstract notions, whose application is either unknown or very remote. The second is composed of those general truths, which still belong to pure theory, but lead nevertheless by a straight and short road to practical results. Methods of application and means of execution make up the third. Each of these different portions of science may be separately cultivated, although reason and experience show that none of them can prosper long, if it be absolutely cut off from the two others.

In America the purely practical part of science is admirably understood, and careful attention is paid to the theoretical portion which is immediately requisite to application. On this head the Americans always display a clear, free, original, and inventive power of mind. But hardly any one in the United States devotes himself to the essentially theoretical and abstract portion of human knowledge. In this respect the Americans carry to excess a tendency which is, I think, discernible, though in a less degree, among all democratic nations.

Continue reading

Does the Poisson scale up?

I often get obsessed with certain topics, especially statistical and mathematical ones. Lately I’ve been thinking about the Poisson distribution a lot, as it figures heavily in analyses of random populations, and thus also in assessing deviations therefrom, a topic that I’ve been heavily involved in for a while, relative to historic and current tree populations. I also find that, often, a specific question will arise, related to whatever it is I’m doing, that is best (and often most quickly) answered in the way that gives me the most confidence: by direct simulation, for which I use R.

The topic of varying scales of variation, and the information that can be obtained by analysis thereof, is a pretty interesting one IMO. When the generating processes for a phenomenon of interest are complex, or poorly understood for whatever reason, one can (and should!) obtain valuable information regarding likelihoods of various hypothetical process drivers, by multi-scale analysis–essentially, obtaining evidence for the scale(s) at which departures from randomness are greatest and using that information to suggest, or constrain, possible explanations. There are a number of ways to go about doing so.

The Poisson distribution is the appropriate descriptor of a homogeneous (constant) rate process whose individual event outcomes are random. An “under-dispersed” population at a particular scale of analysis will be more “regular” in its arrangement than expected by a random process, and it is clear that in such a situation there must necessarily also be under-dispersion at at least some other scales as well, both smaller and larger. To illustrate via an extreme example, suppose some location that gets 36 inches of precipitation (P) per year on average, distributed as exactly three inches per month, every month. The probability of such a result arising, when P varies randomly (Poisson) at any sub-monthly scale, is extremely low. It won’t occur over any extended period of time. The same principle holds, though muted, if there is some monthly variance around 3.0.
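
As a minimal sketch of just how low, assuming monthly totals behave as Poisson counts (integer inches) with mean 3:

# Probability that all 12 monthly totals land exactly on the mean of 3, under a Poisson(3) model
dpois(3, lambda=3)        # one month: ~0.224
dpois(3, lambda=3)^12     # all twelve months of a year: ~1.6e-08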

In an over-dispersed (“clustered”) population, again at some defined scale, the situation is different. Such a population will also be over-dispersed at smaller scales, but not necessarily at larger ones, at least not at the same intensity, and there should be some unknown scale at which the variation reduces to the Poisson. This means that a Poisson population is not necessarily Poisson at smaller scales, but it should be so at larger scales. That is, it should “scale up” according to Poisson expectation, i.e. with the same rate but a greater absolute number, and variance therein, per sample unit.

But does it? Or rather, what does R think about the matter?

Well, here’s what I get, using the example case of a mean annual P of 36″ and 100,000 simulated monthly, or weekly, sums obtained by randomly sampling the Poisson expectation at sub-interval (weekly or daily) time scales.

rm(list=ls())
options(digits=3)
# 1. Observed annual mean and corresponding scaled sub-interval means.  Year = 364 days, month = 28 days.
obs.ann.mn = 36			# observed annual mean, from record
(monthly.mn = obs.ann.mn/13)	# 13 months/yr! 
(weekly.mn = obs.ann.mn/52)
(daily.mn = obs.ann.mn/364)

# 2. Poisson CDF expectations, ignore slight variations in days:
 # Equal interval CDF probs determined by no. time intervals in a year, eases interpr.
 # Set CDF probs to correspond to the various time ints. to give temporal distribution 
 # NOTE that qpois, for cdf = 1.0, = Inf., so omit last interval
# Poisson
P.Pmonth = qpois(p=(1:12)/13, lambda = monthly.mn) 		# 13 mos
P.Pweek = qpois(p=(1:51)/52, lambda = weekly.mn)		# 52 weeks
P.Pday = qpois(p=(1:363)/364, lambda = daily.mn)		# 364 days
table (P.Pmonth); table(P.Pweek); table(P.Pday)

# 3. Simulations: do repeated samples taken from shorter time periods and summed, match Poisson/gamma expectations at longer periods?
n.trials = 1e5
P.month.week = rep(NA,n.trials)
 for (i in 1:n.trials) P.month.week[i] = sum(sample(P.Pweek, 4, replace=T))			# Exactly 4 weeks to our months
 q.P.month.week = as.vector(quantile(P.month.week, probs = (1:12)/13)); rm(P.month.week)
P.month.day = rep(NA,n.trials)
 for (i in 1:n.trials) P.month.day[i] = sum(sample(P.Pday, 28, replace=T))
 q.P.month.day = as.vector(quantile(P.month.day, probs = (1:12)/13)); rm(P.month.day)
P.week.day = rep(NA,n.trials)
 for (i in 1:n.trials) P.week.day[i] = sum(sample(P.Pday, 7, replace=T))
 q.P.week.day = as.vector(quantile(P.week.day, probs = (1:51)/52)); rm(P.week.day)

mw = data.frame(table (P.Pmonth), table(q.P.month.week))[,-3]; colnames(mw)=c("Precip, monthly", "Poisson Expect.", "Aggr., weekly")
md = data.frame(table (P.Pmonth), table(q.P.month.day))[,-3]; colnames(md)=c("Precip, monthly", "Poisson Expect.", "Aggr., daily")
wd = data.frame(table (P.Pweek), table(q.P.week.day))[,-3]; colnames(wd)=c("Precip, weekly", "Poisson Expect.", "Aggr., daily")
mw; md; wd

Answer: Yes, it does exactly.*

Precip, monthly 	Poisson Exp. 	Aggr., weekly
               1               3             3
               2               3             3
               3               3             3
               4               2             2
               5               1             1
Precip, monthly 	Poisson Exp.	Aggr., daily
               1               3            3
               2               3            3
               3               3            3
               4               2            2
               5               1            1
Precip, weekly 		Poisson Exp.	 Aggr., daily
               0              26           26
               1              18           18
               2               6            6
               3               1            1

*However, I also evaluated gamma expectations, mainly as a check and/or curiosity (the gamma interpolates between the Poisson’s integer values). I didn’t always get the exact correspondence I expected, and I’m not really sure why. Close, but not close enough to be due to rounding errors, so that’s kind of interesting, but not enough to pursue further.

Funding for this post was provided in equal part by the French Fish Ectoderm and Statistics Foundation and the American Association for Advancement of Amalgamated and Aggregated Associations. These organizations are solely responsible for any errors herein, and associated management related consequences.

Twitter science

Discussing science on the internet can be interesting at times, even on Twitter, which seems to have been designed specifically to foster misunderstanding by way of brevity. Here are two examples from my week.

Early in the week, Brian Brettschneider, a climatologist in Alaska, put up a global map of monthly precipitation variability:
Brettschneider map
Brian said the metric graphed constitutes the percentiles of a chi-square goodness-of-fit test comparing average monthly precipitation (P) against uniform monthly P. I then made the point that he might consider using the Poisson distribution of monthly P as the reference departure point instead, as this was the more correct expectation of the “no variation” situation. Brian responded that there was no knowledge, or expectation, regarding the dispersion of the data upon which to base such a decision. That response made me think a bit, and I then realized that I was thinking of the issue in terms of variation in whatever driving processes lead to precipitation measured at monthly scales, whereas Brian was thinking strictly in terms of the observations themselves–the data as they are, without assumptions. So, my suggestion was only “correct” if one is thinking about the issue the way I was. Then, yes, the Poisson distribution around the overall monthly mean will describe the expected variation of a homogeneous, random process, sampled monthly. But Brian was right in that there is no necessary reason to assume, a priori, that this is in fact the process that generated the data in various locations.

The second interchange was more significant, and worrisome. The Green Party candidate for President, physician Jill Stein, stated “12.3M Americans could lose their homes due to a sea level rise of 9ft by 2050. 100% renewable energy by 2030 isn’t a choice, it’s a must.” This was followed by criticisms, not just from the expected group but also from some scientists and activists who are concerned about climate change. One of them, an academic paleoecologist, Jacquelyn Gill, stated “I’m a climate scientist and this exceeds even extreme estimates”, and later “This is NOT correct by even the most extreme estimates”. She later added some ad-hominem barbs such as “That wasn’t a scientist speaking, it was a lawyer” and “The point of Stein’s tweet was to court green voters with a cherry-picked figure”. And some other things that aren’t worth repeating really.

OK so what’s the problem here? Shouldn’t we be criticizing exaggerations of science claims when they appear in the mass culture? Sure, fine, to the extent that you are aware of them and have the time and expertise to do so. But that ain’t really the point here, which is instead something different and more problematic IMO. Bit of a worm can in fact.

Steve Bloom has been following the climate change debate for (at least) several years, and works as hard to keep up on the science as any non-scientist I’ve seen. He saw Gill’s tweets and responded that no, Stein’s statement did not really go so far beyond the extreme scientific estimates. He did not reference some poor or obsolete study by unknown authors from 25 years ago, but rather a long, wide-ranging study by James Hansen and others, only a few months old, one that went through an impressive and unique open review process (Peter Thorne was one of the reviewers, and critical of several major aspects of the paper; final review here, and summary of the overall review experience here). Their work does indeed place such a high rate of rise within the realm of defensible consideration, depending on glacier and ice sheet dynamics in Greenland and Antarctica, for which they incorporate into their modeling some recent findings on the issue. So, Jill Stein is not so off-the-wall in her comments after all, though she may have exaggerated slightly, and I don’t know where she got the “12.3M homes” figure.

The point is not that James Hansen is the infallible king of climate science, and therefore to be assumed correct. Hansen et al. might be right or they might be wrong, I don’t know. [If they’re right we’re in big trouble]. I wasn’t aware of the study until Steve’s tweeted link, and without question it will take some serious time and effort to work through the thing, even just to understand what they claim and how they got there, which is all I can expect to achieve. If I get to it at all, that is.

One point is that some weird process has developed, where all of a sudden a number of scientists sort of gang up on some politician or whatever who supposedly said some outrageous thing or other. It’s not scientist A criticizing public person B this week and then scientist C criticizing public person D the next week–it’s a rather predictable group all ganging up on one source, at once. To say the least, this is suspicious behavior, especially given the magnitude of the problems I see within science itself. I do wonder how much of this is driven by climate change “skeptics” complaining about the lack of criticisms of extreme statements in the past.

To me, the bigger problem is that these criticisms are rarely aimed at scientists, but rather at various public persons. Those people are not immune to criticism, far from it. But in many cases, and clearly in this one, things being claimed originate from scientists themselves, in publications, interviews or speeches. For the most part, people don’t just fabricate claims, they derive them from science sources (or what they consider to be such), though they certainly may exaggerate them. If you don’t think the idea of such a rapid rise is tenable, fine…then take Hansen et al. to the cleaners, not Jill Stein. But, unless you are intimately familiar with the several issues involving sea level rise rates, especially ice melt, then you’ve got some very long and serious work ahead of you before you’re in any position to do so. This stuff is not easy or simple and the authors are no beginners or lightweights.

The second issue involves the whole topic of consensus, which is a very weird phenomenon among certain climate scientists (not all, by any means). As expected, when I noted that Stein was indeed basically referencing Hansen et al., I was hit with the basic argument (paraphrased) “well they’re outside of the consensus (and/or IPCC) position, so the point remains”. Okay, aside from the issues of just exactly how this sacred consensus is to be defined anyway… yeah, let’s say they are outside of it, so what? The “consensus position” now takes authority over evidence and reasoning, modeling and statistics, newly acquired data etc., that is, over the set of tools we have for deciding which, of a various set of claims, is most likely correct? Good luck advancing science with that approach, especially in cases where questionable or outright wrong studies have formed at least part of the basis of your consensus. It’s remarkably similar to Bayesian philosophy–they’re going to force the results from prior studies to be admitted as evidence, like it or not, independent of any assessment of their relative worth. Scientific goulash.

And yes, such cases do indeed exist, even now–I work on a couple of them in ecology, and the whole endeavor of trying to clarify issues and correct bad work can be utterly maddening when you have to deal with that basic mindset.

“We live in a Chi-square society due to political correctness”

So, without getting into the reasons, I’m reading through the entry in the International Encyclopedia of Statistical Science on “Statistical Fallacies: Misconceptions and Myths”, written by one “Shlomo Sawilowsky, Professor, Wayne State University, Detroit MI, USA”. Within the entry, 20 such fallacies are each briefly described.

Sawilowsky introduces the topic by stating:

Compilations and illustrations of statistical fallacies, misconceptions, and myths abound…The statistical faux pas is appealing, intuitive, logical, and persuasive, but demonstrably false. They are uniformly presented based on authority and supported based on assertion…these errors spontaneously regenerate every few years, propagating in peer reviewed journal articles…and dissident literature. Some of the most egregious and grievous are noted below.

Great, let’s get after it then.

He then gets into his list, which proceeds through a set of +/- standard types of issues, including misunderstanding of the Central Limit Theorem, Type I errors, p values, effect sizes and etc. Up comes item 14:

14. Chi-square
(a) We live in a Chi-square society due to political correctness that dictates equality of outcome instead of equality of opportunity. The test of independence version of this statistic is accepted sans voire dire by many legal systems as the single most important arbiter of truth, justice, and salvation. It has been asserted that any statistical difference between (often even nonrandomly selected) samples of ethnicity, gender, or other demographic as compared with (often even inaccurate, incomplete, and outdated) census data is prima facie evidence of institutional racism, sexism, or other ism. A plaintiff allegation that is supportable by a significant Chi-square is often accepted by the court (judges and juries) praesumptio iuris et de iure. Similarly, the goodness of fit version of this statistic is also placed on an unwarranted pedestal.

Bingo Shlomo!!

Now this is exactly what I want from my encyclopedia entries: a strictly apolitical, logical description of the issue at hand. In fact, I hope to delve deep into other statistical writings of Dr. Sawilowsky to gain, hopefully, even better insights than this one.

Postscript: I’m not really bent out of shape on this, and would indeed read his works (especially this one: Sawilowsky, S. (2003) Deconstructing arguments from the case against hypothesis testing. J. Mod. Appl. Stat. Meth. 2(2):467-474). I can readily overlook ideologically driven examples like this to get at the substance I’m after, but I do wonder how a professional statistician worked that into an encyclopedia entry.

I note also that the supposed “screening fallacy” popular on certain blogs is not included in the list…and I’m not the least bit surprised.