Well, we’re long overdue for another installment of “Ask the Self-Appointed Experts”, or at least for the question part. In today’s edition a follower from Two Forks, Montana wrestles with the following conundrum, inviting others to help, or at least to reassure him that he is not confused alone. He writes:
I know this issue is all over AM talk radio but the inmates pretty clearly run the asylum there and I’m more confused on the following issue than ever.
It is well known that, given a known rate process, the gamma distribution defines the expected values from random starting points to the nth “nearest” object, and so, inversely, we can estimate unknown rates from such values. For example, in a forest of randomly distributed trees, the circular area, a, defined by the distance to the nth closest tree, will estimate tree density. But as Skellam (1952), Moore (1954) and Pollard (1971) showed analytically, these estimates are biased, by an amount that shrinks as n grows: specifically, by a factor of n/(n-1) for n > 1. Thus, the distance to, say, the 2nd closest tree will correspond to the area represented by one tree, not two. All very well and good.
Now, the mean of the integration of the gamma distribution from 0 to 1, for a known rate, should return the mean area a, but when I closely approximate the integral (in R, which can’t integrate), I seem to get bias-corrected values reflecting the rates, rather than the biased values reflecting the areas (a) to the nth object. I’m flummoxed and not a little aggravated. Do they know what they’re doing there at R headquarters, or is it me that’s got all turned about the wrong way? If I can’t even trust the values from the statistical distributions in R, then just what can I trust there? I tried taking my mind off the matter by following the Winnipeg Jets (WJPCA, 2015), but man, one can just take only so much of that and I sure as hell ain’t going to follow Edmonton. The ice fishing seems to help, at least until the alcohol wears off, but really there should be an 800 number I think. If you can put me onto some clues I would be most grateful.
My R code and results (for the n = 2 object):
probs = seq(from = 0.000001, to = 0.999999, by = 0.000001) # evenly spaced probability steps
mean.area = mean(qgamma(p = probs, shape = 2, rate = 1, lower.tail = TRUE)) # approximate the pdf integral, as the mean of the sampled distribution
1.999993
1.999993, WTF !??!
Skellam, J.G. 1952. Studies in statistical ecology: I. Spatial pattern. Biometrika 39:346-362.
Moore, P.G. 1954. Spacing in plant populations. Ecology 35:222-227.
Pollard, J.H. 1971. On distance estimators of density in randomly distributed forests. Biometrics 27:991-1002.
WJPCA, 2015. Your 2015-2016 Winnipeg Jets: Fast, Furious and Fun. Winnipeg Jets Promotional Coordinating Association.
Stewart Stansbury, Two Forks MT (“not quite Canada but roughly as empty”), USA
p.s. Jets picture enclosed
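For anyone inclined to poke at Stewart's numbers before phoning the 800 number, here is a minimal sketch in R. It is not his method and makes no claim to settle the matter. It assumes the area a to the nth nearest tree follows a gamma distribution with shape n and rate lambda (the true density), and it uses R's built-in integrate() to compare three quantities: the mean area, the mean of the naive density estimate n/a, and the mean of the adjusted estimate (n-1)/a. The helper name gamma.summary and the second rate of 0.5 are illustrative assumptions, not anything from the letter.

```r
# Sketch only: assumes the area a to the n-th nearest tree follows
# Gamma(shape = n, rate = lambda), with lambda the true tree density.
# gamma.summary() is a hypothetical helper, not part of the letter.
gamma.summary <- function(n, lambda) {
  # Expected value of g(a) under the gamma density, by numerical integration
  expect <- function(g) {
    integrate(function(a) g(a) * dgamma(a, shape = n, rate = lambda),
              lower = 0, upper = Inf)$value
  }
  c(mean.area         = expect(function(a) a),            # mean of a
    naive.density     = expect(function(a) n / a),        # mean of n / a
    corrected.density = expect(function(a) (n - 1) / a))  # mean of (n - 1) / a
}

res.1    <- gamma.summary(n = 2, lambda = 1)    # the letter's case
res.half <- gamma.summary(n = 2, lambda = 0.5)  # a second rate, for contrast
print(round(rbind(res.1, res.half), 4))
```

Analytically these three quantities are n/lambda, lambda*n/(n-1) and lambda. With n = 2 and a rate of 1, the first two both equal 2, so the mean area and the biased density estimate are easy to mistake for one another. At a rate of 0.5 they separate into 4 and 1, with the adjusted estimate at 0.5.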