This week’s puzzler comes to us from John Storthwaite in Stonyfield, Minnesota, who has been wondering why there are so many trees blocking his view of the rocks up there.

Suppose you have been given the following problem. A number of objects are located in some given area, say trees in a forest, and one wishes to estimate their density D (number per unit area). Distance-based sampling involves estimating D by averaging a sample of squared point-to-object distances (d), for objects of known *integer rank* distance (r) from the point. The distances are squared because one is converting from a one-dimensional measurement (distance) to a two-dimensional variable (objects per unit area).
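For anyone who wants to poke at this numerically, here is a minimal sketch of the idea in Python. It assumes the trees are scattered completely at random, in which case the mean squared distance to the r-th closest tree is about r / (pi * D). The function names and the simple estimator r / (pi * mean squared distance) are illustrative assumptions, not necessarily what a field ecologist would actually use.

```python
import math
import random

def rth_squared_distance(n_points, rank, rng):
    """Squared distance from the origin to the rank-th closest of n_points
    scattered uniformly at random over the square [-1, 1] x [-1, 1]."""
    squared = sorted(rng.uniform(-1, 1) ** 2 + rng.uniform(-1, 1) ** 2
                     for _ in range(n_points))
    return squared[rank - 1]

def estimate_density(n_points=200, rank=3, trials=2000, seed=1):
    """Average many squared distances, then invert the (assumed) relation
    mean squared distance = rank / (pi * D) to get a density estimate."""
    rng = random.Random(seed)
    mean_sq = sum(rth_squared_distance(n_points, rank, rng)
                  for _ in range(trials)) / trials
    return rank / (math.pi * mean_sq)

true_density = 200 / 4.0          # 200 trees on a square of area 4
print("true density:     ", true_density)
print("estimated density:", round(estimate_density(), 1))
```

The sample point sits in the middle of the square, so with a couple of hundred trees the r-th closest one is essentially never near the edge and boundary effects can be ignored in this sketch.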

So here’s the puzzler. If you run a line through this arbitrary point, and choose the closest objects (r = 1) on each side of it, what will be the ratio of the squared distances of the two objects and how would you solve this, analytically? Would they be about the same distance away? If not, would there be a predictable relationship between them? The problem can be extended to any number of lines passing through said point, just with correspondingly more pairs of distances to evaluate.

The first questions one should ask here are clear: (1) “*Why on earth would anybody want to do that?*” and (2) “*Is that the type of thing you clowns spend your time on?*”. We have answers for those questions. Not necessarily satisfactory answers, but answers nonetheless. Giving an answer, that’s the important thing in life. So, if you know the answer, write it on the back of a $100 bill and send it to…

Anyway, there are two possible solutions here. The first one comes readily if one realizes that the densities within sectors must each be about the same as the overall density, since we assume a homogeneous overall density. But, for a given value of r, the squared distances in each of the two sectors must be, on average, about twice those for the collection of trees overall, because there are only half as many trees in each sector as there are overall. So, e.g., the r = 5th closest trees *within each half* are, on average, 2X the squared distance of the r = 5th closest tree *overall*.
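The 2X claim is easy to check by brute force. The sketch below is only an illustration under the assumption of a completely random tree pattern. It drops a vertical line through the sample point, then compares the squared distance to the r-th closest tree on one side of the line with the squared distance to the r-th closest tree overall. The function name and parameter values are made up for the example.

```python
import random

def mean_squared_distances(rank=5, n_points=400, trials=1000, seed=2):
    """Return (mean squared distance to the rank-th closest tree overall,
    mean squared distance to the rank-th closest tree right of the line x = 0)."""
    rng = random.Random(seed)
    overall_sum = 0.0
    half_sum = 0.0
    for _ in range(trials):
        # Random trees on the square [-1, 1] x [-1, 1]; sample point at the origin.
        trees = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_points)]
        overall = sorted(x * x + y * y for x, y in trees)
        right_half = sorted(x * x + y * y for x, y in trees if x > 0)
        overall_sum += overall[rank - 1]
        half_sum += right_half[rank - 1]
    return overall_sum / trials, half_sum / trials

overall_mean, half_mean = mean_squared_distances()
print("ratio of half-plane to overall mean squared distance:",
      round(half_mean / overall_mean, 2))
```

With these settings the printed ratio should land close to 2, give or take a few percent of simulation noise.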

Knowing this, the relationship between the two r = 1 trees (label them r1.1 and r1.2 having squared distances d1.1 and d1.2) in the two sectors becomes clear. Since one of the two trees (r1.1) must necessarily be the r = 1 tree *overall*, and the mean squared distance of the two trees must be 2X that of the r = 1 tree, this translates to:

*2(d1.1) = (d1.1 + d1.2)/2,* and thus,

*d1.2 = 3(d1.1),*

i.e., the farther member of the pair will, on average, be three times the squared distance of the nearer one. This result can be confirmed by an entirely independent method involving asymptotic binomial/multinomial probability. That exercise is left, as they say in the ultimate cop-out, to the reader.
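For readers who would rather let a computer do the cop-out, here is a rough simulation check, again assuming a completely random tree pattern. It finds the closest tree on each side of a line through the sample point, then averages the nearer and the farther of the two squared distances separately. Note that the factor of three is a ratio of those two averages. Averaging the individual far/near ratios is a different (and much less stable) calculation. The function name and settings are hypothetical.

```python
import random

def nearer_and_farther(n_points=200, trials=4000, seed=3):
    """Return (mean squared distance of the nearer r = 1 tree,
    mean squared distance of the farther r = 1 tree) for a line through
    the origin splitting the plane into x > 0 and x <= 0."""
    rng = random.Random(seed)
    near_sum = 0.0
    far_sum = 0.0
    for _ in range(trials):
        left = right = float("inf")
        for _ in range(n_points):
            x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
            d2 = x * x + y * y
            # Track the closest tree on each side of the line.
            if x > 0:
                right = min(right, d2)
            else:
                left = min(left, d2)
        near_sum += min(left, right)
        far_sum += max(left, right)
    return near_sum / trials, far_sum / trials

near_mean, far_mean = nearer_and_farther()
print("mean d1.2 / mean d1.1:", round(far_mean / near_mean, 2))
```

The printed value should hover around 3, which agrees with the back-of-the-envelope argument above.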

This work has highly important implications with respect to cancer research, and for solutions to poverty, malnutrition, and climate change. It can also help one discern if tree samplers 150-200 years ago were often sampling the closest trees or not.

Funding for this work was provided by the Doris Duke Foundation, the Society for American Baseball Research, the American Bean and Tree Counters Society, the Society for Measuring Things Across From Other Things, and the Philosophy Department at the University of Hullaballo. All rights reserved, all obligations denied. Any re-use, re-broadcast, retransmission, regurgitation or other use of the accounts and descriptions herein, without the express written consent of the closest random stranger on the street, or the closest random stranger on the other side of said street, is strictly prohibited.

Thanks Jim, that was a fun mental exercise. I have a general question about the nature of using “ranks,” because I’ve heard mixed opinions on using them. When do you think it is appropriate and inappropriate to use ranking in analyses? This example, for instance, is predicated on the idea that we know the ranking and/or it would be simple to obtain. There are many cases in which this is feasible and many that are less so (e.g., something that is not dense or easily observable, such as a grass species in a meadow).

Hi seeddispersal. Are you referring to the use of ranks in the estimation of density? I’m a little confused because there’s no way to estimate density from distances if you don’t know the rank order of the objects measured to. You need both.

Yes, I was referring to ranks in density estimation. I guess that I just did not realize that density cannot be estimated without ranks! I have never tried to estimate density, but geometrically, it seems like there would be many ways. Nevertheless, thanks for the post!

No problem sd. The topic’s primarily of interest to forest ecologists who have to think about these things because it’s typically much more efficient to do distance-based sampling than plot-based when dealing with trees. And there are indeed many ways, but one always has to know the ranks.

Another interesting point is that if one chooses the 2nd closest object, instead of the 1st, on each side of a line, the estimated density is 3X, not 2X, as great.

Dear Bruce:

Your acknowledgements section above is so excellent we’ve moved to nominate you for a Globby (Global Blog Emmy thingy). If your entry is successful you’ll win an all expense paid trip to Australia to receive the award in person from Bruce, Bruce and the other Bruce. I’m told Allman Brothers CDs are available during the flight. They really watch out for your comfort.

Yours,

Bruce

Bruce you don’t seem to understand that I’m a very rich man and fly my personal jet to Oz routinely just to get some decent beer and have a bit of a walkabout through the outback.