On clustering, part two

In part one of what has quickly developed into an enthralling series, I made the points that (1) at least some important software doesn’t provide a probability value for cluster outputs, and (2) that while it’s possible to cluster any data set, univariate or multivariate, into clearly distinct groups, so doing doesn’t necessarily mean anything important. Such outputs only tell us something useful if there is some actual structure in the data, and the clustering algorithm can detect it.

But just what is “structure” in the data? The univariate case is simplest because with multivariate data, structure can have two different aspects. But in either situation we can take the standard statistical stance that structure is the detectable departure from random expectation, at some defined probability criterion (p value). The Poisson and gamma distributions define this expectation, the former for count (integer valued) data, and the latter for continuous data. By “expectation” I mean the expected distribution of values across the full data. If we have a calculated overall mean value, i.e. an overall rate, the Poisson and gamma then define this distribution, assuming each value is measured over an identical sampling interval. With the Poisson, the latter takes the form of a fixed denominator, whereas with the gamma it takes the form of a fixed numerator.

Using the familiar example of density (number of objects per unit space or time), the Poisson fixes the unit space while the integer number of objects in each unit varies, whereas the gamma fixes the integer rank of the objects that will be measured to from random starting points, with the distance to each such object (and corresponding area thereof) varying. The two approaches are just flip sides of the same coin really, but with very important practical considerations related to both data collection and mathematical bias. Without getting heavily into the data collection issue here, the Poisson approach–counting the number of objects in areas of pre-defined size–can get you into real trouble in terms of time efficiency (regardless of whether density tends to be low or high). This consideration is very important in opting for distance-based sampling and the use of the gamma distribution over area-based sampling and use of the Poisson.
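For the concretely minded, here is a minimal sketch of the two flip sides, assuming a one-dimensional "landscape" of randomly placed points (a homogeneous random process) with a known rate. The function names, the rate of 3 and the sample sizes are all made up for illustration. Counting points in fixed unit intervals should give a mean and variance both near the rate (the Poisson signature), while measuring from random starts to the 2nd point ahead should give distances averaging near 2/rate (the gamma signature).

```python
import random
from statistics import mean, pvariance

def simulate_counts(rate, n_units, rng):
    """Poisson view: fix the unit interval, count the points landing in each."""
    counts = [0] * n_units
    position = rng.expovariate(rate)       # gaps between random points are exponential
    while position < n_units:
        counts[int(position)] += 1
        position += rng.expovariate(rate)
    return counts

def simulate_nth_distances(rate, n, n_starts, rng):
    """Gamma view: fix the rank n, measure the distance from a start to the n-th point."""
    return [sum(rng.expovariate(rate) for _ in range(n)) for _ in range(n_starts)]

rng = random.Random(1)                     # fixed seed so the run is repeatable
rate = 3.0                                 # assumed density: 3 points per unit length
counts = simulate_counts(rate, 20000, rng)
dists = simulate_nth_distances(rate, 2, 20000, rng)

count_mean = mean(counts)                  # should be near rate
dispersion = pvariance(counts) / count_mean  # should be near 1 for random placement
dist_mean = mean(dists)                    # should be near n / rate = 2/3
```

A variance-to-mean ratio well away from 1 in the counts would be the first hint of departure from random expectation.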

But returning to the original problem as discussed in part one, the two standard clustering approaches–k-means and hierarchical–are always going to return groupings that are of low probability of random occurrence, no matter what “natural structure” in the data there may actually be. The solution, it seems to me, is to instead evaluate relative probabilities: the probability of the values within each group being Poisson or gamma distributed, relative to the probability of the overall distribution of values. In each case these probabilities are determined by a goodness-of-fit test, namely a Chi-square test for count (integer) data and a Kolmogorov-Smirnov test for continuous data. If there is in fact some natural structure in the data–that is, groups of values that are more similar (or dissimilar) to each other than expected under the Poisson or gamma–then this relative probability (or odds ratio if you like) will be maximized at the clustering solution that most closely reflects the actual structure in the data, this solution being defined by (1) the number of groups, and (2) the membership of each. It is a maximum likelihood approach to the problem.
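A rough sketch of just one building block of this, the Chi-square goodness-of-fit statistic for count data against a Poisson with the sample's own mean, might look like the following. This is not the full relative-probability scheme, only the piece it would be built from. The function name and the two toy data sets are hypothetical, a positive mean is assumed, and nothing here pools sparse bins as a careful test would. Turning the statistic into a probability needs the Chi-square reference distribution with the returned degrees of freedom (for example `scipy.stats.chi2.sf(stat, df)`), which is left out to keep the sketch to the standard library.

```python
import math
from collections import Counter

def poisson_chi_square(counts, max_bin=None):
    """Chi-square statistic and degrees of freedom for counts vs. a fitted Poisson.

    Bins are 0, 1, ..., max_bin - 1, plus one tail bin for values >= max_bin.
    Assumes the mean of counts is positive.
    """
    n = len(counts)
    m = sum(counts) / n                        # the single fitted Poisson parameter
    if max_bin is None:
        max_bin = max(counts)
    observed = Counter(min(c, max_bin) for c in counts)
    stat = 0.0
    cumulative = 0.0
    for k in range(max_bin + 1):
        if k < max_bin:
            p = math.exp(-m) * m ** k / math.factorial(k)
            cumulative += p
        else:
            p = 1.0 - cumulative               # tail bin: P(X >= max_bin)
        expected = n * p
        stat += (observed.get(k, 0) - expected) ** 2 / expected
    df = max_bin - 1                           # (max_bin + 1) bins - 1 - 1 fitted parameter
    return stat, df

# Toy data: one set built to sit close to Poisson expectation, one split into two lumps.
near_poisson = [0] * 37 + [1] * 37 + [2] * 18 + [3] * 6 + [4] * 2
two_lumps = [0] * 50 + [10] * 50

stat_random, df_random = poisson_chi_square(near_poisson)
stat_lumpy, df_lumpy = poisson_chi_square(two_lumps)
```

The lumpy set should return a vastly larger statistic than the near-Poisson one, which is the kind of contrast the relative probabilities are meant to pick up.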

If there is little or no actual structure in the data, then these odds ratios computed across different numbers of final groups will show no clearly defensible maximal value, but rather a broad, flat plateau in which all the ratios are similar, varying from each other only randomly. But when there is real structure therein, there will be a ratio that is quantifiably higher than all others, a unimodal response with a peak value. The statistical significance of this maximum can be evaluated with the Likelihood Ratio test or something similar, though I haven’t thought very hard about that issue yet.

Moving from the univariate case, to the multivariate, ain’t not no big deal really, in terms of the above discussion–it just requires averaging those odds ratios over all variables. But multivariate data does introduce a second, subtle aspect into what we mean by the term “data structure”, in the following respect. It is possible that no variable in the data shows clear evidence of structure, per the above approach, when in fact there very much is structure, but of a different kind. That outcome would occur whenever particular pairs (or larger groups) of variables are correlated with each other (above random expectation), even though the values for each such variable are in fact Poisson/gamma distributed overall. That is, there is a statistically defensible relationship between variables across sample units, but no detectable variation in values within each variable, across those sample units.

Such an outcome would provide definite evidence of behavioral similarity among variables even in the absence of a structuring of those variables by some latent (unmeasured) variable. I think it would be interesting to know how often such a situation occurs in different types of ecological and other systems, and I’m pretty sure nobody’s done any such analysis. Bear in mind however that I also once thought, at about 4:30 AM on a sleep deprived week if I remember right, that it would be interesting to see if I could beat the Tahoe casinos at blackjack based on quick probability considerations.

I hope the above has made at least some sense and you have not damaged your computer by say, throwing your coffee mug through the screen, or yelled something untoward, at volume, within earshot of those who might take offense. The Institute hereby disavows any responsibility, liability or other legal or financial connection to such events, past or future.

There will be more!

On clustering, part one

In ecology and other sciences, grouping similar objects together for further analytical purposes, or just as an end in itself, is a fundamental task, one accomplished by cluster analysis, one of the most fundamental tools in statistics. In all but the smallest sample sizes, the number of possible groupings very rapidly gets enormous, and it is necessary therefore to have some way of both (1) efficiently avoiding the vast number of clearly non-optimal clusters, and (2) choosing the best solution from among those that seem at least reasonable.

First some background. There are (at least) three basic approaches to clustering. Two of these are inherently hierarchical in nature: they either aggregate individual objects into ever-larger groups (agglomerative methods), or successively divide the entire set into ever smaller ones (divisive methods). Hierarchical methods are based on a distance matrix that defines the distance (in measurement space) between every possible pair of objects, as determined by the variables of interest (typically multivariate) and the choice of distance measure, of which there are several depending on one’s definitions of “distance”. This distance matrix increases in size as a function of (n-1)(n/2), or roughly a squared function of n, and so for large datasets these methods quickly become untenable, unless one has an enormous amount of computer memory available, which typically the average scientist does not.
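To put rough numbers on that growth, here is a small sketch. The function names are made up for illustration, and the memory figure assumes each distance is stored as an 8-byte floating point value, which is an assumption about storage rather than a statement about any particular software.

```python
from itertools import combinations

def n_pairwise_distances(n):
    """Number of distinct object pairs, (n-1)(n/2), in a distance matrix."""
    return n * (n - 1) // 2

def approx_memory_gb(n, bytes_per_value=8):
    """Rough storage for those distances, assuming 8 bytes per value."""
    return n_pairwise_distances(n) * bytes_per_value / 1e9

# Sanity check against brute-force enumeration for a tiny case.
small_check = len(list(combinations(range(5), 2)))   # 10 pairs among 5 objects

pairs_100k = n_pairwise_distances(100_000)           # 4,999,950,000 pairs
memory_100k = approx_memory_gb(100_000)              # about 40 GB
```

So a data set of a hundred thousand objects is already well beyond the memory of a typical desktop machine.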

The k-means clustering algorithm works differently–it doesn’t use a distance matrix. Instead it chooses a number of random cluster starting points (“centers”) and then measures the distance to all objects from those points, and agglomerates stepwise according to which objects are closest to which centers. This greatly reduces the memory requirement for large data sets, but a drawback is that the output depends on the initial choice of centers; one should thus try many different starting combinations, and even then, the best solution is not guaranteed. Furthermore, one sets the number of final clusters desired beforehand, but there is no guarantee that the optimal overall solution will in fact correspond to that choice, and so one has to repeat the process for all possible cluster numbers that one deems reasonable, with “reasonable” often being less than obvious.
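A bare-bones sketch of that idea for univariate data follows, assuming the usual iterate-until-stable form of the algorithm (often called Lloyd's algorithm) and a simple best-of-many-restarts wrapper. It is not the code behind any particular package, and the function names, the number of restarts and the seed are arbitrary choices for illustration.

```python
import random

def group_ss(group):
    """Sum of squared deviations of a group's values from the group mean."""
    if not group:
        return 0.0
    centre = sum(group) / len(group)
    return sum((v - centre) ** 2 for v in group)

def kmeans_1d(values, k, rng, max_iter=100):
    """One k-means run on 1-D data from k randomly chosen starting centers."""
    centers = rng.sample(sorted(set(values)), k)    # needs k <= number of distinct values
    groups = [[] for _ in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute each center as its group mean; keep the old center if a group is empty.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    total_within_ss = sum(group_ss(g) for g in groups)
    return total_within_ss, groups

def best_of_restarts(values, k, n_starts=25, seed=0):
    """Repeat from different random starts; keep the run with the smallest within-group SS."""
    rng = random.Random(seed)
    runs = [kmeans_1d(values, k, rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[0])

data = [0, 1, 2, 7, 8, 9]
best_wss, best_groups = best_of_restarts(data, 2)
```

Note that the wrapper only picks the best of the runs it happened to try for one chosen k; repeating over several values of k, and deciding among them, is the part that remains unsolved here.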

When I first did a k-means cluster analysis, years ago, I did it in SPSS and I remember being surprised that the output did not include a probability value, that is, the likelihood of obtaining a given clustering by chance alone. There was thus no way to determine which among the many possible solutions was in fact the best one, which seemed to be a pretty major shortcoming, possibly inexcusable. Now I’m working in R, and I find…the same thing. In R, the two workhorse clustering algorithms, both in the main stats package, are kmeans and hclust, corresponding to k-means and hierarchical clustering, respectively. In neither method is the probability of the solution given as part of the output. So, it wasn’t just SPSS–if R doesn’t provide it, then it’s quite possible that no statistical software program (SAS, S-Plus, SigmaStat, etc.) does so, although I don’t know for sure.

There is one function in R that attempts to identify what it calls the “optimal clustering”, function optCluster in the package of the same name. But that function, while definitely useful, only appears to provide a set of different metrics by which to evaluate the effectiveness of any given clustering solution, as obtained from 16 possible clustering methods, but with no actual probabilities attached to any of them. What I’m after is different, more defensible and definitely more probabilistic. It requires some careful thought regarding just what clustering should be all about in the first place.

If we talk about grouping objects together, we gotta be careful. This piece at Variance Explained gives the basic story of why, using examples from a k-means clustering. A principal point is that one can create clusters from any data set, but the result doesn’t necessarily mean anything. And I’m not just referring to the issue of relating the variable being clustered to other variables of interest in the system under study. I’m talking about inherent structure in the data, even univariate data.

This point is easy to grasp with a simple example. If I have the set of 10 numbers from 0 to 9, a k-means clustering into two groups will place 0 to 4 in one group and 5 to 9 in the other, as will most hierarchical clustering trees trimmed to two groups. Even if some clustering methods were to sometimes place say, 0 to 3 in one group and 4 to 9 in the other, or similar outcome (which they conceivably might–I haven’t tested them), the main point remains: there are no “natural” groupings in those ten numbers–they are as evenly spaced as is possible to be, a perfect gradient. No matter how you group them, the number of groups and the membership of each will be an arbitrary and trivial result. If, on the other hand, you’ve got the set {0,1,2,7,8,9} it’s quite clear that 0-2 and 7-9 define two natural groupings, since the members of each group are all within 1 unit of the means thereof, and with an obvious gap between the two.
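The two examples can be checked directly with a small sketch. Rather than k-means proper, it assumes an exhaustive search over every way of cutting the sorted values into a lower and an upper group, keeping the cut with the smallest within-group sum of squares (the quantity k-means tries to minimize). The function names are hypothetical. The returned fraction is the share of the total sum of squares left over within the two groups, which is a descriptive number only and not a probability.

```python
def within_ss(values):
    """Sum of squared deviations of values from their own mean."""
    if not values:
        return 0.0
    centre = sum(values) / len(values)
    return sum((v - centre) ** 2 for v in values)

def best_two_group_split(values):
    """Return (lower group, upper group, fraction of total SS remaining within groups)."""
    ordered = sorted(values)
    total = within_ss(ordered)
    best = None
    for cut in range(1, len(ordered)):          # try every contiguous two-way split
        lower, upper = ordered[:cut], ordered[cut:]
        wss = within_ss(lower) + within_ss(upper)
        if best is None or wss < best[2]:
            best = (lower, upper, wss)
    lower, upper, wss = best
    return lower, upper, wss / total

gradient = best_two_group_split(range(10))           # the perfect gradient 0 to 9
clustered = best_two_group_split([0, 1, 2, 7, 8, 9])  # the two natural groups
```

Both sets get split down the middle, but the gradient leaves roughly a quarter of the total sum of squares inside its two groups while the gapped set leaves about a twentieth. That difference is suggestive, though on its own it says nothing about how randomly spaced data would behave.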

This point is critical, as it indicates that we should seek a clustering evaluation method that is based in an algorithm capable of making this discrimination between a perfect gradient and tightly clustered data. Actually it has to do better than that–it has to be able to distinguish between perfectly spaced data, randomly spaced data, and clustered data. Randomly spaced data will have a natural degree of clustering by definition, and we need to be able to distinguish that situation from truly clustered data, which might not be so easy in practice.

There are perhaps several ways to go about doing this, but the one that is most directly obvious and relevant is based on the Poisson distribution. The Poisson defines the expected values in a set of sub-samples, given a known value determined from the entire object collection, for the variable of interest. Thus, from the mean value over all objects (no clustering), we can determine the probability that the mean values for each of the n groups resulting from a given clustering algorithm (of any method), follow the expectation defined by the Poisson distribution determined by that overall mean (the Poisson being defined by just one parameter). The lower that probability is, the more likely that the clusters returned by the algorithm do in fact represent a real feature of the data set, a natural aggregation, and not just an arbitrary partitioning of random or gradient data. Now maybe somebody’s already done this, I don’t know, but I’ve not seen it in any of the statistical software I’ve used, including R’s two workhorse packages stats and cluster.

More hideous detail to come so take cover and shield your eyes.

A massive mess of old tree data

I’m going to start focusing more on science topics here, as time allows. I’ll start by focusing for a while on some forest ecology topics that I’ve been working on, and/or which are closely related to them.

I’m working on some forest dynamics questions involving historical, landscape scale forest conditions and associated fire patterns. I just got done assembling a tree demography database of about 130,000 trees collected in about 1700 plots, in the early 20th century, on the Eldorado and Stanislaus National Forests (ENF, SNF), the two National Forests that occupy the mid- to upper-elevations on the relatively gradual western slope of the central Sierra Nevada. The data were collected primarily between 1911 and 1923 as censuses of large plots (by today’s standards, each ~2 or 4 acres) as part of the first USFS timber inventories, when it was still trying to figure out just what it had on its hands, and how it would manage it over time. An enormous amount of work was involved in this effort, but only a small part of these data has apparently survived.

The data are “demographic” in that the diameter and taxon were recorded for most trees, making them useful for a number of analytical purposes in landscape, community and population ecology. They come from two datasets that I discovered between 1997 and 2001, one in the ENF headquarters building, and the other in the National Archives facility in San Bruno CA. For each, I photocopied the data at that time, and had some of it entered into a database, hoping that I would eventually get time to analyze them. For the ENF data, this was a fortunate decision, because the ENF, as I later learned, has managed in the meantime to lose the entire data set, most likely along with a bunch of other valuable stuff that was in the office housing it. I thus now have the only known backup. Anyway, that time finally came, but the data were in such a mess that I first had to spend about three months checking and cleaning them before they could be analyzed. The data will soon be submitted as a data paper to the journal Ecology, it being one of the very few journals that have adopted this new paper format. In a data paper, one simply presents and describes a data set deemed to be of value to the general scientific community. There is in fact a further mountain of data and other information beyond these, but whether they’ll ever see the light of publication is uncertain.

An example first page of one of many old field reports and data summaries involved

We, and others, are interested in these data for estimating landscape scale forest conditions before they were heavily altered by humans via changed natural fire regimes, logging, and grazing (primarily). These changes began in earnest after about 1850, and have generally increased with time. This knowledge can help inform some important current questions involving forest restoration and general ecosystem stability, including fire and hydrologic regimes, timber production potential, biological diversity, and some spin off topics like carbon dynamics. They can directly address some claims that have been made recently regarding the pre-settlement fire regimes in California and elsewhere, in certain papers.

The data assembly was much slower and more aggravating than expected–I won’t go into it but I’ll never do it again–but the analysis is, and will be, very interesting for quite some time, as much can be done with it. Some of the summary or explanatory documentation associated with the data is entirely fascinating, as is some of the other old literature and data that I’ve been reading over as part of the project. In fact I’m easily distracted into reading more of it than is often strictly necessary, but so doing has reminded me that a qualitative, verbal description can be of much greater value than actual data, scientific situation depending. Possibly the most interesting and important aspect to this is the degree to which really important information has been either lost, completely forgotten about, or never discovered to begin with. This is not trivial–I’m talking about a really large amount of detailed data and extensive, detailed summary documentation. Early views and discussions regarding fire and forest management, and the course these should take in CA, are extensive and very revealing, as we now look back 100 years later on the effects of important decisions made then. There are also lessons in federal archiving and record keeping.

I’ll be posting various things as time allows, including discussions of methods and approaches in this type of research. I’m also applying for a grant to cover the cost of free pizza at the end, although to be honest I’ve not had great success on same in the past. You might be surprised at the application numbers and success rates on that kind of thing.

Rank Stranger

I wandered again to my home in the mountains
Where in youth’s early dawn I was happy and free
I looked for my friends but I never could find them
I found they were all rank strangers to me

Everybody I met seemed to be a rank stranger
No mother nor dad, not a friend could I see
They knew not my name, and I knew not their faces
I found they were all rank strangers to me

Ralph Stanley, Rank Stranger

My take:

“Cincinnati, March 22nd, 1814.”

I am by no means whatsoever an expert on American government policies regarding Native Americans. So just where the following extract fits in to the bigger picture thereof I don’t really know, but based on considerations such as date, location, and people involved, it seems to describe an important set of decisions, possibly precedent setting. It is taken from a letter from General William Henry Harrison, to the Secretary of War, during the War of 1812. Harrison had been Territorial Governor of Indiana before the war, and had served in Anthony Wayne’s army back in the 1794 campaign through western Ohio that led to the Treaty of Greenville in 1795, two very important events in establishing policies between the United States and Native Americans, generally.

Harrison may well have had a better understanding of the recent geographic history of Native American tribes–and certainly regarding their various warfare methods–in the large midwestern area centered on what is now Indiana, and its principal river (the Wabash), than any other person of the time. He was also the main actor in dealing with Tecumseh, arguably the greatest Native American strategist ever, in what must have been a fascinating real-life drama. The focus of the letter is on just which tribes had legitimate, long-standing land tenure claims, and thus, the right to negotiate and sell their lands, thereby countering the grand unification strategy of Tecumseh. The full letter is reproduced here: McAfee (1816). History of the Late War in the Western Country, pp 53-58; the [] and bolds being my edits.

Range o’ Light

Not sure I ever needed to be reading anything else really, although I have pulled some rather great historical stuff out of Google Books recently so hurray for the internet I guess. And I don’t know what hoops Stephen Whitney had to jump through to get that picture of lodgepole pine bark on his cover but man do I love it.

“Fearfully wild, with a blaze of quick electric light in his dark eye”

Never in several lifetimes of dreams and visions will I ever tire of reading the works of this man.

Visalia is the name of a small town embowered in oaks upon the Tulare Plain in Middle California, where we made our camp one May evening of 1864. Professor Whitney, our chief, the State Geologist, had sent us out for a summer’s campaign in the High Sierras, under the lead of Professor William H. Brewer, who was more sceptical than I as to the result of the mission.

Several times during the previous winter Mr. Hoffman and I, while on duty at the Mariposa gold-mines, had climbed to the top of Mount Bullion, and gained, in those clear January days, a distinct view of the High Sierra, ranging from the Mount Lyell group many miles south to a vast pile of white peaks, which, from our estimate, should lie near the heads of the King’s and Kaweah rivers. Of their great height I was fully persuaded; and Professor Whitney, on the strength of these few observations, commissioned us to explore and survey the new Alps.

We numbered five in camp:—Professor Brewer; Mr. Charles F. Hoffman, chief topographer; Mr. James T. Gardiner, assistant surveyor; myself, assistant geologist; and our man-of-all-work, to whom science already owes its debts.

When we got together our outfit of mules and equipments of all kinds, Brewer was going to reengage, as general aid, a certain Dane, Jan Hoesch, who, besides being a faultless mule-packer, was a rapid and successful financier, having twice, when the field-purse was low and remittances delayed, enriched us by what he called “dealing bottom stock” in his little evening games with the honest miners. Not ungrateful for that, I, however, detested the fellow with great cordiality. “If I don’t take him, will you be responsible for packing mules and for daily bread?” said Brewer to me, the morning of our departure from Oakland. “I will.” “Then we’ll take your man Cotter; only, when the pack-saddles roll under the mules’ bellies, I shall light my pipe and go botanizing. Sabe?”

So my friend, Richard Cotter, came into the service, and the accomplished but filthy Jan opened a poker and rum shop on one of the San Francisco wharves, where he still mixes drinks and puts up jobs of “bottom stock.” Secretly I longed for him as we came down the Pacheco Pass, the packs having loosened with provoking frequency.

The animals of our small exploring party were upon a footing of easy social equality with us. All were excellent except mine. The choice of Hobson (whom I take to have been the youngest member of some company) falling naturally to me, I came to be possessed of the only hopeless animal in the band. Old Slum, a dignified roan mustang of a certain age, with the decorum of years and a conspicuous economy of force retained not a few of the affectations of youth, such as snorting theatrically and shying, though with absolute safety to the rider, Professor Brewer. Hoffman’s mount was a young half-breed, full of fire and gentleness. The mare Bess, my friend Gardiner’s pet, was a light-bay creature, as full of spring and perception as her sex and species may be. A rare mule, Cate, carried Cotter. Nell and Jim, two old geological mules, branded with Mexican hieroglyphics from head to tail, were bearers of the loads.

My Buckskin was incorrigibly bad. To begin with, his anatomy was desultory and incoherent, the maximum of physical effort bringing about a slow, shambling gait quite unendurable. He was further cursed with a brain wanting the elements of logic, as evinced by such non sequiturs as shying insanely at wisps of hay, and stampeding beyond control when I tried to tie him to a load of grain. My sole amusement with Buckskin grew out of a psychological peculiarity of his, namely, the unusual slowness with which waves of sensation were propelled inward toward the brain from remote parts of his periphery. A dig of the spurs administered in the flank passed unnoticed for a period of time varying from twelve to thirteen seconds, till the protoplasm of the brain received the percussive wave; then, with a suddenness which I never wholly got over, he would dash into a trot, nearly tripping himself up with his own astonishment.

A stroke of good fortune completed our outfit and my happiness by bringing to Visalia a Spaniard who was under some manner of financial cloud. His horse was offered for sale, and quickly bought for me by Professor Brewer. We named him Kaweah, after the river and its Indian tribe. He was young, strong, fleet, elegant, a pattern of fine modelling in every part of his bay body and fine black legs; every way good, only fearfully wild, with a blaze of quick electric light in his dark eye.

Shortly after sunrise one fresh morning we made a point of putting the packs on very securely, and, getting into our saddles, rode out toward the Sierras. The group of farms surrounding Visalia is gathered within a belt through which several natural, and many more artificial, channels of the Kaweah flow. Groves of large, dark-foliaged oaks follow this irrigated zone; the roads, nearly always in shadow, are flanked by small ranch-houses, fenced in with rank jungles of weeds and rows of decrepit pickets.

Our backs were now turned to this farm-belt, the road leading us out upon the open plain in our first full sight of the Sierras. Grand and cool swelled up the forest; sharp and rugged rose the wave of white peaks, their vast fields of snow rolling over the summit in broad, shining masses. Sunshine, exuberant vegetation, brilliant plant life, occupied our attention hour after hour until the middle of the second day. At last, after climbing a long, weary ascent, we rode out of the dazzling light of the foot-hills into a region of dense woodland, the road winding through avenues of pines so tall that the late evening light only came down to us in scattered rays. Under the deep shade of these trees we found an air pure and gratefully cool.

Passing from the glare of the open country into the dusky forest, one seems to enter a door and ride into a vast covered hall. The whole sensation is of being roofed and enclosed. You are never tired of gazing down long vistas, where, in stately groups, stand tall shafts of pine. Columns they are, each with its own characteristic tinting and finish, yet all standing together with the air of relationship and harmony. Feathery branches, trimmed with living green, wave through the upper air, opening broken glimpses of the far blue, and catching on their polished surfaces reflections of the sun. Broad streams of light pour in, gilding purple trunks and falling in bright pathways along an undulating floor. Here and there are wide, open spaces, around which the trees group themselves in majestic ranks.

Our eyes often ranged upward, the long shafts leading the vision up to green, lighted spires, and on to the clouds. All that is dark and cool and grave in color, the beauty of blue umbrageous distance, all the sudden brilliance of strong local lights tinted upon green boughs or red and fluted shafts, surround us in ever-changing combination as we ride along these winding roadways of the Sierra.

We had marched a few hours over high, rolling, wooded ridges, when in the late afternoon we reached the brow of an eminence and began to descend. Looking over the tops of the trees beneath us, we saw a mountain basin fifteen hundred feet deep surrounded by a rim of pine-covered hills. An even, unbroken wood covered these sweeping slopes down to the very bottom, and in the midst, open to the sun, lay a circular green meadow, about a mile in diameter.

As we descended, side wood-tracks, marked by the deep ruts of timber wagons, joined our road on either side, and in the course of an hour we reached the basin and saw the distant roofs of Thomas’s Saw-Mill Ranch. We crossed the level disc of meadow, fording a clear, cold mountain stream, flowing, as the best brooks do, over clean, white granite sand, and near the northern margin of the valley, upon a slight eminence, in the edge of a magnificent forest, pitched our camp.

The hills to the westward already cast down a sombre shadow, which fell over the eastern hills and across the meadow, dividing the basin half in golden and half in azure green. The tall young grass was living with purple and white flowers. This exquisite carpet sweeps up over the bases of the hills in green undulations, and strays far into the forest in irregular fields. A little brooklet passed close by our camp and flowed down the smooth green glacis which led from our little eminence to the meadow. Above us towered pines two hundred and fifty feet high, their straight, fluted trunks smooth and without a branch for a hundred feet. Above that, and on to the very tops, the green branches stretched out and interwove, until they spread a broad, leafy canopy from column to column.

Professor Brewer determined to make this camp a home for the week during which we were to explore and study all about the neighborhood. We were on a great granite spur, sixty miles from east to west by twenty miles wide, which lies between the Kaweah and King’s River cañons. Rising in bold sweeps from the plain, this ridge joins the Sierra summit in the midst of a high group. Experience had taught us that the cañons are impassable by animals for any great distance; so the plan of campaign was to find a way up over the rocky crest of the spur as far as mules could go.

In the little excursions from this camp, which were made usually on horseback, we became acquainted with the forest, and got a good knowledge of the topography of a considerable region. On the heights above King’s Cañon are some singularly fine assemblies of trees. Cotter and I had ridden all one morning northeast from camp under the shadowy roof of forest, catching but occasional glimpses out over the plateau, until at last we emerged upon the bare surface of a ridge of granite, and came to the brink of a sharp precipice. Rocky crags lifted just east of us. The hour devoted to climbing them proved well spent.

A single little family of alpine firs growing in a niche in the granite surface, and partly sheltered by a rock, made the only shadow, and just shielded us from the intense light as we lay down by their roots. North and south, as far as the eye could reach, heaved the broad, green waves of plateau, swelling and merging through endless modulation of slope and form.

Conspicuous upon the horizon, about due east of us, was a tall, pyramidal mass of granite, trimmed with buttresses which radiated down from its crest, each one ornamented with fantastic spires of rock. Between the buttresses lay stripes of snow, banding the pale granite peak from crown to base. Upon the north side it fell off, grandly precipitous, into the deep upper cañon of King’s River. This gorge, after uniting a number of immense rocky amphitheatres, is carved deeply into the granite two and three thousand feet. In a slightly curved line from the summit it cuts westward through the plateau, its walls, for the most part, descending in sharp, bare slopes, or lines of ragged débris, the resting-place of processions of pines. We ourselves were upon the brink of the south wall; three thousand feet below us lay the valley, a narrow, winding ribbon of green, in which, here and there, gleamed still reaches of the river. Wherever the bottom widened to a quarter or half a mile, green meadows and extensive groves occupied the level region. Upon every niche and crevice of the walls, up and down sweeping curves of easier descent, were grouped black companies of trees.

The behavior of the forest is observed most interestingly from these elevated points above the general face of the table-land. All over the gentle undulations of the more level country sweeps an unbroken covering of trees. Reaching the edge of the cañon precipices, they stand out in bold groups upon the brink, and climb all over the more ragged and broken surfaces of granite. Only the most smooth and abrupt precipices are bare. Here and there a little shelf of a foot or two in width, cracked into the face of the bluff, gives foothold to a family of pines, who twist their roots into its crevices and thrive. With no soil from which the roots may drink up moisture and absorb the slowly dissolved mineral particles, they live by breathing alone, moist vapors from the river below and the elements of the atmosphere affording them the substance of life.

Ask the experts, part n

Well, we’re long overdue for another installment of “Ask the Self-Appointed Experts”, or at least for the question part. In today’s edition a follower from Two Forks, Montana wrestles with the following conundrum, inviting others to help, or at least reassure him that he is not confused alone. He writes:

I know this issue is all over AM talk radio but the inmates pretty clearly run the asylum there and I’m more confused on the following issue than ever.

It is well known that, given a known rate process, the gamma distribution defines the expected values from random starting points to the nth “nearest” object, and so inversely, we can estimate unknown rates by such values. For example, in a forest of randomly distributed trees, the circular area, a, defined by the distance to the nth closest tree, will estimate tree density. But as Skellam (1952), Moore (1954) and Pollard (1971) showed analytically, these estimates are biased, in inverse magnitude to the value of n, specifically, as n/(n-1) for n > 1. Thus, the distance to, say, the 2nd closest tree will correspond to the area represented by one tree, not two. All very well and good.

Now, the mean of the integration of the gamma distribution from 0 to 1, for a known rate, should return the mean area a, but when I closely approximate the integral (in R, which can’t integrate), I seem to get bias-corrected values reflecting the rates, rather than the biased values reflecting the areas (a) to the nth object. I’m flummoxed and not a little aggravated. Do they know what they’re doing there at R headquarters, or is it me that’s got all turned about the wrong way? If I can’t even trust the values from the statistical distributions in R, then just what can I trust there? I tried taking my mind off the matter by following the Winnipeg Jets (WJPCA, 2015), but man, one can just take only so much of that and I sure as hell ain’t going to follow Edmonton. The ice fishing seems to help, at least until the alcohol wears off, but really there should be an 800 number I think. If you can put me onto some clues I would be most grateful.

My R code and results (for the n = 2 object):

probs = seq(from = 0.000001, to = 0.999999, by = 0.000001)  # evenly spaced probability steps
mean.area = mean(qgamma(p = probs, shape = 2, rate = 1, lower.tail = TRUE))  # approximate the integral as the mean of the quantiles at those probabilities
mean.area  # print the result
[1] 1.999993

1.999993, WTF !??!

References:
Skellam, J.G. 1952. Studies in statistical ecology: I. Spatial Pattern. Biometrika 39:346-362
Moore, P.G. 1954. Spacing in plant populations. Ecology 35:222-227.
Pollard, J.H. 1971. On distance estimators of density in randomly distributed forests. Biometrics 27:991-1002
WJPCA, 2015. Your 2015-2016 Winnipeg Jets: Fast, Furious and Fun. Winnipeg Jets Promotional Coordinating Association.

Sincerely,
Stewart Stansbury, Two Forks MT (“not quite Canada but roughly as empty”), USA

p.s. Jets picture enclosed

Jets player checking imaginary opponent hard into the glass, to the delight of all
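
One possible way out of Stewart’s bind, offered as a sketch rather than a ruling from headquarters: assume the area a to the nth nearest tree is gamma distributed with shape n and rate equal to the tree density, which I’ll call λ (my symbol, not the letter’s). Under that assumption the mean area and the mean of the density estimate n/a are two different integrals, and the n/(n-1) bias belongs only to the second.

```latex
\begin{align*}
f(a) &= \frac{\lambda^{n} a^{n-1} e^{-\lambda a}}{\Gamma(n)}, \quad a > 0 \\
E[a] &= \int_0^\infty a\, f(a)\, da = \frac{\Gamma(n+1)}{\lambda\,\Gamma(n)} = \frac{n}{\lambda} \\
E\left[\frac{n}{a}\right] &= n \int_0^\infty \frac{f(a)}{a}\, da = \frac{n \lambda\, \Gamma(n-1)}{\Gamma(n)} = \lambda\, \frac{n}{n-1}, \quad n > 1
\end{align*}
```

With n = 2 and λ = 1, the second line gives E[a] = 2, in line with the 1.999993 that R returned, while the density estimate 2/a averages 2 rather than 1. On this reading R is reporting the mean area correctly, and the bias shows up only when the area is inverted to estimate a rate.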


The devil’s real

The devil ain’t a legend; the devil’s real
In the empty way he touched me, where I hardly feel
In the empty hole inside me, the nothin’ that will drive me
Down into my grave: it does not heal
Nothing is a something, and it will suck you dry
Like the whisper you can hardly hear that tells you why

They told me “you ain’t got no problems, you’re self deceived;
These seeming contradictions, well they’re make-believe”
It was then that I decided that my life was being guided
By a second rate dependence on some first class thieves
They told me I was breakin’ through, but I was breakin’ down
And by the time I learned the difference they had long left town

You know they ain’t so malicious, they ain’t mean
They’re just vaguely well intentioned with no love I’ve seen
And it’s the emptiness that kills you, cold comfort that will fill you
With a sense of dread that maybe things are worse than they seem
They don’t tell you nothin’ you don’t already know
They just keep holding out a promise…but they don’t let go
You know they don’t let go

Well it was hard luck and trouble–bad times two
I know I had it comin’, but I got through
It was advice that you gave me, in a dream, that saved me
You said “get a new life contract that spells out your dues”
It took good will to find it, and a clear conscience to sign it
But now I dream about the good times and it all comes true
Yeah I dream about the good times and it all comes true

The Devil’s Real, Chris Smither

I had something

I had something–it fell from me
Something strong, like a pounding drum
Like ringing bells, when I was young
I had something, then it was gone

I had something–it made me walk all night
Made me run from home, made me fight
I had something, made me feel alone
Like an orphan waiting for a home

Every footstep that I take
Completes the circle my life makes
Every living thing has ties that bind
What I lost, returns with love in time

I heard something–it called to me
And it told me I was saved
Not by God and not by works
Not by any living thing

It was a voice that I once knew
Of my daughter, or my son
Not yet born, not yet known
Another orphan waiting for a home

Every footstep that I take
Completes the circle my life makes
Every living thing has ties that bind
What I lost, returns with love in time

Lucy Kaplansky, I had something

The puppets heave rocks

A terrific cover, IMO, of this fairly unknown Bob Dylan gem, built around B and E flats, in an alternate tuning; an Alex de Grassi / Windham Hill kind of sound:

Farewell Angelina, the bells of the crown
Are being stolen by bandits, I must follow the sound
The triangle tingles, the trumpets play slow
Farewell Angelina, the sky is on fire, and I must go

There’s no need for anger, there’s no need for blame
There is nothing to prove, everything’s still the same
The table stands empty by the edge of the stream
Farewell Angelina, the sky’s changing colors, and I must leave

The jacks and the queens, they’ve forsaken the courtyard
Fifty-two gypsies now file past the guards
In the space where the deuce and the ace once ran wild
Farewell Angelina, the sky is folding, I’ll see you after a while

See the cross-eyed pirates sittin’, perched in the sun
Shooting tin cans with their sawed-off shotguns
And the corporals and neighbors, they cheer with each blast
Farewell Angelina, the sky is a tremblin’ and I must leave fast

King Kong, little elves, on the rooftops they dance
Valentino-type tangos while the makeup man’s hands
Shut the eyes of the dead, not to embarrass anyone
Farewell Angelina, the sky is embarrassed, and I must be gone

The camouflaged parrot he flutters from fear
When something he doesn’t know about suddenly appears
What cannot be imitated perfect must die
Farewell Angelina, the sky is flooding, I must go where it is dry

Machine guns are roaring, the puppets heave rocks
At misunderstood visions and the faces of clocks
Call me any name you like, I will never deny it
Farewell Angelina, the sky is erupting, I must go where it’s quiet

Bob Dylan, Farewell Angelina

Farmers studying their fields travel at a walk

The following is an essay by Wendell Berry, titled Farmland Without Farmers, published recently in The Atlantic. The piece is extracted from the book Our Only World: Ten Essays, published by Counterpoint Press. I reproduce the text of the essay here in full, because, well, because it’s so damn good and important.

The landscapes of our country are now virtually deserted. In the vast, relatively flat acreage of the Midwest now given over exclusively to the production of corn and soybeans, the number of farmers is lower than it has ever been. I don’t know what the average number of acres per farmer now is, but I do know that you often can drive for hours through those corn-and-bean deserts without seeing a human being beyond the road ditches, or any green plant other than corn and soybeans. Any people you may see at work, if you see any at work anywhere, almost certainly will be inside the temperature-controlled cabs of large tractors, the connection between the human organism and the soil organism perfectly interrupted by the machine. Thus we have transposed our culture, our cultural goal, of sedentary, indoor work to the fields. Some of the “field work,” unsurprisingly, is now done by airplanes.

This contact, such as it is, between land and people is now brief and infrequent, occurring mainly at the times of planting and harvest. The speed and scale of this work have increased until it is impossible to give close attention to anything beyond the performance of the equipment. The condition of the crop of course is of concern and is observed, but not the condition of the land. And so the technological focus of industrial agriculture by which species diversity has been reduced to one or two crops is reducing human participation ever nearer to zero. Under the preponderant rule of “labor-saving,” the worker’s attention to the work place has been effectively nullified even when the worker is present. The “farming” of corn-and-bean farmers—and of others as fully industrialized—has been brought down from the complex arts of tending or husbanding the land to the application of purchased inputs according to the instructions conveyed by labels and operators’ manuals.

To make as much sense as I can of our predicament, I turn to Wes Jackson, founder of the Land Institute, in Salina, Kansas, and his perception that for any parcel of land in human use there is an “eyes-to-acres ratio” that is right and is necessary to save it from destruction. By “eyes” Wes means a competent watchfulness, aware of the nature and the history of the place, constantly present, always alert for signs of harm and signs of health. The necessary ratio of eyes to acres is not constant from one place to another, nor is it scientifically predictable or computable for any place, because from place to place there are too many natural and human variables. The need for the right eyes-to-acres ratio appears nonetheless to have the force of law.

We can suppose that the eyes-to-acres ratio is approximately correct when a place is thriving in human use and care. The sign of its thriving would be the evident good health and diversity, not just of its crops and livestock but also of its population of native and noncommercial creatures, including the community of creatures living in the soil. Equally indicative and necessary would be the signs of a thriving local and locally adapted human economy.

The great and characteristic problem of industrial agriculture is that it does not distinguish one place from another. In effect, it blinds its practitioners to where they are. It cannot, by definition, be adapted to local ecosystems, topographies, soils, economies, problems, and needs.

The sightlessness and thoughtlessness of the imposition of the corn-and-bean industry upon the sloping or rolling countryside hereabouts is made vividly objectionable to me by my memory of the remarkably careful farming that was commonly practiced in these central Kentucky counties in the 1940s and 50s—though, even then, amid much regardlessness and damage. The best farming here was highly diversified in both plants and animals. Its basis was understood to be grass and grazing animals; cattle, sheep, hogs, and, during the 40s, the workstock, all were pastured. Grain crops typically were raised to be fed; the farmers would say, “The grain raised here must walk off.” And so in any year only a small fraction of the land would be plowed. The commercial economy of the farms was augmented and supported by the elaborate subsistence economies of the households. “I may be sold out or run out,” the farmers would say, “but I’ll not be starved out.”

My brother recently reminded me how carefully our father thought about the nature of our home countryside. He had witnessed the ultimate futility—the high costs to both farmer and farm—of raising corn for cash during the hard times of the 1920s and 30s. He concluded, rightly, that the crop that could be raised most profitably in the long run was grass. That was because we did not have large acreages that could safely be used for growing grain, but our land was aboundingly productive of grass, which moreover it produced more cheaply than any other crop. And the grass sod, which was perennial, covered and preserved the soil the year round.

A further indication of the quality of the farming here in the 40s and 50s is that the Soil Conservation Service was more successful during those years than it would or could be again in the promotion of plowing and terracing on the contour to control soil erosion. Those measures at that time were permitted by the right scale of the farming and of the equipment then in use. Anybody familiar with topographic maps will know that contour lines remain strictly horizontal over the irregularities of the land’s surfaces; crop rows cannot be regularly spaced. This variability presents no significant problem to a farmer using one- or two-row equipment in relatively small lands or fields. And so for a while contour farming became an established practice on many farms, and to good effect. It was defeated primarily by the enlargement of fields and the introduction of larger equipment. Eventually, many farmers simply ignored their terraces, plowing over them, the planted rows sometimes running straight downhill. Earlier, a good many farmers had taken readily to the idea of soil conservation. A farmer in a neighboring county said, “I want the water to walk off my land, not run.” But beyond a certain scale, the farming begins to conform to the demands of the machines, not to the nature of the land.

Within three paragraphs I have twice quoted farmers who used “walk” as an approving figure of speech: Grain leaving a farm hereabouts should walk off; and the rainwater fallen upon a farm should walk, not run. This is not merely a coincidence. The gait most congenial to agrarian thought and sensibility is walking. It is the gait best suited to paying attention, most conservative of land and equipment, and most permissive of stopping to look or think. Machines, companies, and politicians “run.” Farmers studying their fields travel at a walk.

Farms that are highly diversified and rightly scaled tend, by their character and structure, toward conservation of the land, the human community, and the local economy. Such farms are both work places and homes to the families who inhabit them and who are intimately involved in the daily life of land and household. Without such involvement, farmers cease to be country people and become in effect city people, industrial workers and consumers, living in the country.

* * *

I have spoken so far of the decline of country work, but the decline of country pleasures is at least equally significant. If the people who live and work in the country don’t also enjoy the country, a valuable and necessary part of life is missing. And for families on farms of a size permitting them to be intimately lived on and from, the economic life of the place is itself the primary country pleasure. As one would expect, not every day or every job can be a pleasure, but for farmers who love their livestock there is pleasure in watching the animals graze and in winter feeding. There is pleasure in the work of maintenance, the redemption of things worn or broken, that must go on almost continuously. There is pleasure in the growing, preserving, cooking, and eating of the good food that the family’s own land provides. But around this core of the life and work of the farm are clustered other pleasures, in their way also life-sustaining, and most of which are cheap or free.

I live in a country that would be accurately described as small-featured. There are no monumental land forms, no peaks or cliffs or high waterfalls, no wide or distant vistas. Though it is by nature a land of considerable beauty, there is little here that would attract vacationing wilderness lovers. It is blessed by a shortage of picturesque scenery and mineable minerals. The topography, except in the valley bottoms, is rolling or sloping. Along the sides of the valleys, the slopes are steep. It is divided by many hollows and streams, and it has always been at least partially wooded.

Because of the brokenness and diversity of the landscape, there was never until lately a clean separation here between the pursuits of farming and those of hunting and gathering. On many farms the agricultural income, including the homegrown and homemade subsistence of the households, would be supplemented by hunting or fishing or trapping or gathering provender from the woods and berry patches—perhaps by all of these. And beyond their economic contribution, these activities were forms of pleasure. Many farmers kept hounds or bird dogs. The gear and skills of hunting and fishing belonged to ordinary daily and seasonal life. More ordinary was the walking (or riding or driving) and looking that kept people aware of the condition of the ground, the crops, the pastures, and the livestock, of the state of things in the house yard and the garden, in the woods, and along the sides of the streams.

My own community, centered upon the small village of Port Royal, is along the Kentucky River and in the watersheds of local tributaries. Its old life, before the industrialization of much of the farmland and the urbanization of the people, was under the influence of the river, as other country communities of that time were under the influence of the railroads. In the neighborhood of Port Royal practically every man and boy, some girls and women too, fished from time to time in the Kentucky River. Some of the men fished “all the time” or “way too much.” Until about a generation ago, there was some commercial fishing. And I can remember when hardly a summer day would pass when from the house where eventually I would live you could not hear the shouts of boys swimming in the river, often flying out into the water from the end of a swinging rope. I remember when I was one of them. My mother, whose native place this was, loved her girlhood memories of swimming parties and picnics at the river. In hot weather she and her friends would walk the mile from Port Royal down to the river for a cooling swim, and then would make the hot walk back up the hill to town.

Now the last of the habituated fishermen of the local waters are dead. They have been replaced by fishermen using expensive “bassboats,” almost as fast as automobiles, whose sport is less describable as fishing than as using equipment. In the last year only one man, comparatively a newcomer, has come to the old landing where I live to fish with trotlines—and, because of the lack of competition, he has caught several outsize catfish. Some local people, and a good many outsiders, hunt turkeys and deer. There is still a fair amount of squirrel hunting. The bobwhite, the legendary gamebird of this region, is almost extinct here, and the bird hunters with them. A rare few still hunt with hounds.

Most remarkable is the disappearance of nearly all children and teenagers, from the countryside, and in general from the out-of-doors. The technologies of large-scale industrial agriculture are too complicated and too dangerous to allow the participation of children. For most families around here, the time is long gone when children learned to do farmwork by playing at it, and then taking part in it, in the company of their parents. It seems that most children now don’t play much in their house yards, let alone in the woods and along the creeks. Many now descend from their school buses at the ends of lanes and driveways to be carried the rest of the way to their houses in parental automobiles. Most teenagers apparently divide their out-of-school time between indoor entertainment and travel in motor vehicles. The big boys no longer fish or swim or hunt or camp out. Or work. The town boys, who used to hire themselves out for seasonal or part-time work on the farms, no longer find such work available, or they don’t wish to do the work that is available.

Local people who regularly hunted or fished or foraged or walked or played in the local countryside served the local economy and stewardship as inspectors, rememberers, and storytellers. They gave their own kind of service to the eyes-to-acres ratio. Now most of those people are gone or absent, along with most of the farming people who used to be at work here.

With them have gone the local stories and songs. When people begin to replace stories from local memory with stories from television screens, another vital part of life is lost. I have my own memories of the survival in a small rural community of its own stories. By telling and retelling those stories, people told themselves who they were, where they were, and what they had done. They thus maintained in ordinary conversation their own living history. And I have from my neighbor, John Harrod, a thorough student of Kentucky’s traditional fiddle music, his testimony that every rural community once heard, sang, and danced to at least a few tunes that were uniquely its own. What is the economic value of stories and songs? What is the economic value of the lived and living life of a community? My argument here is directed by my belief that the art and the life of settled rural communities are critical to our life-supporting economy. But their value is incalculable. It can only be acknowledged and respected, and our present economy is incapable, and cannot on its own terms be made capable, of such acknowledgement and respect.

Meanwhile, the farmlands and woodlands of this neighborhood are being hurt worse and faster by bad farming and bad logging than at any other time in my memory. The signs of this abuse are often visible even from the roads, but nobody is looking. Or to people who are looking, but seeing from no perspective of memory or knowledge, the country simply looks “normal.” Outsiders who come visiting almost always speak of it as “beautiful.” But along this river, the Kentucky, which I have known all my life, and have lived beside for half a century, there is a large and regrettable recent change, clearly apparent to me, and to me indicative of changes in water quality, but perfectly invisible to nearly everybody else.

* * *

I don’t remember what year it was when I first noticed the disappearance of the native black willows from the low-water line of this river. Their absence was sufficiently noticeable, for the willows were both visually prominent and vital to the good health of the river. Wherever the banks were broken by “slips” or the uprooting of large trees, and so exposed to sunlight, the willows would come in quickly to stabilize the banks. Their bushy growth and pretty foliage gave the shores of the river a distinctive grace, now gone and much missed by the few who remember. Like most people, I don’t welcome bad news, and so I said to myself that perhaps the willows were absent only from the stretch of the river that I see from my house and work places. But in 2002 for the first time in many years I had the use of a motor boat, and I examined carefully the shores of the twenty-seven-mile pool between locks one and two. I saw a few old willows at the tops of the high banks, but none at or near the low-water line, and no young ones anywhere.

The willows still live as usual along other streams in the area, and they thrive along the shore of the Ohio River just above the mouth of the Kentucky at Carrollton. The necessary conclusion is that their absence from the Kentucky River must be attributable to something seriously wrong with the water. And so, since 2002, I have asked everybody I met who might be supposed to know: “Why have the black willows disappeared from the Kentucky River?” I have put this question to conservationists, to conservation organizations specifically concerned with the Kentucky River, to water-quality officials and to university biologists. And I have found nobody who could tell me why. Except for a few old fishermen, I have found nobody who knew they were gone.

This may seem astonishing. At least, for a while, it astonished me. I thought that in a state in which water pollution is a permanent issue, people interested in water quality surely would be alert to the disappearance of a prominent member of the riparian community of a major river. But finally I saw that such ignorance is more understandable than I had thought. A generation or so ago, when fishing and the condition of the river were primary topics of conversation in Port Royal, the disappearance of the willows certainly would have been noticed. Fishermen used to tie their trotlines to the willows.

That time is past, and I was seeking local knowledge from conservationists and experts and expert conservationists. But most conservationists, like most people now, are city people. They “escape” their urban circumstances and preoccupations by going on vacations. They thus go into the countryside only occasionally, and their vacations are unlikely to take them into the economic landscapes. They want to go to parks, wilderness areas, or other famous “destinations.” Government and university scientists often have economic concerns or responsibilities, and some of them do venture into farmland or working forests or onto streams and rivers that are not “wild.” But it seems they are not likely to have a particular or personal or long-term interest in such places, or to go back to them repeatedly and often over a long time, or to maintain an economic or recreational connection to them. Such scientists affect the eyes-to-acres ratio probably less than the industrial farmers.

Among the many conservationists I have encountered in my home state, the most competent witness by far is Barth Johnson, a retired game warden who is a dedicated trapper, hunter, and fisherman, as he has been all his life. Barth has devoted much of his life to conservation. Like most conservationists, he is informed about issues and problems. Unlike most, he is exceptionally alert to what is happening in the actual countryside that needs to be conserved. This is because he is connected to the fields and woods and waters he knows by bonds of economy and pleasure, both at once. Moreover, he has lived for thirty years in the same place at the lower end of the Licking River. This greatly increases the value of his knowledge, for he can speak of changes over time. People who stay put and remain attentive know that the countryside changes, as it must, and for better or worse.

He tells a story about Harris Creek, a small stream along which he had trapped for many years. It was richly productive, and Barth was careful never to ask too much of it. But in 2007, confident that it would be as it always had been, he went there with his traps and discovered that the stream was dead. He could not find a live minnow or crawfish. There were no animal tracks. So far as he could tell, there could be only one reason for this: In the spring of that year, the bottomland along the creek had been herbicided to kill the grass in preparation for a seeding of alfalfa. In 2008, the stream was still dead. In 2009, there was “a little coon activity.” Finally, in 2013, the stream was “close to normal.”

I have also learned from Barth that upstream as far as he has looked, to a point two and a half miles above the small town of Boston, the black willows are gone from the Licking River. And in October 2013, he wrote me that the river had turned a brownish “brine” color that he had never seen before.

What happened to the willows? Two young biologists at Northern Kentucky University are now at work on the question, and perhaps they will find the answer. But other scientists have led me to consider the possibility that such questions will not be answered. It may be extremely difficult or impossible to attach a specific effect to a specific cause in a large volume of flowing water.

What killed Harris Creek? Barth’s evidence is “anecdotal,” without scientifically respectable proof. I have read scientific papers establishing that the herbicide glyphosate and its “degradation products” are present in high concentrations in some Mississippi River tributaries, but the papers say nothing about the effects. I have called up scientists working on water quality, including one of the authors of one of the papers on glyphosate. What about the effects? Good question. Nobody knows the answer. It seems that the research projects and the researchers are widely scattered, making such work somewhat incoherent. And besides, there is always the difficulty of pinning a specific cause to a specific effect. To two of these completely friendly and obliging people I told Barth’s story of Harris Creek: Does that surprise you? One said it did not surprise him. The other said it was possible but unlikely that the stream was killed by an herbicide. Was an insecticide also involved?

What caused the strange discoloration of the Licking River? Since the discoloration was visible until obscured by mud in the water when the river rose, I suppose that, if it happens again, the odd color could be traced upstream to a source. Will somebody do that? I don’t know. Is any scientist from any official body monitoring the chemical runoff from croplands and other likely sources? I have been asking that question too, and so far I have asked nobody who could answer.

In my search for answers, it may be that I have been making a characteristic modern mistake of relying on experts, which has revealed a characteristic modern failure: Experts often don’t know and sometimes can never know. Beneficiaries of higher education, of whom I am one, often give too much credit to credentials.

Confronting industrial agriculture, we are requiring ourselves to substitute science for citizenship, community membership, and land stewardship. But science fails at all of these. Science as it now predominantly is, by definition and on its own terms, does not make itself accountable for unintended effects. The intended effect of chemical nitrogen fertilizer, for example, is to grow corn, whereas its known effect on the Mississippi River and the Gulf of Mexico is a catastrophic accident. Moreover, science of this kind is invariably limited and controlled by the corporations that pay for it.

We have an ancient and long-enduring cultural imperative of neighborly love and work. This becomes ever more important as hardly imaginable suffering is imposed upon all creatures by industrial tools and industrial weapons. If we are to continue, in our only world, with any hope of thriving in it, we will have to expect neighborly behavior of sciences, of industries, and of governments, just as we expect it of our citizens in their neighborhoods.