About Jim Bouldin

For the last 20 years I've been a self-employed research ecologist working on contract, purposely avoiding academia except for ongoing work to publish a few things important to me. Education: B.S., Wildlife Management, Ohio State, 1982; Ph.D., Plant Biology, UC Davis, 1999; post-doc, US Forest Service, 1999-2001.

Article retraction attempts at ESA journals

Over the last two years I’ve been attempting to get two papers retracted, each published in a different journal run by the Ecological Society of America (ESA), the most prominent professional society of ecologists in the world. The ESA has five journals, all published by Wiley, one of the very largest science publishers along with Springer, Elsevier and a couple of others.

Around 2010, I became aware of the work of William Baker, professor at the University of Wyoming, via an email interaction with one of his (then) grad students, who was aware of a paper I had published in 2008 and contacted me with some technical questions. The topic area involves the estimation of historic American forest or woodland conditions using the tree data collected by land surveyors in the 1800s. Most of these data were collected by the US General Land Office (GLO), and thus the survey is usually referred to as the GLO, or rectangular, survey. This survey was cadastral (land parcel delineation) and massive in extent, stretching from Ohio and Florida in the east, to the Pacific states and Alaska in the west, excepting Kentucky, Tennessee, Georgia and Texas.

Whenever GLO surveyors were working in areas of woodland or forest vegetation, they were required to record the taxon and diameter of two to four trees close to each defined survey point, and the angle and distance from those points to the trees. The survey points form the corners of square land parcels of 160 acres which were the basis of all subsequent land ownership transactions. The trees are known as “bearing trees”, and were marked as such on blazes on the trees. The survey point locations, called “corners”, were marked with either posts, rock piles, a set of four trenches, or some combination thereof.

Beginning in 1921, ecologists began to use the bearing tree data and other recorded vegetation data to estimate vegetation conditions at the time of the survey, which in most cases preceded settlement and thus the major land cover changes that came with it. Just how to make these estimates, however, has been a recurrent methodological issue in the many dozens of papers that have now appeared, the most difficult problem of all being the estimation of tree density from the measured distances. Nevertheless, most of the principal issues and potential biases involved had been identified by the 1970s, many of them as early as the 1950s, and most authors since have avoided dubious or wrong density claims, or have avoided density estimation altogether, concentrating instead on species composition or diameter distributions, at various spatial scales.
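For readers who have never seen one, here is a minimal sketch in R of what a distance-based density estimator looks like–the classic Cottam-Curtis point-centered quarter calculation, applied to made-up distances. To be clear, this is only an illustration of the general family of methods, not the estimator at issue in the papers discussed below, and the naive calculation shown here ignores exactly the kinds of biases alluded to above.

    # Illustration only: naive point-centered quarter (PCQ) density estimate.
    # Distances are hypothetical; real bearing-tree data (2-4 trees per corner,
    # non-random tree selection, etc.) require bias corrections not shown here.
    set.seed(1)
    n_corners <- 50                                     # number of survey corners
    d <- matrix(runif(n_corners * 4, 2, 25), ncol = 4)  # point-to-tree distances (m), 4 per corner

    mean_d     <- mean(d)            # mean point-to-tree distance
    density_m2 <- 1 / mean_d^2       # mean area per tree = mean_d^2, so density is its inverse
    density_ha <- density_m2 * 1e4   # convert to trees per hectare
    round(density_ha, 1)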

Jump to 2011. Into the scene step Baker and the former grad student referred to above, Mark Williams, with a paper published in Ecological Monographs in which they introduce a new and entirely different type of method that they claim solves the longstanding problem of accurate density estimation from bearing tree data. Ecological Monographs is the most prestigious of the ESA journals, dating to the early 1930s and ranking near the top among ecology journals, with a very strong reputation. It is designed for long and comprehensive papers that either authoritatively review a topic, present a large mass of new data and analysis from a long term effort, or some combination thereof.

Occupied by other things at the time, I was unaware of Williams and Baker’s 2011 paper for several years, as well as two or three others that Baker was sole or co-author on. I didn’t look closely at the 2011 Monographs paper until maybe five years ago. When I did, I was amazed–but not in the way that monographic work is supposed to impress you.

To begin with, it wasn’t monographic in nature at all–it was instead just a presentation of a new density estimator that the authors invented, followed by a comparison against various other existing estimators. Assuming valid methods, this would comprise a standard type of paper, not a monograph. However, the methods are not valid, not even close. They are in fact horrendous–mistaken on conceptual, biological, and mathematical grounds, showing repeated misunderstandings of the existing literature, of the relevant mathematics, and even of fundamental principles of analysis, requiring impractical field methods, and leading to completely unreliable conclusions. To explain all this would require far too long and technical an essay. The point of this post is rather to describe my efforts to get that paper, and a later one by Baker, retracted from ESA journals. That story goes like this:

About two years ago I contacted the Editor in Chief of Ecosphere, Debra Peters, regarding my concerns with a 2014 paper of Baker’s. At the same time, I was carefully reading the 2011 Monographs paper, because the 2014 paper used the methods advocated in the 2011 one, among other serious problems. The two are among the worst articles I’ve ever read, on any ecological topic, and far and away the worst on this topic–and I’ve read a lot of them. But this would not be apparent to those without a strong understanding of both the GLO survey and the math behind distance-based density estimators. I soon realized that both articles merited retraction. The leading global scientific integrity organization is COPE, the Committee on Publication Ethics, and their clearly stated criterion for article retraction is unreliability of conclusions. Evidence of fraud or deceit is not required: unreliable conclusions are sufficient.

I received an encouraging response from Debra Peters, who offered to work with me to get retractions, if merited. She was aware that Baker had already been the target of other published pieces criticizing various of his papers, but which did not call for retractions. She wanted to know if I had additional arguments that had not been presented by these others, and I assured her that I did, as well as much additional data that added strength to certain points made by others. I then sent her a brief synopsis of the major problems in an attempt to demonstrate this and also mentioned the problems with the 2011 paper.

However, I soon found that there was a major impediment to the retraction process. Of course, you don’t just write a letter to an Editor-in-Chief and demand a retraction; you have to formally lay out all the issues, detailing the exact reasons, in a piece which is then peer reviewed and constitutes a paper in itself. That part goes without saying. I also knew that ESA had high per-article charges for their papers, but I simply assumed that these would be waived for any article successfully demonstrating the need for retraction. These charges are not minor–roughly $2000 depending on the situation–and it became clear from the correspondence that ESA intended to charge me roughly $4000 for the two calls for retraction.

Fairly stunned, I responded, straight up, that not only did I not have the required $4000, but that even if I did, I would not pay it, nor in fact would I pay any amount at all. I bluntly told them that any such fee was unjust, that it was their mistake to publish these two articles in the first place, resulting directly from a lax peer review process that would allow such shoddy pieces through the gate. One can legitimately ask whether it’s justified to charge authors of new papers such high fees at all, much less those attempting to point out a problem of ESA’s own making. I further said that the amount of unpaid work involved in compiling and presenting the necessary arguments (massively time consuming–I can’t even begin to describe it) was far more than enough of a contribution. These things were also stated in a separate letter to the Executive Director of ESA, to whom I had finally taken the completely stalled matter.

Somewhere in this timeline, Peters stepped down from her Ecosphere EiC position, which meant I would now have to start over with somebody new, and possibly unfamiliar with the situation. Meanwhile, no comment or reply of any kind had been received from the Editor-in-Chief of Ecological Monographs, Jean-Philippe Lessard, nor has any been since, which I interpret to mean that he has no intention of dealing with the matter.

The only offer made in the various discussions was that I could join ESA at their standard membership rate and then make use of a policy which allows ESA members to publish one article per year, in Ecological Applications, free of charge. Although I like that journal, I rejected this proposal too, on the same basis: principle. First, neither I nor anyone else should have to join ESA to criticize the articles they publish; second, the two offending articles were published in other ESA journals, not in Ecological Applications, and pointed criticisms of them should obviously appear in the originating journals; third, if I were to join ESA I should have the right to use the free article benefit for a paper of my own origination and interest, like any other member does, not one pointing out somebody else’s incompetence.

In my letters to the two EiCs, the ESA publications office, and the ESA Executive Director, I then specifically asked for an exemption from the article fees, which I argued they could and should easily grant, and which should be ESA policy in the first place anyway. In doing so, it became clear that they in fact had no policy at all on this issue, and were apparently going to charge high article fees to anybody who wanted to call for any article retraction.

On top of all this, and after long delays (months) in waiting for various responses, the ESA Executive Director stated that I had not presented enough information in my correspondence with Debra Peters at Ecosphere to justify a retraction. What??!! I hadn’t submitted any manuscript detailing the reasons for either call for retraction. The only explanatory piece of any kind that I had sent was the brief summary of problems that I mentioned above, so that Ecosphere EiC Debra Peters would have some idea of what my arguments were. In no way was that piece ever intended to be, or presented as, a full description of the two papers’ problems, which are quite technical and involved, nor was it in any kind of manuscript form whatsoever. Either the Executive Director didn’t understand some basic facts on the issue, or else she was blowing smoke, grabbing for excuses to dismiss the matter, I don’t know.

I could not understand why the ESA wouldn’t just make a simple exemption of fees in this case, given what I had told them and the existing criticisms against multiple Baker papers. I was puzzled by this until I realized–it’s probably not their decision to make. Wiley is the actual publisher, not ESA, and Wiley, like all big publishers, is in the game for profit–they very likely set all article charges and everything else financial. This would include having no policy whatsoever in place for potential authors under financial hardship, a fact probably best understood as the product of entities living in a world in which, as far as they are concerned, everyone has a salaried academic position, a large grant, or both. The idea that someone outside the standard academic system would raise strong arguments, and not have that kind of money to throw down, probably does not fall within their awareness zone. And very possibly, the idea of having to retract any paper–much less one exhibiting evidence of possible fraud or intentional negligence, which is a definite possibility in Baker’s 2014 paper, given that he used fraudulent historical data–probably doesn’t sit well at all with the desired image.

Thus completely stymied, I then thought that I might write up my arguments and put them up on the EcoEvoRxiv preprint server in the meantime, so people could at least be aware of the issues. So I went to the ESA publications website to see if there was a policy on preprints, where I read that preprints can later be submitted for ESA publication, but only if they retain their exact, original form, without any changes. Now why should that be the policy? What if you come up with better arguments, or even just a better presentation of the original arguments? You mean to tell me that you can’t change your preprint so as to present the arguments better…or present better arguments?

Nevertheless, given ESA’s position, that option is one of the few that I have. Others are (1) to present the arguments here, or (2) to submit a manuscript to a non-ESA journal discussing these problems, but framed more generally somehow, and thus giving up on retraction altogether. I have a third strategy in mind, but in battle you never signal possible intentions to the opposition. And a battle is what this now is, without a doubt.

On top of all this, it was never made clear in the first place just how the publications involved in a retraction would work. Would my pieces in fact be published as full articles? Would the original articles also remain visible so that people could read and compare them against my arguments? What about the responses of the authors to my arguments–would they appear? Or were the journal Editors-in-Chief just going to make a decision, based on my evidence but without presenting it publicly–just suddenly placing a cryptic notice stating little more than that the papers had been retracted due to “editorial concerns”? None of these questions were answered at the ESA website, the journal websites, or in my interactions, and the entire appearance of the whole thing is that ESA just doesn’t really have any retraction policy at all. Worse still, it is also possible that the high article fees were put in place, under orders from Wiley, precisely to discourage calls for retraction.

So, these are some of the fun and games you might expect when you try to get even egregiously wrong papers retracted from prominent scientific journals, and how I’ve been relaxing since posting last.

Onward

Well, a long delay in posting obviously; many things happened that shouldn’t have, interspersed with things that didn’t happen that should have. I won’t get into that now, though some of it might come out in posts I have in mind.

One change of note is that I’m going to try to focus more on the general topic of bad science, including cases of actual or possible fraud, and with an emphasis on ecology. That’s not a huge shift, given that I already have numerous posts on the general issue. I think I’m also ready to tell some stories along those lines. And I do have some.

And with that brief intro, a new post will be up soon.

About to re-start

Hi Folks,

I’m about to re-start the blog, for those who may be interested. A few years ago I stopped rather suddenly, due to a string of incidents and other reasons. But I actually had a lot more to say at the time, and there have been a number of significant developments since then. Most likely I’m going to start with some relatively short pieces, due to time considerations, but this could quickly transform into some extensive science pieces on ecological topics that I’m dealing with now. These deal primarily, but not exclusively, with incompetence, group-think and/or fraud in science. Stay tuned.

You got a spare icebreaker you won’t need this winter?

Well it’s been a while (i.e., here and here) since I checked in on Inland Seas, the Quarterly Journal of the National Museum of the Great Lakes, an interesting cross between an academic journal and a popular magazine dating to 1945.  The journal and museum are the centerpieces of the Great Lakes Historical Society. The museum is a restored lake freighter (shown below) built in 1911 and now docked in the Maumee River in Toledo, Ohio.

The latest (Vol. 75, No. 2) issue’s first article is titled Early Maps of the Great Lakes and the First Boat Trip Across Lake Erie in 1669–a topic sure to get my attention. It turns out that no, Robert LaSalle and his 34 crewmen were not in fact the first to traverse the length of Lakes Erie, St. Clair and Huron in their famous sailing ship, the Griffon.  Rather, it was two French Jesuits in canoes ten years earlier who did so, and who even made a pretty accurate map of it, given the circumstances.  A major part of the Lake Erie traverse was done in March/April, 1670, in which they dealt  with lots of ice, high winds, swampings, hostile natives and other perils inherent in such an escapade. Great stuff there.

But my favorite part of this journal may well be the “Great Lakes News” section, wherein various events from the previous three months are briefly mentioned.  Many of these are entirely mundane.  For example, we read that on January 6, “The Walter J. McCarthy Jr. departed Duluth after spending nearly two days loading frozen [taconite] pellets” and that on January 1 the last downbound commercial vessel of the shipping season cleared the Welland Canal.

But there’s usually also some much more interesting stuff, and so it is this time.

We learn for example that on January 6, the crew of the US Coast Guard’s Mackinaw happened to witness a stray dog go through the ice in the St. Mary’s River (connecting Lakes Superior and Huron) and to see it then struggle onto an island therein. Fully 20 crewmen then went in search, but finding no dog “set a campfire on shore and left a bowl of macaroni”, apparently considerate of the fact that this dog might be a vegetarian.  A couple of days later they found the bowl empty, and the dog nearby, and then motored on over to Cheboygan where they delivered the dog to its (no-doubt surprised and thankful) owners.

But it gets better. On March 12, the US Coast Guard and several other agencies rescued 46 ice fishermen from an ice floe that broke free near Catawba Island, in SW Lake Erie.  Another 100 or so escaped either by swimming or running across existing ice bridges before the full break occurred.  The article doesn’t say it, but undoubtedly there are also a bunch more snowmobiles, ATVs and pickup trucks now resting on the bottom of western Lake Erie–the Coast Guard saves persons first, property second, if at all.  Try swimming for land in the open (ice) waters of Lake Erie in early March after you’ve had maybe a six pack (or more) and tell me how it goes. This type of event happens more frequently than you might imagine–there are a lot of rusting snowmobiles at the bottom of Lake Erie.

The Canadian CG has different issues: “…the Canadian Coast Guard’s Hero class of midshore patrol boats are suffering from extreme rolling in even moderate seas. The rolling is so severe that crew members stuff jackets under the edge of their bunks to raise them so they will not be flung to the deck while asleep. Seasickness affects many crew members…”.

Not everything of interest involves the two Coast Guards however.

For example, on February 7, in Superior WI, we read that “the horn on the docked American Spirit stuck in the ‘on’ position, disturbing the neighbors at the nearby McDonald’s and local residences.  Some hours later the horn was shut off.  The sound could be heard for miles, as it should”.  Is anything much worse than going to the local McDonald’s for morning coffee and cholesterol and getting summarily blasted out of there by a ship’s horn? I mean, other than having to listen to the music that one would otherwise be subjected to…

If you’ve been thinking (naively!) that ship fires would probably be among the easier class of possible fires to deal with, at least from a total water availability standpoint…well not so fast.  It seems that “the St. Clair caught fire at its lay-up dock in Toledo…Fire crews from Toledo had a hard time getting hydrants on the dock to work… and had to just play water on the ship and let the fire burn out…”.  Success in this case was measured in terms of  preventing other nearby ships at the port from also catching fire.

There are non-boat-related news items as well.  If you have some spare change, Ohio Governor Mike DeWine is seeking some “real money” to help combat Microcystis algal growth in Lake Erie. Dig deep now, because there’s nothing micro about the amount desired–about $1 billion (DeWine was heard to say “fight green with green” after the announcement).  Farmers in the Maumee River basin (responsible for the phosphorus loading that contributes to the algal blooms) were in turn heard to respond “I’ll be glad to show you what some green corn and soybeans look like Mike”. And so on.

Also noted is that the Canadian government is actively seeking a temporary, light duty icebreaker as a replacement for those that will be laid up for repairs this winter.  If you would happen to have a spare icebreaker that you don’t think you’ll need this winter, contact them.

Canada also takes this opportunity for a friendly seasonal reminder to always keep your stick on the ice, icebreaker or no.

Latest book review, boom, done

I’ve been remiss in the book review department lately, where “lately” starts at roughly day one of this blog. This post changes all that in a hurry.

Recently I noticed the nearly 1400 page CRC Handbook of Aqueous Solubility Data (2010) sitting nearby, reminding me that in nearly nine years I’d not in fact read any of it. So I looked at the first couple of pages, and well, like old Mission Impossible episodes, once you engage, you’re in for the duration. Bottom line: this is a great read, and as a paperweight or ballast, even greater. This tome takes no prisoners: aqueous solubilities for 4661 organic chemicals, sans break. All organic chemistry, all the time; in it to win it.

Organizationally, the book is quite something else again. The first section is engagingly titled “Solubility Data”, comprised of 4661 entries arranged in–you guessed it–numerical order. Entry number one gets right after it, by way of bromodichloromethane (or alternatively, dichlorobromomethane–experts pressed for time just say “CHBrCl2”). Within it, as with each such entry, is found all kinds of useful data, including molecular weights, boiling and melting points, and of course, solubilities (both molar and by weight!). All necessary references are also included, for the skeptical–these authors aren’t hiding anything here. A “Comments” field contains useful ancillary information, such as “recrystallized”. There is also a complex “TPEAA” evaluation, avowedly for experts only.

The next entry, (chlorodibromomethane), is fairly similar to the first. In fact, all 4661 entries are markedly similar. Exactly the same actually, just slightly different. There is no guesswork as to just what the point of this book is. The Solubility Data section occupies fully the first 700 pages of the book…as well as the last 700 pages. If you want wild topical variations from one entry to the next, try an encyclopedia, newspaper or Science magazine: this book is for chemically focused readers.

One note is that the sheer volume of material can be an issue. But this is not unexpected–people do vary in the specific organic chemicals that will hold their attention indefinitely, after all. But then–just boom–you hit an entry that snaps you right out of it. For me entry 151 provides a good example: nitroglycerin (C3H5N3O9). Talk about boom! Is it really that hard to imagine the history involved in determining the boiling point of that sucker?! Research dollars available! And that could give you a heart attack just thinking about it. However, to keep the book from exploding well beyond its 1400 pages (thus beyond a leisurely reading), the authors wisely leave such details to the reader.

Or take something like entry 4289, which most know simply as
(Methanesulfonamide, N[1′-[2-(2,1,3-benzoxadiazol-5-yl)ethyl]-3,4-dihydro-4-oxospiro[2H-1-benzopyran-2,4′-piperidin]-6-yl]-). Catchy name, sure, but even the casual reader will note that chemical 4289 has a mysterious “intrinsic” solubility listed. What the hell’s that supposed to mean, one briefly wonders aloud, before moving on to entry 4290, which happens to be one colchicine. Now colchicine is, genetically speaking, a very important chemical, capable of instantly doubling an organism’s chromosome number. Talk about explosive, this is massive, all-at-once mutation, like something out of a B-grade 1950s-era film with giant ants that take over Guam or whatever–but without any H-bombs, atolls or any of that whole scene. Oh so you were thinking that everything was all “better living through chemistry” were you? Well think again hombre. Not for the squeamish.

I would be remiss (per request of the authors) if I did not note that this book has very few weaknesses. The only one I could find was for entry #22, urea (CH4N2O), for which no boiling point is given. But think it through again: is ignorance of the boiling point of urea really a weakness…or rather an open research opportunity? I mean, who doesn’t have a stove after all? Speaking of urea and stoves, oh man, there was that time a buddy and I were climbing Mt. Rainier, and as usual it was damn cold, so we had a bottle to use to avoid those unwanted “wee-hour” excursions outside the tent…but it turns out that at breakfast we mixed it up with the water bottle…well let me tell you, that will impart flavor to the old oatmeal!

We can get into those details more later but we’re out of time for today. In summary, this book has been unduly neglected, given its importance to industrial society and the career advancement of the authors. Read it for Christ’s sake–it’s been nine years already and aqueous solubility issues really are out there.

Stay tuned for the next in the series, where we take a closer look at Dynamic elasticities and breaking points of commercial hardwoods, tentatively planned for the summer of 2024. This post is hereby concluded and the decision is final. Special mention to all those who participated with minimal regret.

Severe analytical problems in dendroclimatology, part fifteen

I’m going to give this topic another explanatory shot with some different graphics, because many still don’t grasp the serious problems inherent in trying to separate climatic signal from non-climatic noise in tree ring size. The most advanced method for attempting this is called Regional Curve Standardization, or RCS, in which ring size is averaged over a set of sampled trees according to the rings’ biological age (i.e. ring number, counting from tree center), and each individual series is then divided by this average. I include five time series graphs, successively containing more information, to try to illustrate the problem. I don’t know that I can make it any clearer than this.

First, shown below are the hypothetical series of 11 trees sampled at a single sampling location.

Each black line shows the annual ring area progression for each of 11 trees having origin dates spaced exactly 10 years apart (the bold line is just the oldest tree of the group). By using ring area as the metric we automatically remove part of the non-climatic trend, namely the purely geometric (inverse quadratic) effect in each series. Any remaining variation is then entirely biological, and it exhibits a very standard tree growth pattern, one in which growth rate increases to a maximum value reached relatively early in life (here, around age 80 or so) and then declines more slowly toward a stable asymptote, which I fix at 1.0. Each tree’s trajectory occurs in a constant climate over the 300-400 year period measured.

The next figure adds two components:

First, the blue line represents a constantly increasing climatic parameter over time, say temperature, expressed as a ratio of its effect on ring size at year 0. Thus, at year 400, the cumulative climatic effect on ring area, regardless of biological age, is exactly 3-fold of its year zero value (scale at right). The second addition is the series of red lines, which simply represent those same 11 trees’ growth trajectories growing under this climate trend. The climatic effect on growth is a super simple linear ramp in all cases–I am not invoking any kind of problematic, complex growth response (e.g. “divergence”), or any other complication. Thus, by definition, if we divide the two corresponding ring series for each tree, we get exactly the blue line, in all cases.

In the third figure:

I add a green line–this is the estimated RCS curve, computed the standard way (by aligning each tree according to its biological age and then averaging the ring sizes over all trees). This RCS curve is thus the estimated non-climatic ring size variation, which we accordingly remove from each tree by dividing the red growth series by it. Finally, we average the resulting 11 index series, over each of the 400 years, giving the stated goal: the estimated climatic time series.

It is at first glance entirely clear that the green RCS curve does not even come close to matching any of the black curves representing the true non-climatic variation…which it must. According to standard dendroclimatological practice we would now divide the 11 red curves by this green RCS curve–which is thereby guaranteed not to return the true climatic signal. So what will it return?

It returns the orange line shown above. No, that’s not a mistake: it will return an estimated climatic trend of zero.

And this is the entire point–the supposedly most advanced tree ring detrending method is fully incapable of returning the real climatic trend when one exists. Note that I’m keeping everything very simple here–this result does not depend on: (1) either the direction or magnitude of the true trend, or (2) the magnitude, or shape, of the non-climatic trend in the sampled trees (including the case of no such trend whatsoever). That is, this type and magnitude of result is not specific to the situation I set up. The problem can be reduced, but never eliminated, by increasing the variance in tree ages in the sample. But since standard field sampling practice is to sample the oldest possible trees at a site, such variance is very rare, a fact which the data of the International Tree-Ring Data Bank (ITRDB) show clearly–which is ironic given that Keith Briffa and Ed Cook mentioned the importance of exactly this issue in a white paper available at the ITRDB site.

Lastly, suppose now that the last usable year for all ring series occurred a few decades ago. This will occur, for example, because many ITRDB field samples were collected decades ago now, or because of perceived problems in the climate-to-ring response calibration function, which must be stable and dependable (notably, the “divergence” effect, in which linear relationships between climate and ring size break down, badly). What will be the result of eliminating, say, the last five decades of data and replacing them with instrumental data? Well, you will then get exactly this:

Look familiar? Does that look like anything remotely approaching success to you? Again, I have not even broached other possibly confounding problems, such as co-varying growth determinants (e.g. increasing CO2- or N-fertilization, changing soil moistures, or inter-tree competition), nor non-linear responses in the calibration function, nor any of the thorny issues in large-scale sampling strategies, reconstructions and their corresponding data analysis methods. Those things would all exacerbate the problem, not improve it. It’s a total analytical mess–beginning and end of story.

I can’t make it any clearer than this. And yes I have the R code that generated these data if you want to see it.
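Here is a stripped-down sketch of that simulation–not the exact code behind the figures above (the growth-curve formula and plotting are simplified stand-ins), but the same logical structure: a common age-related growth curve, a known 3-fold linear climate ramp, RCS detrending by biological age, and the resulting chronology.

    # 11 trees with origins 10 years apart, growing under a linear climate ramp
    origins <- seq(0, 100, by = 10)
    years   <- 1:400
    g  <- function(age) 1 + 1.5 * (age / 80) * exp(1 - age / 80)  # age effect: peak near 80, asymptote 1.0
    cl <- function(yr)  1 + 2 * yr / 400                          # climate: 1x at year 0, 3x at year 400

    rings <- sapply(origins, function(o) ifelse(years > o, g(years - o) * cl(years), NA))
    ages  <- sapply(origins, function(o) years - o)               # biological age of each ring

    # RCS curve: align rings by biological age and average across trees
    rcs <- sapply(1:max(ages), function(a) mean(rings[ages == a], na.rm = TRUE))

    # detrend: divide each series by the RCS value at its age, then average by calendar year
    denom <- ages; denom[] <- NA
    denom[ages >= 1] <- rcs[ages[ages >= 1]]
    chron <- rowMeans(rings / denom, na.rm = TRUE)

    plot(years, cl(years), type = "l", col = "blue", ylim = c(0, 3.2),
         xlab = "year", ylab = "ratio")     # the true climatic signal (3-fold increase)
    lines(years, chron, col = "orange")     # the RCS-recovered "signal": essentially flat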

I’ve Been Converted

Some Easter slide guitar, courtesy of one of the true masters thereof, Kelly Joe Phelps.

Well I know, yeah I know…that I’ve been converted…
Now do you?

You’ve got to know sir
That I’ve made me a change
That I’m not afraid to call my Savior’s name
Well I know…that I’ve been converted now…
Do you?

Kelly Joe Phelps, I’ve Been Converted

There may still be some time

I just closed my eyes again–
Climbed aboard the Dream Weaver train
Trying to take away my worries of today
And leave tomorrow behind

Fly me high through the starry skies
Take me to an astral plane
Across the highways of fantasy
Help me to forget today’s pain

Though the dawn may be coming soon
There may still be some time
Fly me away to the bright side of the moon
And meet me on the other side

Gary Wright, Dream Weaver

Harlan

I don’t know his name for sure, but I think somebody said “Hi Harlan” to him from their car stopped at an intersection. He is well known in those parts.

I’ve seen him for months on the streets, and since I’ve been here less than a year, I’d guess he’s been there much longer. He looks to be in his 60s, and walks with a single crutch at all times, such that you can hear him coming even when you don’t see him. He accosts almost every passer-by with a much garbled “Could you spare a dollar, I’m trying to get something to eat”. He seems to be reasonably successful, based on the number of people I see stopped with him, and as far as I can tell, he actually uses the money for food, not alcohol (a big issue with street beggars). He has, for all I can tell, not a soul in this world to count on for anything. I have no idea where he stays at night.

When I first saw him last summer he acted as described above. I had not seen him throughout the winter, yesterday being the first time I’d ventured out on the streets in that area. Harlan was still begging but now also yelling intermittently–incoherent phrases aimed at nobody in particular, and with gusto.

It was still chilly out, but I found a nice spot in the afternoon sun and set down my guitar and amp and plugged in and set up. I could hear Harlan coming down the sidewalk, and he pointed at my stuff and mumbled something incoherent that appeared to involve some danger of being arrested by the cops or something, I’m not sure. I replied “OK man” and continued with what I was doing, and Harlan moved on.

There was hardly anybody on the streets, but it was downright comfortable for the first time in months, so I started playing, for the practice if nothing else. I soon noticed a guy off to my right 10 yards, smoking a cigarette and listening. A few minutes later he came up and said he had no money with him but if he did he’d give me some, because I sounded great, quite similar to Pat Metheny. He was a bassist; he knew music and paid me other generous compliments. I replied no problem man, being compared favorably to Pat Metheny is worth more to me than a dollar or two. He left but came back soon with two friends, threw in a tip, and we all talked briefly about guitar favorites: Leo Kottke, Ry Cooder, and Chris Smither in particular. I convinced them that yes, they really should see Chris when he comes to town in May. And Leo Kottke’s song “I Yell at Traffic” came to my mind.

I resumed playing and a little later, Harlan came around the corner, yelling, stopped for a minute, and then sat down on some restaurant steps a few feet away. He stopped yelling and muttering to himself. He just sat there, listening. Sensing this, I broke into a slow and deliberate rendition of arguably the most beautiful song I know, Bob Dylan’s Visions of Johanna, a waltz which I do in the key of A. Harlan continued to sit in the warm sun, gazing into the distance and listening, the sound reverberating through the street. He stayed through another piece, before getting up and continuing his march. Hopefully it was a few minutes of beauty and solace in an otherwise desperate existence.

If I see him again, I’m going to do it again, except that I’m going to try to play the best thing my fingers will generate.

Experts only

So, the IPCC has produced a special report on the issue of limiting the global temperature increase to 1.5 degrees C. This report is still open for comments for another 13 days…if you are an “expert” in the IPCC’s eyes. And what if you are not? Well if you’re American, you could still have commented, for a 30 day period that ended last week (Feb. 8), through a commenting system run by the United States Global Change Research Program (USGCRP)…assuming you actually knew about it.  And that latter issue is the topic of this post.

All IPCC report drafts are open to expert review, internationally, through a system the IPCC operates. In that system, you apply to be a reviewer by submitting your name and qualifications, which basically involves stating your expertise, including your degree and a list of up to five publications that demonstrate it. Then IPCC-associated folks say yes or no to your request.

But IPCC reports are also open to comments by national governments. The United States of course does so, the USGCRP administering this process.  But unlike the IPCC process, the USGCRP solicits comments from… anybody.  The notifications for these comment periods are required by law to be posted in the Federal Register, and the notice also appears on a USGCRP web page (corresponding links here and here; screenshots for the two below).
[Screenshot: Federal Register notice]

[Screenshot: USGCRP web page notice]

At least for this report, the USGCRP also posted four Twitter notices, on January 16, 24, 29 and February 5, all identical.  Why they waited six days before the first notice I don’t know. Below is the Jan. 24 notice.

[Screenshot: USGCRP Twitter notice, Jan. 24]

You still have to register, but in that process you just select the category from a drop-down list that best describes your status, in one of five broad categories, screenshot below:
[Screenshot: USGCRP registration screen]

I now encourage you to read the Federal Register notice linked to above. Notice exactly what it says. Specifically, even though the process is open to everyone, the entire notice, including the title (“Call for Expert Reviewers…”) is framed in the language of “expert” reviewer, the crux of which reads as follows:

As part of the U.S. Government Review, starting on 8 January 2018, experts wishing to contribute to the U.S. Government review are encouraged to register via the USGCRP Review and Comment System (https://review.globalchange.gov/)… The USGCRP coordination office will compile U.S. expert comments and submit to the IPCC, on behalf of the Department of State, by the prescribed deadline. U.S. experts have the opportunity to submit properly formatted comments via the USGCRP Review and Comment System (https://review.globalchange.gov/) from 8 January to 8 February 2018. To be considered for inclusion in the U.S. Government submission, comments must be received by 8 February 2018.

Experts may choose to provide comments directly through the IPCC’s Expert Review process, which occurs in parallel with the U.S. Government Review. Registration opened on 15 December 2017, and runs through 18 February 2018: https://www.ipcc.ch/apps/comments/sr15/sod/register.php

The Government and Expert Review of the IPCC Special Report on Global Warming of 1.5 °C ends February 25, 2018.

Do you see any indication anywhere in any of it, that indicates that the commenting process is in fact open to the general citizens of the United States? I don’t. This is in fact only apparent when you actually go to the USGCRP Review and Comment page, and attempt to register, per the screen shot above. To say nothing of the fact that experts using the IPCC’s review system have 90 days to comment whereas those using the USGCRP’s have only 30.

OK, so then one day ~two weeks ago I was wasting my time and energy, which is to say I was reading Twitter comments, and I noticed a climate scientist, Katharine Hayhoe, relay a message inviting “colleagues” to comment on the IPCC report (original comment here). In response, a climate activist, Steve Bloom, asked her directly (paraphrasing) “And what about people like me?”, meaning non-academics (and non-experts to the IPCC).

This conversation immediately went downhill, but the bottom line in this context is that Hayhoe either (1) had no idea that all Americans still had nearly another two weeks or so to comment on the report, or (2) she did know but didn’t tell him. I have no evidence for believing the latter, and so the logical conclusion is the former. I didn’t see the exchange until a few days later, but when I did I jumped in to alert everyone that yes indeed, any American citizen could still comment for another week or so. I also directly criticized Hayhoe for not knowing this, given that she was a lead author on a chapter of another report, the National Climate Assessment #4 that just went through the USGCRP review process. But after seeing how the USGCRP phrases their official notices (and Tweets) regarding their review process, I can surely see why she might not have known.

Hayhoe, who won the AGU’s “Climate Communication” award four years ago (with its $10,000 prize) made no response whatsoever to my comments—she simply blocked me on Twitter, meaning I can no longer read any of her comments there. No acknowledgement of the USGCRP process, no apology to Bloom, nothing. Her main comment in the process was to tell Bloom not to talk disrespectfully to climate scientists, adding that he’d been warned before, screen shot below.
[Screenshot: Hayhoe Twitter comments]
As for Steve Bloom–no, no response from him either. The only person to comment at all on what I said was Richard Betts, a UK climate scientist, who stated that it was interesting to learn that the United States allowed all citizens to comment on IPCC reports. Maybe the United States, unlike the IPCC, understands that having something important to say is not limited to “experts”, whatever the latter entails exactly. Volumes could be written on that topic alone, but that’s not for the here and now.

So, this is just one example of the kind of thing we’re dealing with in the whole climate change public outreach circus, or tragedy, whichever it is. But it’s one thing if it’s just an entertaining circus, and another thing altogether if your so-called “climate communicators” can’t communicate crucial facts about the public interaction process.

One hundred years of NHL hockey; some analysis

This post has been updated, with corrected data and modified discussion, as detailed in the text.

Does anything say “100 Years of the National Hockey League” like, say, a Tampa Bay vs Vegas matchup?  Montreal, Toronto, Ottawa?  Please; bunch of party crashers, them.

In case you missed it, National Hockey League play is, today, exactly 100 years old. On December 19, 1917 the first two games in the new league had the Toronto Arenas at the Montreal Wanderers, and the Montreal Canadiens at the Ottawa Senators. This limited slate was due in large part to those being the only four teams in the league.  It turns out that the Wanderers got their first and only win in franchise history, which lasted just six games. They got past the Arenas in the common hockey score of 10-9. The Arenas, conversely, went on–along with the Canadiens–to become one of the two most storied franchises in NHL history: today’s Toronto Maple Leafs. The Senators’ first incarnation lasted until 1934, and after a 58 year absence came the second (and current) version in 1992.

So anyway, there’s hype and hoopla happening, and also discussions of the greatest seasons, teams, players, etc. As for me, I thought it would be great fun to crunch 90 years of team-season numbers to see what they indicated about team records, actual versus expected. Two minutes for tripping, and without even inhaling anything.


You would not think, just to look at him…

So yesterday I was riding the bus, which I only do when I need to tote both my guitar and amp downtown. The two-plus miles is just a little too far for the ~60 pound carry, especially given an injured shoulder and wrist, and sidewalks that are a mess from a foot of snow last week.

A couple of stops after boarding, on stepped a guy with a very tattered beige coat, like something that might have been involved in, say, some street fights, or used as a dog’s bed. He was dragging a heavy-looking plastic bag full of recyclables, and he sat down next to me.

“Play the guitar, eh?”
“Yeah” says I.
“What kind of stuff you like?”
“Acoustic 12…a lot of Bob Dylan, but also John Gorka, Chris Smither, Greg Brown, the Dead, some Zeppelin…some of my own stuff too.”

In the five minutes before he got off, he told me the abridged version of how he once played a lot, both guitar and keyboards, apparently as a professional musician, including a lot of local shows at various venues with a band that was busy and popular, mostly back in the 1980s and 90s I gathered. He said he made good money at it and even shared the bill with some well-known bands/acts, such as Mitch Ryder, Steppenwolf, and (I think) Dave Alvin’s band (The Blasters?). He told me about how Ryder once got quite upset with him, when his band was supposed to be opening Ryder’s show but he was instead drunk in a local bar, having completely forgotten about it. His band mates had to track him down, and the resulting delay caused Ryder to open for him, instead of vice-versa. He smiled at the memory.

I asked him if he was still performing or playing. He talked for a couple of minutes–about how that’s all gone now. He lives on disability and food stamps, supplemented, I guess, by whatever he gets from collecting and hauling recyclables via foot and bus, and street begging, which he said he makes some money at.

“Yeah, I could…but damn alcohol…” he said.

As he got up to get off, I invited him to bring his guitar and we could jam together on the street. He said that would be cool, and would do so. There wasn’t time to get his name or number.  Hope I see him again.

Now you would not think, just to look at him
But he was famous long ago
For playing the electric violin
On Desolation Row

 

WAR, Pythagoras, Poisson and Skellam

Getting into some issues only makes you wish that you hadn’t, when you realize how messed up they are, at a fundamental level.

Here’s a great example involving statistical analysis, as applied to win/loss (“WL”) records of sports teams, the base concept of which is that it’s possible to estimate what a team’s WL record “should” have been, based on the number of goals/runs/points that it scored, and allowed, over a defined number of games (typically, a full season or more). This blog post by Bill James partially motivates my thoughts here.

Just where and when this basic idea originated I’m not 100 percent sure, but it appears to have been James, three to four decades ago, under the name “Pythagorean Expectation” (PE). Bill James, if you don’t know, is the originator, and/or popularizer, of a number of statistical methods or approaches applied to baseball data, which launched the so-called “SABR-metric” baseball analysis movement (SABR = Society for American Baseball Research). He is basically that movement’s founder.

In the linked post above, James uses the recent American League MVP votes for Jose Altuve and Aaron Judge, to make some great points regarding the merit of WAR (Wins Above Replacement), arguably the most popular of the many SABR-metric variables. The legitimacy of WAR is an involved topic on which much virtual ink has been spilled, but is not my focus here; in brief, it tries to estimate the contribution each player makes to his team’s WL record. In the article, James takes pointed exception to how WAR is used (by some, who argue based upon it, that the two players were basically about equally valuable in 2017). In the actual MVP vote, Altuve won by a landslide, and James agrees with the voters’ judgement (pun intended): WAR is flawed in evaluating true player worth in this context. Note that numerous problems have been identified with WAR, but James is bringing a new and serious one, and from a position of authority.

One of James’ main arguments involves inappropriate use of the PE, specifically that the “expected” number of wins by a team is quite irrelevant–it’s the *actual* number that matters when assessing any given player’s contribution to it. For the 2017 season, the PE estimates that Judge’s team, the New York Yankers, “should” have gone 101-61, instead of their actual 91-71, and thus in turn, every Yanker player is getting some additional proportion of those ten extra, imaginary wins, added to his seasonal WAR estimate. For Altuve’s team, the Houston Astros, that’s not an issue because their actual and PE WL records were identical (both 101-61). The WAR-mongers, and most self identified SABR-metricians for that matter, automatically then conclude that a team like this year’s Yanks were “unlucky”: they should have won 101 games, but doggone lady luck was against ’em in distributing their runs scored (and allowed) across their 162 games…such that they only won 91 instead. Other league teams balance the overall ledger by being luck beneficiaries–if not outright pretenders. There are major problems with this whole mode of thought, some of which James rips in his essay, correctly IMO.

But one additional major problem here is that James started the PE craze to begin with, and neither he, nor anybody else who has subsequently modified or used it, seems to understand the problems inherent in that metric. James instead addresses issues in the application of the PE as an input to the metric (WAR) that he takes issue with, not the legitimacy of the PE itself. Well, there are in fact several issues with the PE, ones that collectively illustrate important issues in statistical philosophy and practice. If you’re going to criticize, start at the root, not the branches.

The issue is one of statistical methodology, and the name of the metric is itself a big clue–it was chosen because the PE formula is similar to the Pythagorean theorem of geometry: A^2 + B^2 = C^2, where A, B and C are the three sides of a right triangle. The original (James) PE equation was: W = S^2 / (S^2 + A^2), where W = winning percentage, S = total runs scored and A = total runs allowed, summed over all the teams in a league, over one or more seasons. That is, it supposedly mimicked the ratio of squared lengths between one side, and the hypotenuse, of a right triangle. Just how James came to this structural form, and parameter values, I don’t know and likely very few besides James himself do; presumably the details are in one of his annual Baseball Abstracts from 1977 to 1988, since he doesn’t discuss the issue that I can see, in either of his “Historical Baseball Abstract” books. Perhaps he thought that runs scored and allowed were fully independent of each other, orthogonal, like the two sides of a right triangle. I don’t know.

It seems to me very likely that James derived his equation via fitting various curves to some empirical data set, although it is possible he was operating from some (unknown) theoretical basis. Others who followed him, and supposedly “improved” the metric’s accuracy, definitely fitted curves to data, since all parameters (exponents) were lowered to values (e.g. 1.81) for which no theoretical basis is even conceivable: show me the theoretical basis for anything that scales according to the ratio of one component of a sum to the sum itself, each raised to the power 1.81. The current PE incarnation (claimed as the definitive word on the matter by some) has the exponents themselves as variables, dependent on the so-called “run environment”, the total number of runs scored and allowed per game. Thus, the exponents for any given season are estimated by R^0.285, where R is the average number of runs scored per game (both teams) over all games of a season.
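Just to make the two forms concrete, here is what they look like coded up in R, using made-up season totals (the run values below are hypothetical, not any particular team’s):

    # Pythagorean expectation: original fixed-exponent form vs. the
    # variable-exponent form described above. Totals are hypothetical.
    S <- 820      # runs scored over the season
    A <- 700      # runs allowed
    G <- 162      # games played

    pe_fixed <- S^2 / (S^2 + A^2)                # James' original form
    x        <- ((S + A) / G)^0.285              # exponent from the run environment
    pe_var   <- S^x / (S^x + A^x)                # variable-exponent form

    round(c(fixed = pe_fixed, variable = pe_var) * G)   # expected wins under each form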

Even assuming that James did in fact try to base his PE on theory somehow, he didn’t do it right, and that’s a big problem, because there is in fact a very definite theoretical basis for exactly this type of problem…but one never followed, and apparently never even recognized, by SABR-metricians. At least I’ve seen no discussion of it anywhere, and I’ve read my share of baseball analytics essays. Instead, it’s an example of the curve-fitting mentality that is utterly ubiquitous among them. (I have seen some theoretically driven analytics in baseball, but mostly as applied to ball velocity and trajectory off the bat, as predicted from, e.g., bat and ball elasticity, temperature, launch angle, etc., and also the analysis of bat breakage, a big problem a few years back. And these were by Alan Nathan, an actual physicist.)

Much of science, especially non-experimental science, involves estimating relationships from empirical data. And there’s good reason for that–most natural systems are complex, and often one simply does not know, quantitatively and a priori, the fundamental operating relationships upon which to build a theory, much less how those interact with each other in complex ways at the time and space scales of interest. Therefore one tries instead to estimate those relationships by fitting models to empirical data–often some type of regression model, but not necessarily. It goes without saying that since the system is complex, you can only hope to detect some part of the full signal from the noise, often just one component of it. It’s an inverse, or inferential, approach, as opposed to forward modeling driven by theory; these are the two opposing approaches to understanding a system.

On those (rare) occasions when you do have a system amenable to theoretical analysis…well you dang well better do so. Geneticists know this: they don’t ignore binomial/multinomial models, in favor of curve fitting, to estimate likely nuclear transmission genetic processes in diploid population genetics and inheritance. That would be entirely stupid, given that we know for sure that diploid chromosomes conform to a binomial process during meiosis the vast majority of the time. We understand the underlying driving process–it’s simple and ubiquitous.

The binomial must be about the simplest possible stochastic model…but the Poisson isn’t too far behind. The Poisson predicts the expected distribution of the occurrence of discrete events in a set of sample units, given knowledge of the average occurrence rate determined over the full set thereof. It is in fact exactly the appropriate model for predicting the per-game distribution of runs/goals scored (and allowed), in sports such as baseball, hockey, golf, soccer, lacrosse, etc. (i.e. sports in which scoring is integer-valued and all scoring events are positive and of equal value).

To start with, the Poisson model can test a wider variety of hypotheses. The PE can only predict a team’s WL record, whereas the Poisson can test whether or not a team’s actual runs scored (and allowed) distributions follow expectation. To the extent that they do, that is corresponding evidence of true randomness generating the variance in scores across games. This in turn means that the run scoring (or allowing) process is stationary, i.e., it is governed by an unchanging set of drivers. Conversely, if the observed distributions differ significantly from expectation, that’s corresponding evidence that those drivers are not stationary, meaning that teams’ inherent ability to score (and/or allow) runs is dynamic–it changes over time (i.e. between games). That’s an important piece of knowledge in and of itself.

But the primary question of interest here involves the WL record and its relationship to runs scored and allowed. If a team’s runs scored and allowed both closely follow Poisson expectations, then prediction of the WL record follows from theory. Specifically, the distribution of differences between two Poisson distributions follows the Skellam distribution, described by the British statistician J.G. Skellam in the 1950s as part of his extensive work on point processes. That is, the Skellam directly predicts the WL record whenever the Poisson assumptions are satisfied. However, even if a team’s run distribution deviates significantly from Poisson expectation, it is still possible to accurately estimate the expected WL record by simply resampling–drawing randomly several thousand times from the observed distributions–allowing computers to do what they’re really good at. [Note that in low scoring sports like hockey and baseball, many ties will be predicted, and sports differ greatly in how they break ties at the end of regulation play. The National Hockey League and Major League Baseball vary greatly in this respect, especially now that NHL ties can be decided by shoot-out, which is a completely different process than regulation play. In either case, it’s necessary to identify games that are tied at the end of regulation.]
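A minimal sketch of both routes in R, using hypothetical per-game scoring rates (and a crude 50/50 split of predicted regulation ties, purely to keep the example simple), looks something like this:

    # Expected wins from per-game scoring rates, two ways:
    # (1) the Skellam distribution (exact under Poisson assumptions), and
    # (2) resampling, which also works when the observed distributions aren't Poisson.
    lam_f <- 5.1     # mean runs/goals scored per game (hypothetical)
    lam_a <- 4.3     # mean runs/goals allowed per game (hypothetical)
    G     <- 162

    # Skellam: P(scored - allowed = d), as the difference of two Poissons
    k <- -30:30
    p_diff <- sapply(k, function(d) {
      s <- max(0, d):60                              # possible scores giving difference d
      sum(dpois(s, lam_f) * dpois(s - d, lam_a))
    })
    p_win <- sum(p_diff[k > 0]); p_tie <- sum(p_diff[k == 0])

    # Resampling equivalent (here drawing from Poissons; with real data you would
    # draw from the observed per-game run distributions instead)
    set.seed(1)
    d_sim <- rpois(1e5, lam_f) - rpois(1e5, lam_a)

    round(c(skellam  = G * (p_win + p_tie / 2),
            resample = G * (mean(d_sim > 0) + mean(d_sim == 0) / 2)))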

If instead you take an empirical data set and fit some equation to those data–any equation, no matter how good the fit–you run the risk of committing a very big error indeed, one of the biggest you can in fact make. Specifically, if the data do in fact deviate from Poisson expectation, i.e. non-stationary processes are operating, you will mistake your data-fitted model for the true expectation–the baseline reference point from which to assess random variation. Show me a bigger error you can make than that one–it will affect every conclusion you subsequently come to. So, if you want to assess how “lucky” a team was with its WL record, relative to runs scored and allowed, don’t do that. And don’t get me started on use of the term “luck” in SABR-metrics, when what they really mean is chance, or stochastic, variation. The conflation of such terms, in sports that very clearly involve heavy doses of both skill and chance, is a fairly flagrant violation of the whole point of language. James is quite right in pointing this out.

I was originally hoping to get into some data analysis to demonstrate the above points but that will have to wait–the underlying statistical concepts needed to be discussed first and that’s all I have time for right now. Rest assured that it’s not hard to analyze the relevant data in R (but it can be a time-consuming pain to obtain and properly format it).

I would also like to remind everyone to try to lay off high fastballs, keep your stick on the ice, and stay tuned to this channel for further fascinating discussions of all kinds.  Remember that Tuesdays are dollar dog night, but also that we discontinued 10 cent beer night 40 years ago, given the results.