Karl et al 2015, updated analysis

Update: As mentioned in the piece below, I had suspicions that the data at the NOAA ERSST v4 webpage were not the correct (version 4) data, and it turns out that this is in fact the case. I learned this after Gavin Schmidt of NASA forwarded me data he had been sent directly from NOAA personnel. So, I need to do the analysis below again with the version 4 data, when I get time. Needless to say, NOAA needs to put correct data on their web pages, especially when it forms the basis of a paper making this magnitude of a claim. It’s also telling, and depressing, that of the large number of scientists commenting on this paper, none had apparently gone to get the actual data with which to check Karl et al’s analysis. If they had, they’d have noticed the discrepancy against what Karl et al report.
End Update

And the situation’s worse than I had first thought. Unfortunately.

As I mentioned in the update to the previous post, Gavin Schmidt of NASA brought to my attention on Twitter the fact that you cannot average trend rates obtained via Ordinary Least Squares (OLS) over two consecutive time periods and assume that the result will equal the trend computed over the combined period. I was confused by this at first, but Gavin made a plot showing exactly what the issue is:

[Figure: OLS trend averaging]

Nothing forces the two trend lines to be continuous at the junction of the two intervals, and a big jump or drop in the data at or near that junction, with suitably short intervals, will make the averaged rate differ from the rate over the combined period. A slight variation of this is just what I did in comparing trend rates over the time intervals that Karl et al discussed. I did so because they did not include the rates, in their Table S1, for all of the intervals they discussed, so I had to make algebraic estimates. [Note that Table S1 is in fact a big problem, as it gives a hodgepodge of time periods that do not in fact allow one to make a direct and clear comparison of rates between consecutive time intervals, the very point of the paper. And their Figure 1 and discussion do not help.]
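To make the averaging problem concrete, here is a minimal R sketch on a made-up series (not the NOAA data): two perfectly flat segments separated by a step at the junction. The helper name `ols_slope` and the synthetic values are assumptions for illustration only. Each sub-interval has an OLS slope of essentially zero, so their average is zero, yet the OLS slope over the combined interval is clearly positive.

```r
# Synthetic series (made up for illustration; not the NOAA data):
# two flat segments separated by a step change at the junction.
year <- 1:20
temp <- c(rep(0, 10), rep(1, 10))

# Hypothetical helper: OLS slope of y regressed on x
ols_slope <- function(x, y) unname(coef(lm(y ~ x))[2])

slope_1   <- ols_slope(year[1:10],  temp[1:10])   # first interval
slope_2   <- ols_slope(year[11:20], temp[11:20])  # second interval
slope_all <- ols_slope(year, temp)                # combined interval

avg_slope <- mean(c(slope_1, slope_2))  # average of the two sub-interval slopes
print(c(average = avg_slope, combined = slope_all))
```

The step at the junction is invisible to either sub-interval fit but dominates the combined fit, which is the same mechanism that a big jump or drop near a breakpoint year would trigger in real data.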

So, here I compute and report the OLS-derived trend estimates for Karl et al.’s variously defined intervals, and some others they didn’t discuss, using the new NOAA data directly. These intervals are defined by: (1) four start dates (1911, 1931, 1951, 1971), (2) two breakpoint years (1998 and 2000), and (3) two end years (2012 and 2014); this creates 4 x 2 x 2 = 16 possible pairs of time intervals. For each pair I compute the OLS-derived trend (degrees C per decade) for the two consecutive intervals, and the ratio between them, as the second (“slowdown”) over the first (pre-“slowdown”). [Note that the mismatch issue arises because an OLS-derived straight line is the “stiffest” possible model one can fit to a time series. Using something more flexible, like lowess or a polynomial model, would lessen this problem, but then you run into other issues. Since OLS is “BLUE” (Best Linear Unbiased Estimator), and minimizes the possibility of over-fitting the data (very important here!), and since Karl et al used it, I stick with it here. Also, I am only presenting data on the trend estimates themselves, not on their probabilities relative to a null, or any other possible, model. Doing that is trickier and more time consuming than people might think, and takes time I don’t have right now.]

First, the new NOAA annual GMST data, from 1880 on, look like this:

[Figure: time series plot of the annual GMST data, 1880 on]

Second, the trend analysis results:

   Start1 End1 Start2 End2 Slope1 Slope2 Ratio
2    1911 1997   1998 2012 0.0685 0.0342 0.499
3    1911 1997   1998 2014 0.0685 0.0560 0.817
4    1911 1999   2000 2012 0.0719 0.0390 0.542
5    1911 1999   2000 2014 0.0719 0.0649 0.903
6    1931 1997   1998 2012 0.0588 0.0342 0.582
7    1931 1997   1998 2014 0.0588 0.0560 0.952
8    1931 1999   2000 2012 0.0650 0.0390 0.600
9    1931 1999   2000 2014 0.0650 0.0649 1.000
10   1951 1997   1998 2012 0.0938 0.0342 0.365
11   1951 1997   1998 2014 0.0938 0.0560 0.597
12   1951 1999   2000 2012 0.1026 0.0390 0.380
13   1951 1999   2000 2014 0.1026 0.0649 0.633
14   1971 1997   1998 2012 0.1584 0.0342 0.216
15   1971 1997   1998 2014 0.1584 0.0560 0.354
16   1971 1999   2000 2012 0.1714 0.0390 0.227
17   1971 1999   2000 2014 0.1714 0.0649 0.379

The table shows several important results. First, there is only one case in which the slope of the slowdown period is as large as that of the pre-slowdown period, given by the eighth row of the table (row label 9: start = 1931, break = 2000, end = 2014). In two other cases the ratio exceeds 0.90 (rows 5 and 7) and in one more case it exceeds 0.8 (row 3). For the 1951 start date most emphasized by Karl et al (rows 10-13), no ratio higher than 0.63 results, no matter how the intervals are defined. The mean and median of the 16 ratios are both about 0.56, so there’s no skew issue.
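The mean and median quoted above can be checked directly from the Ratio column of the table. A minimal R sketch, with the 16 values typed in from the table in row order (the variable name `ratios` is just for illustration):

```r
# Ratio column of the table, in row order (labels 2 through 17)
ratios <- c(0.499, 0.817, 0.542, 0.903, 0.582, 0.952, 0.600, 1.000,
            0.365, 0.597, 0.380, 0.633, 0.216, 0.354, 0.227, 0.379)

ratio_mean   <- mean(ratios)
ratio_median <- median(ratios)
print(round(c(mean = ratio_mean, median = ratio_median), 3))

# Number of cases in which the second-interval slope is at least
# as large as the first-interval slope
n_no_slowdown <- sum(ratios >= 1)
```

Because the table values are rounded to three decimals, the results can differ slightly from those computed on the unrounded slopes.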

If one’s goal is to compare the data against the trends and discussions given in the AR5, that is, to assess the effects of the bias reductions in the new NOAA data on possible rate changes, then 2012 is of course the end year to use. For the two possible break years, those ratios are given in rows 10 and 12, as 0.365 and 0.380 respectively: the “slowdown” interval rates are thus ~ 35 to 40 percent of the pre-slowdown rates. In the previous post, in which I had to estimate the trend rate in the first interval algebraically, I got estimates of about 65-68%. So…the problem’s actually worse than I first estimated, and by quite a bit, 35-40% instead of 65-68%. And there are some serious discrepancies in my trend values vs theirs. For example, Karl et al report the trend from 2000 to 2014 as 0.116; I get 0.0649. For 1998-2012 they report 0.086 and I get 0.034. The numbers I’m getting are much closer to their reported “old” (bias uncorrected) values, than to their “new” ones, but they don’t exactly match those either. And the data link names provided by NOAA at their ERSSTv4 page do not seem to refer to version 4. So, some problems here.

If one’s goal is instead to assess what the combined effects of the bias reductions and the extra two years of data are, relative to the general question of whether the rate of global warming has slowed over the last ~1.5 decades or not, then rows 11 and 13, which go to 2014, give higher estimates: 60 to 63 percent post/pre. So those two years of extra data heavily affect the answer to that general question. This same basic story holds true for other start dates. And the further back you push the start date, the smaller the rate difference between the two periods, for any given choice of end and breakpoint dates.

Again, no significance testing here. It is important to remember though, that each year’s GMST value derives from many hundreds of data points, even if reduced to an effective size by accounting for autocorrelation, which certainly exists. I much doubt that any except the very highest ratios given here would prove to be statistically insignificant if the trend rates were computed from those raw data instead of from the annual means. Even using the annual means, many of them would still not prove insignificant.
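For readers unfamiliar with the "effective size" idea, here is a rough R sketch of one common approximation. It is not part of the analysis in this post, and the AR(1) assumption, the helper name `n_eff_ar1`, and the synthetic series are all assumptions for illustration. Under an AR(1) model with lag-1 autocorrelation r1, a series of n values is often treated as carrying roughly n(1 - r1)/(1 + r1) independent values.

```r
# Hypothetical helper: effective sample size under an AR(1) approximation,
# n_eff = n * (1 - r1) / (1 + r1), where r1 is the lag-1 autocorrelation.
n_eff_ar1 <- function(x) {
  r1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]  # element 1 is lag 0
  length(x) * (1 - r1) / (1 + r1)
}

# Synthetic, positively autocorrelated series (made up; not the NOAA data)
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

n_eff <- n_eff_ar1(x)
print(c(n = length(x), n_eff = n_eff))
```

In a real significance test one would more likely apply this to the residuals of the fitted trend rather than to the series itself, and other corrections exist; this is only meant to show why positive autocorrelation shrinks the effective number of data points.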

So, these results seem to make the principal conclusion of Karl et al, that there has been no slowdown in the rate of global warming, even more suspect than I reported in the previous post. The bias reductions and extra years’ data reduce the differences, but they most certainly do not eliminate them.

The R code for the analysis is below.

## 1. Get and read data
# The ERSSTv4 GMST data from 1880 on is at: http://www1.ncdc.noaa.gov/pub/data/mlost/operational/products/aravg.ann.land_ocean.90S.90N.v3.5.4.201504.asc
# Download data and set appropriate working directory
T = read.delim("NOAA ERSST aravg.ann.land_ocean.90S.90N.v3.5.4.201504.asc.txt", sep="", header=F)

## 2. Define time periods; start year = 1911, 1931, 1951 or 1971; breakpoint year at either 1998 or 2000, end year at either 2012 or 2014
# breakpoint year always included in *2nd* of the 2 intervals
# 1998 and 2000 useful choices--differing global means
# start years: 40 and 20 years before, and 20 years after, Karl et al's emphasis on 1951; thus covers last 100+ years
start.yr = c(1911,1931,1951,1971); break.yr = c(1998,2000); end.yr = c(2012,2014)
dates = expand.grid(start.yr,break.yr,end.yr)
dates[,4] = dates[,2]-1; dates= dates[,c(1,4,2,3)]; names(dates) = c("Start1","End1","Start2","End2")
which.dates = dates-1879; n = nrow(dates)
intvl.1=paste("int",1:n,".1",sep=""); intvl.2=paste("int",1:n,".2",sep="")
for (i in 1:n){
 assign(intvl.1[i], which.dates[i,1]:which.dates[i,2])	# row indices of 1st interval
 assign(intvl.2[i], which.dates[i,3]:which.dates[i,4])	# row indices of 2nd interval
}

## 3. Get OLS-derived lm and slope estimates, ignoring possible AC for now, since no p testing
lm.1 = paste("lm",1:n,".1",sep=""); lm.2 = paste("lm",1:n,".2",sep="")
sl.1 = paste("sl",1:n,".1",sep=""); sl.2 = paste("sl",1:n,".2",sep="")

for (i in 1:n){
 assign(lm.1[i], lm(T[get(intvl.1[i]), 2] ~ T[get(intvl.1[i]), 1]) )		# OLS lm fits
 assign(sl.1[i], get(lm.1[i]) [[1]][[2]])							# slope only
 assign(lm.2[i], lm(T[get(intvl.2[i]), 2] ~ T[get(intvl.2[i]), 1]) )
 assign(sl.2[i], get(lm.2[i]) [[1]][[2]])
}

Ratios = rep(NA,n); Slope1=Ratios; Slope2=Ratios
for (i in 1:n){
 Slope1[i] = get(sl.1[i]); Slope2[i] = get(sl.2[i])
 Ratios[i] =  Slope2[i] / Slope1[i]	# 2nd ("slowdown") over 1st interval
}
final = cbind(dates, Slope1, Slope2, Ratios)
final = final[order(final[,1], final[,2], decreasing=F) ,]
rownames(final) = 1:n; final
plot(T[,1], T[,2], xlab="Year", ylab= "ERSSTv4 GMST (deg. C) rel. to 1971-2000 mean", type="l", lwd=2)

Have at it
