# June US historical precipitation, a quick analysis

The NCDC June climate summary for the United States is out. June’s a really important month in the Northern Hemisphere, especially agriculturally. I’ll use these data as a chance to estimate the probability that June 2014 precipitation (P) in the US is drawn from a stationary distribution (over 1895-2014, 120 years, CONUS), using both state- and region-based P rankings, and 250k Monte Carlo simulations. [There are 97 years of record for Alaska, and Hawaii’s not included.] If precipitation is stationary over that time, we expect the mean of the state rankings (normalized to a 0-1 scale) not to differ significantly from 0.5. Easier still, although a slightly different question, is to evaluate the probability of having 9 states rank among their 10 wettest (as was observed), since for that we need only that count, not the actual ranking of each state.

Below are the graphics showing the rankings (120 = wettest; Alaska, not shown, had the 2nd wettest June in its 97 year record):

One nice thing about this is that only a few lines of code are needed. Here it is for the first situation:

```
## 1. State-based. Probability that 9 (of 49) states' June 2014 precip ranks among the state's 10 wettest.  Hawaii not included.
# Non-Alaska: 120 years in the NCDC record (1895-2014).  Alaska: 97 year record (1918-2014)
# 1=driest, 120=wettest; all rankings normalized to 0-1 scale; test stat = 111/120 = 0.925.

rm(list=ls()); options(digits=3)
trials=250000; data=rep(NA,trials)
for (i in 1:trials) {
  # draw 49 distinct normalized ranks from a 0.001-spaced grid, one per state
  z1=sample(seq(.001,1,.001),size=49,replace=FALSE)
  # count the states at or above the top-10 cutoff
  data[i]=sum(z1 >= 111/120)
}
# fraction of trials with at least 9 states in their top 10
(p = mean(data>=9))
```
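As a cross-check on the simulation, the same tail probability can be worked out directly from the binomial distribution. The sketch below is not part of the original analysis. It assumes the states are independent, gives each 120-year state a 10/120 chance of landing in its 10 wettest years, and gives Alaska 10/97. The names `p_conus`, `p_alaska`, `n_conus`, `k_obs` and `p_exact` are made up for illustration. The result need not match the simulated value exactly, because the simulation samples a 0.001-spaced grid without replacement and applies the 111/120 cutoff to Alaska as well.

```r
# Exact tail probability, assuming independent states (an assumption of this sketch).
p_conus  <- 10 / 120   # chance a 120-year state lands in its 10 wettest years
p_alaska <- 10 / 97    # same for Alaska's 97-year record
n_conus  <- 48         # the 49 states analysed, minus Alaska
k_obs    <- 9          # states observed in their 10 wettest

# Condition on whether Alaska is in its top 10, then use the binomial tail
# for the remaining 48 states. pbinom(q, ..., lower.tail = FALSE) gives P(X > q).
p_exact <- p_alaska       * pbinom(k_obs - 2, n_conus, p_conus, lower.tail = FALSE) +
           (1 - p_alaska) * pbinom(k_obs - 1, n_conus, p_conus, lower.tail = FALSE)
print(p_exact)
```

Comparing `p_exact` with the Monte Carlo `p` gives a feel for how much the grid resolution and the without-replacement sampling matter.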

States are not all the same size, so we should normalize accordingly. A quicker approximation is just to use climate regions, which are more nearly equal in size than the states are. However, there are only ten of them, so it might be better to look at their central tendency and dispersion, rather than the number placing in the ten wettest years. [Of course, for both analyses, it would be even better to use the actual P values themselves instead of their ranks, but with 120 years of data the ranks should be a good approximation.]

```
## 2. Region-based. Probability that mean and std. dev. of regional (including AK) June 2014 precip ranks exceed expectation under hypothesis of stationarity (no change).  Hawaii not included.
# nine CONUS regions ranked out of 120 years; Alaska ranked out of 97
regn.ranks = c(c(88,40,107,120,112,97,23,13,50)/120, 96/97)
par1=mean(regn.ranks); par2=sd(regn.ranks)
trials=250000; data=matrix(NA,nrow=trials,ncol=2)
for (i in 1:trials){
  z1=sample((1:1000)/1000,10,replace=FALSE)
  data[i,1]=mean(z1); data[i,2]=sd(z1)
}
# fraction of simulated means / std. devs. at least as large as observed
(p.mean = mean(data[,1]>=par1))
(p.sd = mean(data[,2]>=par2))
```
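The regional mean can also be sanity-checked without simulation. If each normalized rank is treated as roughly Uniform(0, 1), which is an assumption of this sketch, a single rank has mean 0.5 and standard deviation sqrt(1/12), so the mean of ten of them has standard error sqrt(1/12)/sqrt(10). A normal approximation then gives a one-sided p-value. With only ten regions this is rough, and the names `null_mean`, `null_se`, `obs_mean`, `z` and `p_approx` are illustrative.

```r
# Observed regional ranks on a 0-1 scale (values copied from the post's code).
regn.ranks <- c(c(88, 40, 107, 120, 112, 97, 23, 13, 50) / 120, 96 / 97)
n <- length(regn.ranks)

# Under the null, a normalized rank is roughly Uniform(0, 1):
# mean 0.5, standard deviation sqrt(1/12).
null_mean <- 0.5
null_se   <- sqrt(1 / 12) / sqrt(n)   # standard error of the mean of n ranks

obs_mean <- mean(regn.ranks)
z        <- (obs_mean - null_mean) / null_se
p_approx <- pnorm(z, lower.tail = FALSE)   # one-sided, matching the >= test above
print(c(obs_mean = obs_mean, z = z, p_approx = p_approx))
```

If `p_approx` lands close to the simulated `p.mean`, that suggests the Monte Carlo loop is doing what it should; a large gap would point to a problem in one of the two calculations.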

OK, so the results then. The state-based analysis (top) returns a value of p = 0.009, or just under 1%, for the probability of having 9 states out of 49 rank in the top 10 of their historical records. The region-based analysis gives p = 0.063 for a stationary mean, and p = 0.098 for a stationary standard deviation, at the region level, thus neither quite reaching the standard p = .05 significance level, but both approaching it. Remember, p = 0.5 would be the expected value for each metric under stationary June precipitation; values deviating from it, either way, indicate evidence for dynamics. Note also that this is not a trend analysis; for that you would need the time series of either the P values or the rankings for each state or region.