Monday, August 16, 2010

A Flat New Puzzler

The question, dear bunnies, is why this figure from Tamino's post on change has the answer to what is going on in the flat new paper on proxy deconstruction by McShane and Wyner. (P.S. There are lots of hints in the comments at Open Mind, but here is another one: look at the figure and then remember that McShane and Wyner are claiming that noise gives a better match to the global temperature series than the proxies do.) Also take a look at the comments at Policy Lass.

Eli's answer later tonight.

UPDATE: Eli does have a day job folks......

When you calibrate, respectable Rabetts want the largest possible spread in the variable being calibrated against. M&W calibrated proxies that respond to regional changes against the GLOBAL temperature record. If you look at Tamino's figure, for about 80 of the ~120 years (M&W only go to 2000; there ain't a lot of proxies that go to 2010), a flat line is about the best description of what happened. This covers the periods from ~1880-1920 and ~1940-1980. In such a situation, random noise is the best description of the variation. So, especially if you hold out the last 30 years (and prattle on about it), bunnies find that random noise about a straight line provides the best fit, which is what the boys find, and of course it does not capture the sharp rise in the last 30-year period. As they say,

In other words, our model performs better when using highly autocorrelated noise rather than proxies to predict temperature. The real proxies are less predictive than our "fake" data.
Since the proxies are affected by local temperatures (and precipitation and some other things that are all local), and local temperatures varied much more strongly than the global average in most cases, and certainly in those cases where the proxies vary strongly, this is kindergarten work. Trivially, this procedure underestimates the response of the proxies to temperature over long periods and exaggerates the error projected back to the year dot. You are fitting noisy data to a straight line to find a slope? C'mon. QED
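For bunnies who like numbers, here is a minimal sketch of the point with made-up series (numpy, synthetic data, not M&W's actual data or procedure): a proxy that tracks its local temperature explains almost none of the variance of a nearly flat global mean, and a highly autocorrelated noise series can do just as "well".

import numpy as np

rng = np.random.default_rng(1)
n = 90                                            # roughly the flat stretch of the record

global_t = rng.normal(0.0, 0.05, n)               # global mean: small wiggles, no trend
local_t = global_t + rng.normal(0.0, 0.25, n)     # one region: global signal plus big regional swings
proxy = 0.8 * local_t + rng.normal(0.0, 0.30, n)  # a proxy that responds to LOCAL temperature

red = np.zeros(n)                                 # AR(1) "pseudoproxy", pure noise
for t in range(1, n):
    red[t] = 0.9 * red[t - 1] + rng.normal(0.0, 0.1)

def r2(x, y):
    # R^2 of an ordinary least squares fit of y on x
    yhat = np.polyval(np.polyfit(x, y, 1), x)
    return 1.0 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print("proxy -> local  R^2:", round(r2(proxy, local_t), 3))   # modest but real
print("proxy -> global R^2:", round(r2(proxy, global_t), 3))  # close to nothing
print("noise -> global R^2:", round(r2(red, global_t), 3))    # also close to nothing, sometimes "better"

The exact numbers depend on the seed and the assumed variances; the point is only that a flat calibration target leaves the real proxy with nothing to beat the noise on.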

Tim Lambert has more, and, as Eli said, the comments at Policy Lass and Tamino's are fine. Judith Curry is sitting on the sticky wicket at Kloor's and the unusual suspects, well, there is a reason they are unusual.

UPDATE: Martin Vermeer says it more clearly at Deltoid

Oops.

My explanation at link for why the M&W reconstruction is erroneous was a little too simple. It's the Wabett who gets it completely right: the fundamental error is calibration only against a hemispheric average, when local data -- the 5x5 degree grid cells of instrumental data as used by Mann et al. -- provide a much richer source of variability -- i.e., signal -- to calibrate against.

It is this poor signal/noise ratio that helps the calibration pick up a spurious low vs. high latitude temperature difference "signal", which in the reconstruction interacts with the effect of the change in the Earth's axial tilt.

What stands is the observation that doing the calibration against the instrumental PC1 (instead of the hemispheric average) will give you back pretty much exactly the genuine Mann stick(TM) even in spite of this.

Congrats Eli!

Evidently there will be comments on this paper at the journal (they select a few people to comment). Hopefully much of the back and forth, with the cheerleading and booing edited out, will be picked up.

39 comments:

Anonymous said...

"Eli's answer later tonight."

Oh great (whistles) that means I don't get to know until tomorrow...

joabbess said...

...umm...keep checking...can't wait...bated bunny breath...

Anonymous said...

Clearly there is noise; even Tamino considers linear fits wrong but useful. Tamino looks for the turns, and I see a failure to downturn. But how do you plot a failure to downturn in a chaotic system?

Too early to say, but it appears that we have gone from warming and cooling to warming and static. Given the continued increase in CO2, how long before it turns to warming and faster warming? A record warm year when the sun is still so quiet is more than a little scary.

Is a Canfield State now inevitable?

Little Mouse

PolyisTCOandbanned said...

I see a candlestick and triple widow's peak. Clear sell indicator. ;)

Tamino is kinda right when he emphasizes longer range trends over short deviations. When he tries to read in too many change points and the like, he goes to the well too much. He should listen to himself first.

P.s. I hate the whole "tease" method of blogging. It's so McIntyre. Make your assertion and we will chew on it.

Anonymous said...

I hate the whole "tease" method of blogging.

Agony... Both Rabett and Deep Climate are torturing us tonight...

Anonymous said...

Not, perhaps, relevant to your main question, but I wonder how/whether the mid-century dimming/brightening from aerosols may contaminate the temperature signal in proxies... has anyone looked into that? I think that CO2 fertilization effects have been removed from dendro signals, but don't know about aerosols... (and dimming could theoretically impact more than just biogenic proxies).

-M

Anonymous said...

"Judith Curry is sitting on the sticky wicket at Kloor's"

Probably honing her googlies.

AnnieNomNomNom

Jim Crimmins said...

This isn't right, unless you believe that the proxies work better with auto-correlated temperature patterns (trends). If so, please explain the physics behind that. In other words, would the proxies have a higher correlation with an annual temperature series of (+1,-1,+1,-1) or (1,1,1,1)? Why? What are the physics behind that? It does make sense that the proxies should work better with *larger* temperature movements, as that should drown out the other confounding factors you mention, but as we have seen during the late 20th century - they don't. Oh well.

Martin Vermeer said...

Actually this argument only goes so far. Yes, the local temperatures vary more, but their coefficients are not the ones you're interested in. They get averaged when you look at the hemispherical temperatures. And remember that the whole computation is linear in the instrumental temps: you can just add a row representing hemispheric mean to the RegEM computation tableau. Then it shouldn't make any difference whether you first average instrumental values and then reconstruct, or reconstruct per grid cell and then average. The outcome, and uncertainties, should be the same.
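A tiny check of that linearity claim, using ordinary least squares on synthetic numbers as a stand-in for RegEM (a simplification): the point reconstructions agree whether you average the instrumental series first or reconstruct per grid cell and average afterwards. Whether the uncertainties also agree is what the rest of this thread argues about.

import numpy as np

rng = np.random.default_rng(2)
n, p, cells = 100, 5, 12
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # proxies plus an intercept column
Y = rng.normal(size=(n, cells))                               # per-grid-cell instrumental temperatures

beta_each = np.linalg.lstsq(X, Y, rcond=None)[0]              # one regression per grid cell
recon_then_average = (X @ beta_each).mean(axis=1)

beta_mean = np.linalg.lstsq(X, Y.mean(axis=1), rcond=None)[0] # regress the hemispheric mean directly
average_then_recon = X @ beta_mean

print(np.allclose(recon_then_average, average_then_recon))    # True: the point estimates agree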

SteveF said...

OT, but Judy Curry has an interesting new paper out:

http://www.pnas.org/content/early/2010/08/09/1003336107.abstract

It gets a mention at WUWT and the cretinous commenters get their teeth stuck into it. I wonder if reading these comments will force Judy to realise that some people just can't be reasoned with:

http://wattsupwiththat.com/2010/08/16/georgia-tech-on-resolving-the-paradox-of-the-antarctic-sea-ice/

Nick said...

Funny line of the day from a commenter at Deltoid regarding the new paper by M&W:

"Yep it does not take much to get these people excited. Wonder if SteveM is going to audit this paper to make sure the stats are OK?"

Angliss said...

I think that this is an error in JeffID's criticisms of Mann2008 as well. As far as I can tell by reading JeffID's methodology, he ran his calibrations against the measured global mean temperature (and claimed that doing so disproved one of the methods Mann08 uses), while Mann08 ran calibrations against the nearest available temperature data. That is a pretty significant difference.

I pointed that out to him months ago in a comment exchange at S&R, but I don't know if JeffID's updated his claims after having corrected his oversight.

Jim Crimmins said...

I don't think this local/global calibration will matter one whit after the whole process is completed, given the linearity of both sides of the model, but it would be good to check that with the exact process used, etc.

In any case, one of the main points of the paper is that because of the weak signal in the proxies you can get almost any shape you want within the error bars, depending on meta-assumptions. The reason the shouting is so loud about paleo is that it's impossible to decide objectively about all these meta and modeling assumptions.

In summary, because of the weak signal, and the sensitivity to assumptions, it's (mostly) not science anymore.

Anonymous said...

OT, but what does this mean?...October the 9th, 2007. That was the day that the Dow Industrials hit the, All-Time-High: 14,164.

http://www.nyse.tv/djia-chart-history.htm

Six-days-Later; on October the 15th 2007, the first Baby-Boomer(:) filed to start her social security.

http://www.reuters.com/article/idUSN15383509

What are the odds? You all are very good with numbers... Another puzzler.

How can it be a Ponzi scheme when the "citizen" guarantees everything from our GSE's & PBGC, SS, T's... Right?

This seems to me, to be the more pressing issue at this point in time.

seamus said...

Maybe M&W should have worked with someone who knows a little something about the various proxies and how to handle them statistically... you know, maybe a climate scientist or two?

Hank Roberts said...

> Judith Curry ... Wattsup comments

Seems the commenters imagine themselves as bullfighters over there:

"August 17, 2010 at 5:35 am
... we have the ear of Judith Curry ..."

Lars Karlsson said...

M&W do also consider gridded calibration in section 3.6 (pp. 23-24) and then get performance comparable to the best sophisticated random series.

enodo said...

Martin:

I don't understand your comments. What we want to determine is how sensitive tree ring widths (or whatever proxy) are to temperature. This is what we're looking to calibrate when we use the proxy to extend our temperature measurements into the past. The temperature the tree sees is the *local* temperature, not the hemispheric average. I don't see what calibration you get by comparing a local proxy measurement to the hemispheric average.

Deech56 said...

SteveF, you made my day. Judith and her coauthor checked in for a bit; she offered to respond to questions, but found the pickings slim. Got in a zinger (after over 100 comments), though: "Still scanning for some thoughtful questions that I will respond to."

dhogaza said...

"SteveF, you made my day. Judith and her coauthor checked in for a bit; she offered to respond to questions, but found the pickings slim."

Judith, welcome to your tribe ...

Deech56 said...

IIRC, Judith has been critical of WTF, preferring the CA/BH tribe. No thread at CA on Liu & Curry, yet. That would be interesting to see.

Rattus Norvegicus said...

It looks like Judith is getting subjected to "blog science". Perhaps she'll come to her senses now...

dhogaza said...

"IIRC, Judith has been critical of WTF, preferring the CA/BH tribe. No thread at CA on Liu & Curry, yet. That would be interesting to see."

Sorry, CA/BH forms a *subtribe*. She doesn't get to designate tribalism as being the Big Evil in climate science while splitting hairs over barely-literate/illiterate members of her own tribe.

Anonymous said...


Then it shouldn't make any difference whether you first average instrumental values and then reconstruct, or reconstruct per grid cell and then average. The outcome, and uncertainties, should be the same.


I may be talking through my hat here, but I presume that reconstruction involves some sort of a least-squares fit. If that's the case, then reconstruction per grid cell followed by averaging is not the same as averaging and then reconstructing. a^2 + b^2 + c^2 != (a + b + c)^2

If it's not the case, then nevermind...

Anonymous said...

enodo,

what I mean is this. Yes, you calibrate a proxy against local temperature. Global/hemispherical temperature is an average of local temperatures. So you get the coefficient of the set of proxies in the global temp as the average of the coefficients to local temperatures. This is a small difference of larger numbers, taken repeatedly. So doing it locally doesn't help.

More formally, multiple regression is linear in y:

y|x = y - cov(y,x) * var(x)^-1 * x .

If y is a sum of little y's, cov(y,x) will be a sum of little covariances, and similarly for y|x. You can do the summation on the left hand side or the right hand side, with the same result. For y, read temps, for x, proxies.

Martin

Deech56 said...

Dhogaza, as long as she doesn't try to split hares.

It's like marrying into a family that has a whole bunch of crazy uncles stuffed in the attic. There's a second thread where Willis E. puts the paper through its paces.

captdallas2 said...

This paper seems to be stimulating a lot of interest. So I think I will put in my, not so well educated, couple of pennies. Various proxies have a fairly limited direct relationship to ambient temperature. Tree ring growth, for example, is estimated to be approximately 30% dependent on temperature. So 70% of the ring information is noise if you are trying to make a temperature reconstruction. That should mean that dendro reconstructions come with fairly ample error bars.

Other proxies, for example the Sargasso Sea study done by Woods Hole, have a better direct relationship to temperature, sea surface temperature in this case. But that is a low frequency reconstruction, which doesn't blow much wind up the skirts of climate scientists searching for high frequency proxies. I find that a bit counterintuitive if one is attempting a multi-millennial reconstruction. The low frequency provides a natural smoothing.

Another issue I have is that comparing a reconstruction to hemispherical temperature records instead of regional temperatures is a bogus argument. Temperatures within the region of the collected proxy data should be compared to the proxy to determine its skill.

Then I am just a fisherman.

Michael Tobis said...

I keep flickering back and forth, but I think Martin's and cap'n Dallas's and Jim C's point is valid.

Presuming things are linear, it should make no difference whether we calibrate globally or locally to get the global signal. The question is whether there is one big source of noise or two smaller ones, but the result ought to be the same. If the algorithm isn't entirely linear (and I confess I don't quite follow what they are doing), then it can only make a difference to the extent that the nonlinearity dominates.

I also think their result is entirely consistent with the spaghetti diagram. The fact that they are socially connected to the bad guys only strengthens the result. They looked at the data from a purely statistical point of view with no priors whatsoever, implicit or explicit, and they got a hockey stick and a caveat that they don't know how wiggly the shaft of the stick is. That is, they got the consensus position as represented by the spaghetti diagram.

While this needs to be critiqued, I simply don't see it as a problematic paper or a challenging result. In fact, it seems to me to vindicate what I understood to be the consensus.

The vehemence of some of the criticisms alarms me. Did they have to cite Wegman? I don't know. Does that invalidate their actual results? No. Is Eli's case solid? Sorry, but I am pretty much inclined to say it isn't. If I (and a few others) missed something, it would be good to see a clear explanation of what that is.

Martin Vermeer said...

Michael, yes, their reconstruction result is indeed consistent -- within its own substantial uncertainty -- with the familiar spaghetti graphs, and even with Mann 08.

Still, I wouldn't let them off the hook so easily. They picked the solution with 10 PCs without so much as searching the space of these values for the optimum. Same with the choice of the number of instrumental PCs to regress against -- essentially, alternative ways of aggregating the detailed instrumental data.

These two numbers together form a space that needs to be searched for optimality, i.e., which values give the narrowest uncertainty bands. Any pair of values is "legal"; but the pair giving the sharpest backcast is the useful one. One of the nice things with Bayes is that you can do that. But they didn't and just picked one they liked.

Note that the backcast for local PCs = 1 and proxy PCs = 10 looks remarkably similar to Mann 08. I'd love to see the Bayesian posterior uncertainties of that one. I'd bet they are narrower than those of M&W's preferred recon.
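Schematically, the search could look like the toy below on synthetic data, using held-out prediction error as a simple stand-in for the width of the uncertainty band (a sketch only, not M&W's or anyone's actual reconstruction code):

import numpy as np

rng = np.random.default_rng(3)
n, p = 150, 30
modes = rng.normal(size=(n, 3)) @ rng.normal(size=(3, p))    # three real spatial modes
X = modes + rng.normal(scale=0.5, size=(n, p))               # noisy "proxies"
y = modes[:, 0] + rng.normal(scale=0.3, size=n)              # target series

train, test = np.arange(0, 100), np.arange(100, n)
Xc = X - X[train].mean(axis=0)
_, _, Vt = np.linalg.svd(Xc[train], full_matrices=False)

def heldout_rmse(k):
    # regress y on the first k PCs of the proxies: fit on train, score on test
    Z = Xc @ Vt[:k].T
    A = np.column_stack([np.ones(len(train)), Z[train]])
    coef = np.linalg.lstsq(A, y[train], rcond=None)[0]
    pred = np.column_stack([np.ones(len(test)), Z[test]]) @ coef
    return float(np.sqrt(np.mean((pred - y[test]) ** 2)))

best_k = min(range(1, 11), key=heldout_rmse)
print("number of proxy PCs giving the sharpest held-out prediction:", best_k)

The same loop would run over both numbers (instrumental PCs and proxy PCs) in the case Martin describes; the point is simply that the pair is chosen by a criterion, not by taste.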

Anonymous said...

Michael Tobis,
I very much disagree with your assessment. The global calibration is very important because you're bound to find that the proxies don't perform as well under that condition. Every microsite has different factors which make it sensitive to local climatic conditions. Of course the proxies aren't good for global temps. There are so many processes which go on differently from region to region and hemisphere to hemisphere. The NH is more sensitive to the AMO, so that means you're gonna see big differences between the NH proxies compared to the "global calibration"...

Anonymous said...

Angliss,

The posts you reference are about demonstrating variance loss in the simplest form. They don't require gridded temperatures to demonstrate the effect so people can understand it. Apparently, for the non-math oriented it's still confusing.

In this case, the global comparison vs. local will just shift the particular proxies chosen and weighted. It's a different method which I consider equally invalid to the originals. The important conclusions, though, were in the uncertainty of the reconstruction portion. Without a single mention of variance loss, how can we calculate the true uncertainty?

jeff id

David B. Benson said...

I'm reading Fan & Yao "Nonlinear Time Series" and just up to the parametric nonlinear models part.

Jim Crimmins said...

I think the shape of the "hockey stick" is so dependent on meta-assumptions, which are clearly not able to be impartially chosen, that we should all agree that there isn't much point in flogging this thing any further. My own conclusion is that it's pretty difficult to say that the proxies offer any information that shifts the backcast beyond the envelope of natural variability bounds (say +/-0.1C/decade or +/-1C per 1000 yrs) from the start of the instrumental record. So there really isn't much info here. The prior would have to be that things are a bit warmer today because of CO2, but it's not really provable from the proxies. There is certainly no strong, 95% confidence evidence here for "unprecedented" anything, whether that's absolute temperature or slope of the rise.

Michael Tobis said...

Jim, they claimed 80% confidence that the hottest decade began in 1997, which I think is worthy of note.

steven said...

Actually Eli, the issue of calibration against the local grid is not as clear as you imagine.

In some cases researchers check for calibration against the following:
a. the seasonal signal
b. the annual signal
c. the local grid
d. the neighboring grid
e. the hemisphere
In some cases they actually recompute their own version of the temperature for that grid. I believe Rob Wilson's done that.

In some cases they argue that trees are teleconnected. Shrugs.

Michael T., that 80% probability is CONDITIONED on several assumptions: a perfect model, error-free data, and stationarity. I'd say it was a coin toss by the time you figure everything in. Still, our reason for concern is not the HS. It's what the basic physics tells us. We know humankind is the culprit. We know the weapon is CO2. We have the fingerprints on that weapon. We know the weapon kills. And people want to waste their radical potential by arguing the HS? The HS is like handwriting analysis: not essential to the case, not a mature science, and the expert witness really doesn't have it all together.

Basically, anything to find the signal. All you have to do is find a couple that hit the mark you want and you get nice tight bars for all your attempts at calibration. Now, of course, all this hunting for calibration should be penalized in the final certainty, but it's not.

Martin Vermeer said...

If throwing away recoverable information counts as introducing a "meta-assumption", then yes, the scope for doing so is pretty much infinite. There aren't so many alternative ways of doing this right. M&W didn't find one.

enodo said...

Martin:

While your math is correct I think something's missing.

Suppose we have several proxy series, A, B, and C. We have some temperature data for the last 130 years against which we can calibrate them. That is, we need to know how good A, B, and C are at measuring local temperature, which we derive from the recent data for which we actually know the temperature.

1) I don't see how you've answered Eli's objection that you have lost the ability to calibrate your proxies. I think it's instructive to take the case Eli mentioned: imagine that during the last 130 years the temperature globally didn't change at all. Then, in principle at least, we can learn nothing about how A, B, and C measure temperature, since there's only one y-axis point, the (constant) global temperature. Therefore, as you try to extend back out of the calibration timeframe, you will have infinite error bars on the past global temperature, but Eli will have an actual measurement. (There is a small numerical sketch of this below.)

2) Also, what you are saying would be true if the *weights* of A, B, and C remained constant over time. But if there are different regions of the planet over time covered by A, say, as we extend into the past, then its weight in the hemispheric average will be changing, and so we need to know how it correlates with *local* temperatures, not merely how it works globally.
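A small numerical sketch of point 1, using synthetic numbers and a plain least-squares calibration (an assumption, not any particular reconstruction method): the proxy's sensitivity is estimated from the calibration period, and as the spread of the calibration target shrinks toward zero, the fitted sensitivity, and hence anything backcast from it, becomes arbitrarily uncertain.

import numpy as np

rng = np.random.default_rng(4)
n, true_sens = 130, 0.8

def fitted_sensitivity(temp):
    # simulate a proxy with a known sensitivity, then try to recover it by regression
    proxy = true_sens * temp + rng.normal(0.0, 0.3, n)
    coef, cov = np.polyfit(temp, proxy, 1, cov=True)
    return coef[0], float(np.sqrt(cov[0, 0]))

wide = rng.normal(0.0, 0.5, n)    # calibration target with plenty of spread
flat = rng.normal(0.0, 0.02, n)   # calibration target that barely moves

for name, t in [("wide target", wide), ("flat target", flat)]:
    b, se = fitted_sensitivity(t)
    print(f"{name}: fitted sensitivity = {b:+.2f} +/- {se:.2f}")

With the flat target the error bar on the sensitivity is many times the estimate itself, which is what "infinite error bars on the backcast" looks like in practice.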

Rattus Norvegicus said...

Zorita weighs in on M&W. He does not like what he finds.

LOL, CAPTCHA is deniest.

captdallas2 said...

Steve's progression for calibrating proxies makes perfectly good sense to me, especially with high frequency proxies that will have huge amounts of regional noise due to natural climate shifts, like the AO shift that nearly froze me this winter in the Florida Keys and the current situation in Russia. On the other hand, a low frequency proxy would seem more applicable to a hemispherical calibration.

Attempting to filter a high frequency proxy to create a low frequency reconstruction to find some magical teleconnection to global or hemispherical temperature records is a bit like playing with yourself. It may feel good but it doesn't accomplish much.