Wednesday, April 07, 2010

Muir Russell and the Wayback Machine

Eli has been thinking a bit about the Muir Russell Inquiry into the stolen Climatic Research Unit emails. One of the issues (if McIntyre and his ilk are about, when has it not been) is what sharing of data and methods is required by the act of publication. Now Eli is an OLD bunny. Maybe not quite so old that he wrote his thesis with a quill pen, but old enough that he did the drawings in India ink and lettered them with Leroy templates. He remembers when the copying machine was a rexograph, mimeographs being too expensive, and when you paid a couple of hundred bucks for the original and three carbon copies of your thesis. The original went to the university library, one copy to you, one copy to your adviser and the third to University Microfilms, who, well, microfilmed it.

In those days you actually paid for reprints of your papers, because once the type was set, the offprints were cheaper than copying, and people actually sent you postcards from third world countries like England and Japan, begging for copies, which you mailed off, because you wanted to keep them coming to get the stamps to give to your kid brother, who was grateful for a nanosecond, and postage cost nothing or was paid for by the department.

There were NO data repositories, no hard discs, paper tape took a lot of room and magnetic tape was something you oohed and aahed about. This, as all things, changed. As it changed, journal requirements changed also. Today, Eli wandered into the depths of the library to look at some dead trees. Specifically Nature. Turns out that in 1996 Nature's requirements changed from

Nature requests authors to deposit sequence and crystallography data in the databases that exist for this purpose
those fields being the first to establish such data archives, to the current
Materials: As a condition of publication authors are required to make materials and methods used freely available to academic researchers for their own use. Supporting data sets must be made available on the publication date from the authors directly and by posting on Nature's web site, by deposition in the appropriate database or on the internet.
Most other journals have much less stringent policies, and these too have changed over time. Data retention is another area where forever is no answer. In the US, NIH policy is
Period of retention. Data should be retained for a reasonable period of time to allow other researchers to check results or to use the data for other purposes. There is, however, no common definition of a reasonable period of time. NIH generally requires that data be retained for 3 years following the submission of the final financial report. Some government programs require retention for up to 7 years. A few universities have adopted data-retention policies that set specific time periods in the same range, that is, between 3 and 7 years. Aside from these specific guidelines, however, there is no comprehensive rule for data retention or, when called for, data destruction.
King's College (London) has a flow chart for the engineering bunnies where the recommendation is seven years for funded research and four for unfunded.

All this goes to the accusations against Phil Jones and the CRU for "destroying data". It's been clearly established that the CRU was never a data depository for data from the National Meteorological Services, but there has been plenty of noise that they had an obligation to plasticize every piece of paper in the place.

Nonsense. As Jones wrote:

No one, it seems, cares to read what we put up on the CRU web page. These people just make up motives for what we might or might not have done.

Almost all the data we have in the CRU archive is exactly the same as in the Global Historical Climatology Network (GHCN) archive used by the NOAA National Climatic Data Center [see here and here].

The original raw data are not “lost.” I could reconstruct what we had from U.S. Department of Energy reports we published in the mid-1980s. I would start with the GHCN data. I know that the effort would be a complete waste of time, though. I may get around to it some time. The documentation of what we’ve done is all in the literature.

If we have “lost” any data it is the following:

1. Station series for sites that in the 1980s we deemed then to be affected by either urban biases or by numerous site moves, that were either not correctable or not worth doing as there were other series in the region.

2. The original data for sites for which we made appropriate adjustments in the temperature data in the 1980s. We still have our adjusted data, of course, and these along with all other sites that didn’t need adjusting.

3. Since the 1980s as colleagues and National Meteorological Services (NMSs) have produced adjusted series for regions and/or countries, then we replaced the data we had with the better series.

In the papers, I’ve always said that homogeneity adjustments are best produced by NMSs. A good example of this is the work by Lucie Vincent in Canada. Here we just replaced what data we had for the 200+ sites she sorted out.

The CRUTEM3 data for land look much like the GHCN and NASA Goddard Institute for Space Studies data for the same domains.

Apart from a figure in the IPCC Fourth Assessment Report (AR4) showing this, there is also this paper from Geophysical Research Letters in 2005 by Russ Vose et al. Figure 2 is similar to the AR4 plot.

I think if it hadn’t been this issue, the Competitive Enterprise Institute would have dreamt up something else!

Yes indeedy


Michael Tobis said...

Better late than never, Dr. Jones starts to understand the nature of the situation...

On an oddly related note, Eli, a reminder to please title your articles to facilitate their propagation.

EliRabett said...

eeep. Eli tries, he really tries, but the bunnies are always tickling his ears.

Horatio Algeranon said...

"I think if it hadn’t been this issue, the Competitive Enterprise Institute would have dreamt up something else!"

Like a miniature black hole that will "swallow up" the earth.

Hassett may be aboard the sister star-ship, the "American Enterprise Institute", but it's basically six fruitcakes and a half dozen banana nut breads.

Andrew Goreing said...

With respect to MT, I think Phil Jones woke up to the situation some time ago -- the comment Eli quotes was posted on October 13, 2009, a month before the UEA hack.

Of course Jones may feel a little powerless when statements such as "the original raw data are not 'lost'" mysteriously become "Phil Jones admits he hid the data" as they pass through sections of the blogosphere (and the press).

John W. Farley said...

The so-called "Climategate scandal" has been placed in proper perspective by Prof. Anthony DiMaggio at the online zine mrzine.

DiMaggio understands that both Michael Mann and Phil Jones have been cleared after investigations. DiMaggio is a prof of politics, not a scientist, so he understands what is going on, which is, after all, politics.

The real scandal is that the mainstream media pays far too much attention to the skeptics. That scandal continues.

Angliss said...

Hey, some people actually like to eat authentic fruitcake, loaded with brandy. Comparing CEI and AEI to a fruitcake is insulting and just plain mean to the fruitcake.

Sou said...

I've written a short item on how Monbiot has said that this article in the Guardian is probably his last post on the matter of the stolen emails.