Tuesday, August 28, 2007

Why librarians want to kill libertarians

There is a rumor that information wants to be free, and the open access movement is gathering steam. The mantra is that the results of publicly funded research should be free to all. Tim Lambert touched this off by pointing to a particularly stupid screed against open access. This produced the usual posturing from the "anything I can grab is mine" libertarian crowd.

Eli, a consumer of carrots and science papers, and rather cranky at the moment, has no strong opinion in the abstract. On the other hand he knows enough librarian avengers to understand that the base issue is who guarantees that the journals will be available forever.

Electronic databases of journal articles are fundamentally different from print. Print archives are found at multiple libraries, each of which has an institutional obligation to maintain them in the face of declining budgets and space (ever wonder why librarians love microfiche?). For a significant journal, such as JACS, Phys. Rev., etc., there will be many hundreds of complete archives up to the point where the libraries stop taking the printed editions. True, you may have to search the earth for the Journal of the Montana Academy of Science, but inter-library loans can get you even that.

Electronic databases of journal articles reside in one place and are distributed on the Web.

What guarantees that the archive will exist 50 years from today? If Elsevier closes Science Direct down because it can't make the money it wants on it, libraries that have switched over to electronic-only access wake up with huge holes in their collections. In some cases, where the material only exists as electrons, everyone has lots of nothing. Librarians are all too aware of these issues, which is why they are not greeting the open access movement with huzzahs. Before open access can actually happen, the archive issue has to be satisfied, which means that someone has to come up with the money to fund the archive in perpetuity.

This is not a problem that will be solved by volunteers, nor is it one to be solved on the cheap.

18 comments:

James Annan said...

"Electronic databases of journal articles reside in one place and are distributed on the Web."

Journal articles reside all over the place these days. Open access would facilitate more distributed and less vulnerable storage.

Google is my library these days, and it's a frustration when I come across a ref to something that is not available on-line. I do realise it could all come crashing down one day, but there is no question that it's much more efficient than paper, while it works!

"What guarantees that the archive will exist 50 years from today?"

OTOH, how much 50y-old science actually matters today? I bet most citations of such material are honorary ones where the original paper was not even read. Something theoretically available in some stack somewhere isn't much use to me if I do not know about it and cannot search efficiently for it.

EliRabett said...

James, the librarians will come for you. The point of an archive is that a. it is as complete as possible and b. it will exist into the future as far as possible. Consider the damage that was done through the destruction of the libraries of the ancient world, knowledge that had to be rediscovered over centuries.

A limitation of "old papers" is that they disappear from memory and are continually redone. Chemistry organized early on to avoid this problem with a number of >100-year-old databases: Chemical Abstracts, Gmelin and Beilstein. Because we can look up old things (esp. with these databases online today, though a subscription costs a fortune), older papers are cited, but just as importantly they are used. For example, I sometimes need to do a synthesis which was first described ~100 years ago. Synthetic chemists today have moved well beyond the type of molecules I am interested in. I can look that up in CA and Beilstein.

And yes, we sometimes put our journal articles on our web sites, but it is all helter skelter. Chaos is not an attractive option for journal articles.

John Mashey said...

For perspective & amusement (occasionally needed since this topic is a tough problem), I recommend:

1) Hal Draper, MS FND IN A LBRY, 1961:
en.wikipedia.org/wiki/Ms_Fnd_in_a_Lbry,
with actual text at:
home.comcast.net/~bcleere/texts/draper.html

2) Vernor Vinge, "Rainbows End" includes an amusing battle amongst those who munch and digitize books, and two groups with different visions of what a library should be.

bigcitylib said...

Since the contents of several years of most journals can fit on a single CDROM, I don't see why maintaining a hard copy archive of electronic format materials should be that difficult.

Anonymous said...

"The point of an archive is that a. it is as complete as possible and b. it will exist into the future as far as possible."

While "perfection into perpetuity" may be something to shoot for, that's all it is.

It ain't possible. No library is "complete" and no library will exist forever.

Some come closer than others to completion, but all are useful.

The way it works now is that if you can't find it at a particular journal or book, you get it through interlibrary loan.

There is no reason an archive on the internet could not (and does not) work the same way.

EliRabett said...

CDs are an interesting issue. Who says you are going to have access to a CD reader 50 years from now? Who says the data on the CD will not be corrupted? Then there is software rot.

Eli has a friend who was working on determining lifetimes of R/W CDs which depended on the dyes they were using. The news was not good.

Last Friday, he talked with someone who had recently gone back to find some files on an old hard disk and found multiple errors and worse.

The same thing has happened with 8" floppies, reel to reel tapes, MFM hard drives, acid paper, etc. To be a librarian is to worry. One of the strengths of microfilm is that it has a very long lifetime, but even today finding a reader can be interesting.

EliRabett said...

Anon 6:43, the problem is that with online journals there often IS only one place, and if that place goes away, at best you are at the mercy of Google to find another copy.

Anonymous said...

"with on line journals there often IS only one place and if that place goes away, at best you are at the mercy of Google to find another copy."


... precisely because the journals have refused to make their data available.

Actually, I think the concern about "online archives" will become largely moot because I think journals are going to be forced to open up their databases in order to attract people back to their sites, as scientists post copies of their articles online. This is already happening.

It's a little like the case of the music industry, which tried to keep absolute control over the music. When it became clear that this would not work in the internet age (because of the widespread sharing), they decided they had better find a way to play the game -- or lose out entirely.

I predict the same thing will happen with the journals. If few people have access to a journal (only those who pay), I'd argue that it is not much use to the scientific community anyway.

So let it go under. Someone else will pick up the slack who is interested in getting the information out there. There are lots of ways to make money other than charging for access.

John Mashey said...

re: disks, tapes & such

I'm a Trustee at the Computer History Museum. Our folks continuously need to recover data from older physical media, and it isn't easy, because:
- the media must be physically readable
- one must have the physical reader
- one must have software that can interpret the data, which is sometimes not so easy.

Higher volumes (i.e., CDs, DVDs) make that easier, but recall that 7-track tapes were introduced in 1952 and used through the mid-1960s. Conversion services still exist for these, but it's getting harder, and as for dumps of binary data from the 1950s on 36-bit machines of which none have existed for years... ugh. At least there are still paper tape readers running on our IBM 1620 and PDP-1 (space war).

Joseph said...

"This produced the usual posturing from the anything I can grab is mine libertarian crowd."

Example?

As far as I can tell, libertarians have almost completely ignored this.

The people who are criticizing Open Access have very strong disagreements with libertarian ideology. We are dealing with protectionists and opponents of immigration.

Anonymous said...

"The people who are criticizing Open Access have very strong disagreements with libertarian ideology."

Perhaps in some regards, they do, but in one regard, arguably the most important, they agree:

"Anything that is done by the government would be better done by private individuals."

The argument against open access is not just -- or even primarily -- about "open access".

It is about how open access by the government "unfairly" competes with private industry and also "distorts the discussion" by (also "unfairly") determining where the research emphasis should lie.

The latter are common libertarian gripes.

Anonymous said...

"If Elsevier closes Science Direct down because it can't make the money it wants on it libraries that have switched over to electronic access only wake up with huge holes in their collections."

Yup, but there's a real attraction for Elsevier, Kluwer, etc. in having electronic delivery of papers. It locks in customers, who otherwise would unsubscribe from journals during periods of budget cuts. I'd predict that Elsevier will find the journal business lucrative for a very long time.

Marion Delgado said...

Eli, the above, plus nothing is much cheaper than data storage. Retrieval is another issue, but if the need is pressing .... Seriously, CDs sort of suck: they deteriorate with time, but magnetic media not so much.

And I think most librarians do embrace open access. I don't know most librarians, it turns out, but I do know some. And my small sample of a few is 100% in favor.

This is a post BADLY in need of update based on comments, yo.

Anonymous said...

The real question is this:

Do you know "Madam Librarian", Marion?

Marion Delgado said...

By the way, Head Rabbit, why is your RSS feed weird? Everything is called Untitled Item #xxx.

Is this some kind of scheme to deny us access via RSS? Do we have to use a script? Filthy surface-station misinterpreting climate faithists!

David Brake said...

The need for a perpetual archive is a valid one, but I would suggest that open access online journals are far more likely to stay available indefinitely, thanks to projects like the Internet Archive, than journals run by commercial publishers who might go bust or lose interest (after all, the oldest stuff that is rarely accessed is presumably not profitable). And the costs of serving and archiving academic journals are not large - most of it is just plaintext after all (excepting astronomy and some of the other hard science stuff).

And of course there's nothing to prevent worried librarians from simply printing out web pages and storing the printouts if they are worried that one day we'll forget how to display HTML.

EliRabett said...

of course the printouts would have to be on acid free paper. . .

Seriously tho, this is something that should be organized on an international level.

Matt Hodgkinson said...

BioMed Central deposits the open access articles that we publish in multiple digital archives around the world to guarantee long-term digital preservation.

These archives include:
INIST (France)
Koninklijke Bibliotheek (The Netherlands)
Potsdam University (Germany)
PubMed Central (United States)
UK PubMed Central (UK)

We are also participating in the British Library's e-journals pilot project, and plan to deposit copies of all articles with the British Library.

BioMed Central is a participant in the LOCKSS (Lots of Copies Keep Stuff Safe) initiative. LOCKSS will enable any library to maintain their own archive of content from BioMed Central and other publishers, with minimal technical effort and using cheaply available hardware.