Wednesday, January 08, 2014

Search Engines - Google Scholar

Eli forgets where, but he got trapped in a discussion of search engines, citation analysis and open source publishing.  While Eli is a smart bunny, he don't know it all, but he knows some, probably more than most about these things, so what follows is the first of some brief summaries for beginners.  Hopefully this will eventually cover Google Scholar, SciFinder, Web of Science, Scopus, INSPEC and ProQuest.  Anybunny who wants to take a chance on any of these or others, feel free.  Talking about free, the place to start is obviously:

Google Scholar:  Google Scholar is a free offering from Google.  Its strength is that it will find (pretty much) anything on the net: journals, books, conference proceedings, etc.  The disadvantage is that it misses (pretty much) everything that is not on the net, so it gets weaker the further back in time you go.  Google Scholar is catholic: it searches across all fields.

A typical entry reads

Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania

…, LM Pedriña, PJ Piper, RJ Rabett… - Proceedings of the …, 2007 - National Acad Sciences
Abstract Human settlement of Oceania marked the culmination of a global colonization
process that began when humans first left Africa at least 90,000 years ago. The precise
origins and dispersal routes of the Austronesian peoples and the associated Lapita ...
Cited by 124 Related articles All 23 versions Cite Save

If you are lucky there is something off to the side like an [HTML] link, which will take a bunny directly to an open source for the article.

The Cited by 124 leads to 124 other entries which cite the article you found.  This is the citation time machine.  It takes you to articles on (vaguely) the same topic, but published after the one you are looking at.

Related Articles are ones that have been cited by the original article or that Google thinks should have been cited by the original article, or have appeared later and would have been cited according to Google.  Again a help when one is researching a topic.

The All 23 versions link brings you to a page which lists all the other pages where the original article can be found (often behind paywalls).  For a paywalled article this is a good place to shop for an open version, but most of the links are to collections of abstracts, which can be frustrating, especially if the abstract collection links back to the paywall.

Cite brings up a pop up

Copy and paste a formatted citation or use one of the links to import into a bibliography manager.
MLA  Larson, Greger, et al. "Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania." Proceedings of the National Academy of Sciences 104.12 (2007): 4834-4839.
APA  Larson, G., Cucchi, T., Fujita, M., Matisoo-Smith, E., Robins, J., Anderson, A., ... & Dobney, K. (2007). Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania. Proceedings of the National Academy of Sciences, 104(12), 4834-4839.
Chicago  Larson, Greger, Thomas Cucchi, Masakatsu Fujita, Elizabeth Matisoo-Smith, Judith Robins, Atholl Anderson, Barry Rolett et al. "Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania." Proceedings of the National Academy of Sciences 104, no. 12 (2007): 4834-4839.

If anybunny is serious about science they need a reference manager, something that allows organization of references, insertion into documents and general all-around avoidance of aggro.  BibTeX grew out of LaTeX, something Eli avoids with a passion.  If the Bunny wanted to be a printer he would have gone into the family business.  However, it is free and there are now interfaces to Word and OpenOffice.

EndNote, the one Eli uses, is sold by Thomson Reuters at a huge markup, $250, but at ~half price to students and others associated with universities, $113.  There is a web based version.

RefMan is another Thomson Reuters product; costs are about the same.  Eli knows nothing about it.

RefWorks is web based.  They sell annual licenses to individuals ($70) and organizations.  It is surprisingly hard to find out where you can get a license.

Save offers you a place to save a reference you are interested in.  It is possible to label categories of papers so that the database is not flat.

Google Scholar also has an interesting front end, Anne-Wil Harzing's Publish or Perish, more oriented towards citation analysis than searching, but nonetheless very useful for searching Google Scholar, especially for work by a particular person.  Publish or Perish has an excellent page on the meaning of various indices, starting with the original h-index:
Proposed by J.E. Hirsch in his paper An index to quantify an individual's scientific research output, arXiv:physics/0508025 v5 29 Sep 2005. It aims to provide a robust single-number metric of an academic's impact, combining quality with quantity.
The h number is the largest number, h, such that a bunny has h publications which each have at least h citations.  Over 30 is good, over 50 is superbunny.  It depends on the field of course and, because of the different coverage of different search engines, it depends on the search engine.  The secret sin of academics is tracking their h number and those of the ones they hate.
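
For anybunny who wants to see the arithmetic, here is a minimal sketch in Python. It assumes you already have a plain list of citation counts, one per paper; the function name h_index and the sample numbers are made up for illustration and do not come from Google Scholar or Publish or Perish.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Every later paper has even fewer citations, so stop here.
            break
    return h


# Hypothetical citation counts for seven papers (illustrative only).
papers = [124, 40, 12, 7, 5, 3, 0]
print(h_index(papers))  # prints 5
```

Note that one heavily cited paper (the 124 above) moves the h number no more than a paper with 5 citations would, which is why the result depends so much on how many papers a search engine finds.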

Oh yeah, Google Scholar also does a citation analysis, but only for yourself, which can be made public or not.


Anonymous said...

Google Scholar also offers to export your citation to bibtex (free bibliography fun for Latexers), though this oftentimes leads to some strange citations. I typically use Scholar for the search, and grab the correct citation from


Martin Vermeer said...

I hear Mendeley is good too

Anonymous said...

Rib Smokin' bunny haz a sad, his h-index is only 21.

And Then There's Physics said...

How can you not like LaTeX, it's so much easier than anything else - assuming you have equations in your paper, I guess.

An advantage of Google Scholar seems to be that even if it misses some citations, it ends up giving your papers more than most other databases.

Jim Prall said...

There are three web-centric citation managers worth a look:
- stand-alone application with web tools, user accounts for cloud storage
- browser plug-in with user accounts for cloud storage of ref data. Has special magic to gather biblio data from any hit on library websites of majors incl. many university libraries.
- document manager/enhanced PDF reader (figures out where footnotes point, etc.) with citation management features too.
Zotero (and Mendeley?) support collaboration with other users to share a citation database.

Jim Prall said...

Science Magazine has this on how Google Scholar can learn your research interests and refine suggested new articles to read, as well as the importance its citation metrics are taking on and whether they can easily be gamed. It also notes the new rival service Microsoft has started, as well as the intriguing "Publish or Perish" tool for citation analysis.

EliRabett said...

Jim, Martin - care to write up something on using these reference managers? - Eli

EliRabett said...

attp - as Eli said, if he wanted to be a printer he would have been a printer. The time cost of LaTeX eliminates any advantages, and having come from the .runoff generation Eli really appreciates a WYSIWYG word processor.

Paul said...

Waaaaay back in grad school, I started using a reference manager called "Notebook", later renamed to "Notebuilder", both published by a California-based firm called "ProTem". The last update that I have was in 1994, and ProTem seems to have vanished shortly after that.

But the software still did what I needed, and I continued using it. However, Notebuilder doesn't seem to work with anything newer than Windows Vista, and I'll soon be replacing my one machine that still runs it.

I've got a big enough investment in my database (>10,000 items) that I can't afford to start over with something new. And I'm close enough to retirement that this would be silly anyhow. Perhaps the best solution would be to find a way of exporting my existing database to a newer program.

Any suggestions on what options might be out there?

And Then There's Physics said...

Eli, yes, LaTeX does have its complications :-) I've used it for a long time so am very used to it, and quite like it.

I, however, made the mistake this year of offering to convert a colleague's exam questions from Word into LaTeX (yes, we use LaTeX for our undergraduate exams). His questions weren't going to have many equations, so it should have been easy. I didn't count on the figures, which were all in the wrong format. Took all afternoon, rather than just half an hour or so :-)

rdbrown said...

For Paul:
A Google search for "protem notebuilder linux" finds 3 results, for Biblioscape 7, 8, and 9, which give instructions for exporting from Notebuilder as text for import into Biblioscape.

You should be able to create a copy of an XP environment as a virtual machine to keep Notebuilder. Running it under Wine on Linux in a virtual machine could be another alternative.

Martin Vermeer said...

Eli, that would basically be a set of links to good tutorials. And I'm not actually using either, though I have taken JabRef (and Zotero) for a spin... it's raw BibTeX for me, with emacs + bibtex-mode. Don't ask a nerd to teach muggles.

BTW using the LaTeX/LyX tool chain is a sunk investment for me, giving me typographic excellence for free. But, appreciating that enough is a personal choice, like it is for great sex ;-)

Marion Delgado said...

I've used Zotero and evangelized it to all my friends still in academia.

carrot eater said...

I hate Google Scholar's interface (Web of Science just seems easier to use for me). I don't like that clicking on the result takes you straight to the journal's website. Maybe others like that, but I don't.

Russell Seitz said...

How does the h metric compare with the Academia and ResearchGate impact and citation metrics?

And Then There's Physics said...

Eli, are you aware of the Riq index? I know very little about it, but was told that it somehow compensates for age (by which I mean research age, rather than actual age). One issue with the h-index is that it typically rises with time. The Riq index is meant to somehow compensate for that, but I haven't had a chance to work out how. Some/most databases don't seem to actually include it.

EliRabett said...

Took a look at the arXiv article. Strikes Eli as too much work for the benefit. The major problem is that it depends on finding an average number of cites for a particular sub-field (in the test case astronomy) and, besides being non-obvious as to how to do this, it is subject to gaming. The various divide-by-the-number-of-co-authors strategies of course fail on HEP papers, which might have a couple of hundred co-authors.

And Then Theres Physics said...

Eli, thanks. I'll have a look at it myself, but you're probably right. For someone like myself though, dividing by the number of co-authors is a good thing :-)

Trakar said...

The older version of Google Scholar's advanced search engine was marvelous so long as you had some skill at using search engines. Unfortunately this was removed 1-2 years ago, and there is no advanced search feature any longer. The neutered version currently available is, I guess, better than nothing, but only barely. I generally stick to university access, but this isn't always convenient, and I actually prefer advanced Google search to Google Scholar in such situations as it allows more precise searches.

EliRabett said...

It is somewhat backwards, but the advanced search is buried. First you have to be logged in to Google something. Then the search box in the middle of the page will have a downward pointing triangle. Mouse over that and click to get to the advanced search.

Another way is to use Harzing's Publish or Perish (freeware) as a front end.