Life as Clay

How to query PubMed and retrieve XML results


Returning a single result as XML seems like it should be a simple thing to figure out on the PubMed website. If you visit PubMed, there is a “Display” link at the top left that allows you to view an entry in XML format, but when you select that option, the URL reverts to the vanilla PubMed URL, so the address bar gives you no link to the XML view. Anyhow, let’s say you go to PubMed and look at a particular article:

http://www.ncbi.nlm.nih.gov/pubmed/20598978

You can retrieve the same result in XML by simply going to this URL:

http://www.ncbi.nlm.nih.gov/pubmed/20598978?report=XML
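If you want to build that link for any article rather than typing it by hand, a minimal Ruby sketch could look like the following. The helper name `pubmed_report_url` is my own invention, and it simply assumes the `?report=` URL pattern shown above; it does not check that PubMed still honors it.

```ruby
require "uri"

# Hypothetical helper: builds the browser-facing PubMed URL for one article.
# Assumes the /pubmed/<PMID>?report=<FORMAT> pattern described in this post.
def pubmed_report_url(pmid, report = "XML")
  uri = URI("http://www.ncbi.nlm.nih.gov/pubmed/#{pmid}")
  uri.query = URI.encode_www_form(report: report)
  uri.to_s
end

puts pubmed_report_url(20598978)
```

Passing a different second argument swaps in another report format, assuming it is one of the formats PubMed documents.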

Other formats are available and outlined on the PubMed help site. The help files on the site are not particularly easy to search, so it took me forever to find this info.

The URL above returns a page designed to display XML properly in the browser, not XML meant to be parsed by Ruby or another language. For that, you have to use the NCBI E-utilities (eutils). The proper link for returning XML that you can parse with nokogiri or another gem is:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=20598978&retmode=xml
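Here is a rough sketch of how that efetch link might be used from Ruby. To keep it dependency-free I am using REXML, which ships with Ruby, instead of nokogiri. The method names (`efetch_url`, `fetch_pubmed_xml`, `article_titles`) are made up for this example, the `https` scheme is my assumption (the link above uses `http`), and the small XML string is a hand-written stand-in with a placeholder title, not a real response. The only thing I am relying on from the real output is that article titles live in `ArticleTitle` elements under `PubmedArticle`.

```ruby
require "uri"
require "net/http"
require "rexml/document"

# Build the efetch URL for a single PubMed ID.
# Assumption: https instead of the http scheme used in the link above.
def efetch_url(pmid)
  uri = URI("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi")
  uri.query = URI.encode_www_form(db: "pubmed", id: pmid, retmode: "xml")
  uri
end

# Download the raw XML as a string (makes a network request when called).
def fetch_pubmed_xml(pmid)
  Net::HTTP.get(efetch_url(pmid))
end

# Pull the text of every ArticleTitle element out of an efetch-style document.
# Note: `text` returns only the first text node, so a title that contains
# inline markup would come back truncated.
def article_titles(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//PubmedArticle//ArticleTitle").map(&:text)
end

# Hand-written stand-in for a response; the title is a placeholder.
sample = <<~XML
  <PubmedArticleSet>
    <PubmedArticle>
      <MedlineCitation>
        <PMID>20598978</PMID>
        <Article>
          <ArticleTitle>Placeholder title</ArticleTitle>
        </Article>
      </MedlineCitation>
    </PubmedArticle>
  </PubmedArticleSet>
XML

puts efetch_url(20598978)
puts article_titles(sample).inspect
```

To run it against the live service you would call `article_titles(fetch_pubmed_xml(20598978))`. If you prefer nokogiri, `Nokogiri::XML(xml).xpath("//ArticleTitle")` plays the same role as the REXML lookup.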

Written by Clay

July 6, 2010 at 13:07

Posted in Public Health, Technology
