JAMIA 2007;14:212-220 doi:10.1197/jamia.M2191
  • Original Investigation
  • Research paper

A Day in the Life of PubMed: Analysis of a Typical Day’s Query Log

  1. Jorge R Herskovic (a),
  2. Len Y Tanaka (a, b),
  3. William Hersh (c),
  4. Elmer V Bernstam (a, d)
  1. (a) University of Texas School of Health Information Sciences at Houston, Houston, TX
  2. (b) Department of Pediatrics, Division of Pediatric Critical Care, University of Texas School of Medicine at Houston, Houston, TX
  3. (c) Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR
  4. (d) Department of Internal Medicine, Division of General Internal Medicine, University of Texas School of Medicine at Houston, Houston, TX
  1. Correspondence and reprints: Dr. Elmer V. Bernstam, University of Texas School of Health Information Sciences at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030; (e-mail: <Elmer.V.Bernstam{at}uth.tmc.edu>)
  • Received 1 July 2006
  • Accepted 6 December 2006

Abstract

Objective To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines.

Design We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day.

Measurements We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, and MeSH categories. We also used semantic measurements to group queries into sessions and studied the addition and removal of terms from consecutive queries to gauge search strategies.
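
As an illustration only, several of the lexical measurements listed above (queries per user, terms per query, common terms, Boolean operator use) could be computed along the following lines. This is a minimal sketch, not the authors' method: the log format (pairs of user identifier and query string), the whitespace tokenization, the rule that only uppercase AND, OR, and NOT count as Boolean operators, and the function name `summarize_queries` are all assumptions made for the example.

```python
from collections import Counter

# Assumption: only these uppercase tokens are treated as Boolean operators.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def summarize_queries(log):
    """Summarize a query log given as (user_id, query_string) pairs.

    The input format is hypothetical; a real log would need its own parser.
    """
    queries_per_user = Counter()
    term_counts = Counter()
    total_terms = 0
    boolean_queries = 0
    n_queries = 0

    for user_id, query in log:
        tokens = query.split()  # naive whitespace tokenization (assumption)
        n_queries += 1
        queries_per_user[user_id] += 1

        # A query "uses Boolean operators" if any token is an operator.
        if any(token in BOOLEAN_OPERATORS for token in tokens):
            boolean_queries += 1

        # Operators are not counted as search terms.
        terms = [t.lower() for t in tokens if t not in BOOLEAN_OPERATORS]
        total_terms += len(terms)
        term_counts.update(terms)

    n_users = len(queries_per_user)
    return {
        "queries": n_queries,
        "distinct_users": n_users,
        "mean_queries_per_user": n_queries / n_users if n_users else 0.0,
        "mean_terms_per_query": total_terms / n_queries if n_queries else 0.0,
        "boolean_query_fraction": boolean_queries / n_queries if n_queries else 0.0,
        "common_terms": term_counts.most_common(10),
    }


if __name__ == "__main__":
    # Made-up example queries, for demonstration only.
    example_log = [
        ("u1", "breast cancer AND tamoxifen"),
        ("u1", "asthma"),
        ("u2", "diabetes OR obesity NOT child"),
    ]
    for name, value in summarize_queries(example_log).items():
        print(name, value)
```

Grouping queries into sessions and mapping terms to MeSH categories need semantic resources and are outside the scope of this sketch.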

Results The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms.

Conclusion PubMed’s usage profile should be considered when educating users, building user interfaces, and developing future biomedical information retrieval systems.

Footnotes

  • Supported in part by a training fellowship from the W. M. Keck Foundation to the Gulf Coast Consortia through the Keck Center for Computational and Structural Biology, NLM grant 5K22LM008306 and NCRR grant 1UL1RR024148.

The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.