J Am Med Inform Assoc 2009;16:690-704 doi:10.1197/jamia.M3162
  • Original Investigation
  • Research Paper

Cross-Topic Learning for Work Prioritization in Systematic Review Creation and Update

  1. Aaron M Cohen, MD, MS,
  2. Kyle Ambert,
  3. Marian McDonagh, PharmD
  1. Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, OR
  1. Correspondence: Aaron M. Cohen, Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Road, Mail Code: BICC, Portland, OR 97239-3098 (Email: cohenaa{at}ohsu.edu).
  • Received 2 February 2009
  • Accepted 29 May 2009

Abstract

Objective Machine learning systems can aid experts performing systematic reviews (SRs) by automatically ranking journal articles for work prioritization. This work investigates whether a topic-specific automated document ranking system for SRs can be improved using a hybrid approach that combines topic-specific training data with data from other SR topics.

Design A test collection was built using annotated reference files from 24 systematic drug class reviews. A support vector machine learning algorithm was evaluated with cross-validation, using seven different fractions of topic-specific training data in combination with samples from the other 23 topics. This approach was compared to both a baseline system, which used only topic-specific training data, and to a system using only the nontopic data sampled from the remaining topics.
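
As an illustration of the hybrid design described above, the following is a minimal sketch of how a training set might be assembled from a fraction of the topic-specific data plus a sample drawn from the other topics. The function name, its parameters, and the uniform random sampling are assumptions made for illustration; the abstract does not specify how the nontopic samples were drawn or how the support vector machine was configured.

```python
import random


def build_hybrid_training_set(topic_docs, nontopic_docs, topic_fraction,
                              nontopic_sample_size, seed=0):
    """Combine part of one topic's labeled documents with documents
    sampled from other topics.

    Hypothetical helper: uniform sampling without replacement is an
    assumption, not the authors' documented procedure.
    """
    if not 0.0 <= topic_fraction <= 1.0:
        raise ValueError("topic_fraction must be between 0 and 1")
    rng = random.Random(seed)

    # Fraction of the topic-specific documents available for training.
    n_topic = round(topic_fraction * len(topic_docs))
    topic_part = rng.sample(topic_docs, n_topic)

    # Sample drawn from the pooled documents of the remaining topics.
    n_nontopic = min(nontopic_sample_size, len(nontopic_docs))
    nontopic_part = rng.sample(nontopic_docs, n_nontopic)

    return topic_part + nontopic_part


# Toy data: (document id, label) pairs; label 1 means included in the review.
topic = [("t%d" % i, i % 2) for i in range(10)]
others = [("o%d" % i, i % 3 == 0) for i in range(50)]

hybrid = build_hybrid_training_set(topic, others, 0.5, 20)
baseline = build_hybrid_training_set(topic, others, 0.5, 0)
nontopic_only = build_hybrid_training_set(topic, others, 0.0, 20)
```

In this sketch, setting `nontopic_sample_size` to 0 corresponds to the baseline system (topic-specific data only), and setting `topic_fraction` to 0 corresponds to the nontopic system.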

Measurements Mean area under the receiver operating characteristic curve (AUC) was used as the measure of comparison.
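
For readers unfamiliar with the measure, AUC for a ranking can be computed as the probability that a randomly chosen included article is scored above a randomly chosen excluded one, with ties counted as one half. The sketch below shows that pairwise calculation and its average across topics. The function names and toy scores are hypothetical, and this is not necessarily how the authors computed their figures.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of
    positive and negative scores (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_topic):
    """Average AUC over a list of (scores, labels) pairs, one per topic."""
    values = [auc(scores, labels) for scores, labels in per_topic]
    return sum(values) / len(values)


# Toy example with two topics (made-up scores, for illustration only).
topic_a = ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
topic_b = ([0.7, 0.6, 0.2, 0.1], [1, 1, 0, 0])
result = mean_auc([topic_a, topic_b])
```

The pairwise form is quadratic in the number of documents, which is acceptable for a small illustration; a rank-based formulation would be preferable for large reference files.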

Results On average, the hybrid system improved mean AUC over the baseline system by 20% when topic-specific training data were scarce. The hybrid system performed significantly better than the baseline system at all levels of topic-specific training data. In addition, it performed better than the nontopic system at all but the two smallest fractions of topic-specific training data, and no worse than the nontopic system at those two smallest fractions.

Conclusions Automated literature prioritization could be helpful in assisting experts to organize their time when performing systematic reviews. Future work will focus on extending the algorithm to use additional sources of topic-specific data, and on embedding the algorithm in an interactive system available to systematic reviewers during the literature review process.

Footnotes

  • This work was supported by grant 1R01LM009501-01 from the National Library of Medicine.
