JAMIA 2003;10:494-503 doi:10.1197/jamia.M1330
  • Original Investigation
  • Research Paper

Creating a Text Classifier to Detect Radiology Reports Describing Mediastinal Findings Associated with Inhalational Anthrax and Other Disorders

  1. Wendy Webber Chapman, PhD,
  2. Gregory F Cooper, MD, PhD,
  3. Paul Hanbury, BS,
  4. Brian E Chapman, PhD,
  5. Lee H Harrison, MD,
  6. Michael M Wagner, MD, PhD
  1. Affiliations of the authors: Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania (WWC, GFC, PH, MMW); RODS Laboratory, University of Pittsburgh, Pittsburgh, Pennsylvania (WWC, GFC, MMW); Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania (BEC); Infectious Diseases Epidemiology Research Unit, Department of Medicine and Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania (LHH)
  1. Correspondence and reprints: Wendy W. Chapman, PhD, Center for Biomedical Informatics, University of Pittsburgh, Suite 8084 Forbes Tower, Pittsburgh, PA 15213; e-mail: <chapman{at}cbmi.upmc.edu>.
  • Received 21 January 2003
  • Accepted 13 May 2003

Abstract

Objective The aim of this study was to create a classifier for automatic detection of chest radiograph reports consistent with the mediastinal findings of inhalational anthrax.

Design The authors used the Identify Patient Sets (IPS) system to create a key word classifier for detecting reports describing mediastinal findings consistent with anthrax and compared the performance of the resulting models on a test set of 79,032 chest radiograph reports.

Measurements Area under the ROC curve was the main outcome measure of the IPS classifier. Sensitivity and specificity of an initial IPS model were calculated based on an existing key word search and were compared against a Boolean version of the IPS classifier.
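As a rough illustration of the measures named above, the sketch below computes sensitivity and specificity for a thresholded (Boolean) classifier and the area under the ROC curve from raw scores. The labels, scores, and threshold are made-up toy values, not data from the study, and the function names are hypothetical; the IPS system itself is not reproduced here.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and Boolean predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive report scores
    higher than a randomly chosen negative one (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 1 = report describes the finding.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.1, 0.8, 0.3, 0.2, 0.1, 0.0]

# A Boolean version of a scoring classifier: flag reports at or above a threshold.
threshold = 0.5
boolean_predictions = [s >= threshold for s in scores]

sens, spec = sensitivity_specificity(labels, boolean_predictions)
area = auc(labels, scores)
```

The Boolean classifier yields a single sensitivity/specificity pair, while the area under the ROC curve summarizes the scoring classifier across all possible thresholds, which is why the two are reported separately.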

Results The IPS classifier achieved an area under the ROC curve of 0.677 (90% CI = 0.628 to 0.772) with a specificity of 0.99 and maximum sensitivity of 0.35. The initial IPS model attained a specificity of 1.0 and a sensitivity of 0.04.

Conclusion The IPS system is a useful tool for helping domain experts create a statistical key word classifier for textual reports. Such a classifier is a potentially useful component in surveillance of radiographic findings suspicious for anthrax.

Footnotes

  • This work was supported in part by NLM training grant T15 LM07059 and CDC grant UPO/CCU 318753-02. The authors thank the physicians who read and classified the reports: John Dowling, Jeremy Espino, Jim Hinderup, Kim Mast, and Stylianos Kakoullis. The authors also thank Zhongwei Lu for his assistance in collecting reports and Melissa Saul for her advice.

  • * Initially, WWC and LHH classified reports together. After classifying a few hundred reports together, WWC classified the reports alone, consulting LHH when necessary.

  • Formula, where TP = 616, FN = 9, SF = 124.65.
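The formula itself did not survive in this footnote. One reading that is consistent with the reported numbers, offered here only as an assumption, is that SF is a sampling factor that scales the false negatives found in a reviewed sample up to the full test set, giving sensitivity = TP / (TP + FN × SF). With the footnote's values this evaluates to about 0.35, which matches the maximum sensitivity reported in the Results. The function name below is hypothetical.

```python
def scaled_sensitivity(tp, fn, sf):
    """Sensitivity with false negatives scaled by a sampling factor.

    ASSUMPTION: the formula TP / (TP + FN * SF) is inferred from the
    footnote's values and the reported sensitivity; it is not quoted
    from the article.
    """
    return tp / (tp + fn * sf)


# Values taken from the footnote.
value = scaled_sensitivity(tp=616, fn=9, sf=124.65)
print(round(value, 2))
```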

The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.