JAMIA 2009;16:109-115 doi:10.1197/jamia.M2950
  • Original Investigation
  • Research Paper

Machine Learning and Rule-based Approaches to Assertion Classification

  1. Özlem Uzuner (a, b, c)
  2. Xiaoran Zhang (b)
  3. Tawanda Sibanda (b)
  1. (a) Information Studies, State University of New York, Albany, NY
  2. (b) MIT CSAIL, Cambridge, MA
  3. (c) Computer Engineering, Middle East Technical University, Northern Cyprus Campus, Kalkanli, Guzelyurt, Cyprus
  1. Correspondence: Özlem Uzuner, Draper 114A, 135 Western Ave, Albany, NY 12222; e-mail: <ouzuner{at}albany.edu>
  • Received 6 August 2008
  • Accepted 28 September 2008

Abstract

Objectives The authors study two approaches to assertion classification. One of these approaches, Extended NegEx (ENegEx), extends the rule-based NegEx algorithm to cover alter-association assertions; the other, Statistical Assertion Classifier (StAC), presents a machine learning solution to assertion classification.

Design For each mention of each medical problem, both approaches determine whether the problem, as asserted by the context of that mention, is present, absent, or uncertain in the patient, or associated with someone other than the patient. The authors use these two systems to (1) extend negation and uncertainty extraction to recognition of alter-association assertions, (2) determine the contribution of lexical and syntactic context to assertion classification, and (3) test if a machine learning approach to assertion classification can be as generally applicable and useful as its rule-based counterparts.
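To make the four-way labeling task concrete, the following is a minimal Python sketch of a rule-based classifier in the general spirit of trigger-phrase matching. It is not the ENegEx algorithm: the trigger lists, the function name `classify_assertion`, the four-token look-back, and the priority order among classes are all assumptions made for illustration.

```python
import re

# Hypothetical trigger phrases, chosen for illustration only.
# They are NOT the NegEx or ENegEx trigger lists.
ABSENT_TRIGGERS = ("no", "denies", "without")
UNCERTAIN_TRIGGERS = ("possible", "probable", "rule out")
ALTER_TRIGGERS = ("mother", "father", "brother", "sister", "family history")


def classify_assertion(sentence, problem, window=4):
    """Label one mention of `problem` in `sentence`.

    Returns "present", "absent", "uncertain", or "alter-association",
    or None if the problem is not mentioned in the sentence.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    target = re.findall(r"[a-z]+", problem.lower())
    if not target:
        return None

    # Find the first occurrence of the problem as a token sequence.
    start = next(
        (i for i in range(len(tokens) - len(target) + 1)
         if tokens[i:i + len(target)] == target),
        None,
    )
    if start is None:
        return None

    # Only the tokens preceding the mention are inspected (an assumption).
    context = " " + " ".join(tokens[max(0, start - window):start]) + " "

    def has_trigger(triggers):
        return any(f" {t} " in context for t in triggers)

    # Assumed priority: alter-association, then absent, then uncertain.
    if has_trigger(ALTER_TRIGGERS):
        return "alter-association"
    if has_trigger(ABSENT_TRIGGERS):
        return "absent"
    if has_trigger(UNCERTAIN_TRIGGERS):
        return "uncertain"
    return "present"
```

For example, `classify_assertion("The patient denies chest pain.", "chest pain")` yields `"absent"` under these illustrative triggers, while a sentence with no trigger before the mention defaults to `"present"`.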

Measurements The authors evaluated assertion classification approaches with precision, recall, and F-measure.
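As an illustration of these measures, the sketch below computes per-class precision, recall, and F-measure from parallel lists of gold and predicted labels. The function name `precision_recall_f` is hypothetical, and the balanced F-measure (the harmonic mean of precision and recall) is an assumption, since the abstract does not state the weighting used.

```python
def precision_recall_f(gold, predicted, label):
    """Per-class precision, recall, and balanced F-measure.

    `gold` and `predicted` are equal-length sequences of assertion labels.
    Balanced F (harmonic mean of precision and recall) is assumed here.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)

    # Guard against division by zero when a class is never predicted
    # or never occurs in the gold standard.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall else 0.0
    )
    return precision, recall, f_measure
```

Calling the function once per class (present, absent, uncertain, alter-association) gives a per-class breakdown of classifier performance.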

Results The ENegEx algorithm is a general algorithm that can be directly applied to new corpora. Despite being based on machine learning, StAC can also be applied out-of-the-box to new corpora and achieve similar generality.

Conclusion The StAC models that are developed on discharge summaries can be successfully applied to radiology reports. These models benefit the most from words found in the ±4-word window around the target and can outperform ENegEx.
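A ±4-word window around a target can be turned into lexical features along the following lines. This is only a sketch: the function name `window_features`, the position-tagged feature names (`L1=...`, `R1=...`), and the lowercasing are assumptions, and the abstract does not describe how StAC actually encodes its features.

```python
def window_features(tokens, start, end, k=4):
    """Collect the words within k tokens of the target span.

    `tokens[start:end]` is the target mention. Each feature name records
    the side (L or R), the distance from the target, and the lowercased
    word. This encoding is an illustrative assumption.
    """
    left = tokens[max(0, start - k):start]
    right = tokens[end:end + k]

    features = {}
    # Walk outward from the target so that L1 is the nearest left word.
    for offset, word in enumerate(reversed(left), start=1):
        features[f"L{offset}={word.lower()}"] = 1
    for offset, word in enumerate(right, start=1):
        features[f"R{offset}={word.lower()}"] = 1
    return features
```

A feature dictionary of this kind could then be passed to any standard classifier that accepts sparse binary features.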

    The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.