JAMIA 2006;13:676-690 doi:10.1197/jamia.M2036
  • Original Investigation
  • Research Paper

Auditing as Part of the Terminology Design Life Cycle

  1. Hua Min,
  2. Yehoshua Perl,
  3. Yan Chen,
  4. Michael Halper,
  5. James Geller,
  6. Yue Wang
  • Affiliations of the authors: Computer Science Department, New Jersey Institute of Technology, Newark, NJ (HM, YP, YC, JG, YW); Mathematics and Computer Science Department, Kean University, Union, NJ (MH); Department of Computer Information Systems, BMCC, The City University of New York, New York, NY (YC); Medical Information Systems Unit, Boston University School of Medicine, Boston Medical Center, Boston, MA (HM)
  • Correspondence and reprints: Yehoshua Perl, Computer Science Department, NJIT, University Heights, Newark, NJ 07102; email: <perl{at}oak.njit.edu>
  • Received 16 December 2005
  • Accepted 16 July 2006

Abstract

Objective To develop and test an auditing methodology for detecting errors in medical terminologies satisfying systematic inheritance. This methodology is based on various abstraction taxonomies that provide high-level views of a terminology and highlight potentially erroneous concepts.

Design Our auditing methodology is based on dividing the concepts of a terminology into smaller, more manageable units. First, we divide the terminology’s concepts into areas according to their relationships/roles. Then each multi-rooted area is further divided into partial-areas (p-areas) that are singly-rooted. Each p-area contains a set of structurally and semantically uniform concepts. Two kinds of abstraction networks, called the area taxonomy and the p-area taxonomy, are derived. These taxonomies form the basis for the auditing approach, since they tend to highlight potentially erroneous concepts in areas and p-areas. Guided by two hypotheses on the probable concentration of errors, human reviewers can focus their auditing efforts on a limited number of problematic concepts.
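The partitioning described in the Design paragraph can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that the abstract does not spell out: that an area groups concepts having exactly the same set of roles, that a root of an area is a concept with no parent inside that area, and that a p-area consists of one root together with its descendants within the area. The concept names (`A` to `E`), the role names (`r1`, `r2`), and all function names are hypothetical.

```python
from collections import defaultdict

def build_areas(roles):
    """Group concepts by their exact set of roles (assumed area rule)."""
    areas = defaultdict(set)
    for concept, role_list in roles.items():
        areas[frozenset(role_list)].add(concept)
    return dict(areas)

def area_roots(area, parents):
    """Concepts of an area that have no parent inside the same area."""
    return {c for c in area if not (set(parents.get(c, ())) & area)}

def build_p_areas(area, parents):
    """Map each root of an area to the set of its descendants in that area."""
    # Child links restricted to the area.
    children = defaultdict(set)
    for c in area:
        for p in parents.get(c, ()):
            if p in area:
                children[p].add(c)
    p_areas = {}
    for root in area_roots(area, parents):
        members, stack = set(), [root]
        while stack:  # depth-first walk from the root
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(children[node])
        p_areas[root] = members
    return p_areas

# Hypothetical toy terminology: concept -> roles, concept -> parents (IS-A).
roles = {"A": [], "B": ["r1"], "C": ["r1"], "D": ["r1"], "E": ["r1", "r2"]}
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}

areas = build_areas(roles)
r1_area = areas[frozenset({"r1"})]        # concepts B, C and D
p_areas = build_p_areas(r1_area, parents)  # one p-area per root (B and C)
```

In this toy example the area for role set {r1} has two roots, so it splits into two p-areas, and concept `D` falls into both of them. Under these assumptions, p-areas of one area may overlap when a concept has several parents.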

Results Applying our methodology to the concepts of the Biological Process (BP) hierarchy of the National Cancer Institute Thesaurus (NCIT) yielded a sample of its area taxonomy and p-area taxonomy. These views led to the detection of several different kinds of errors, which are reported, and confirmed the hypotheses on error concentration in this hierarchy.

Conclusion Our auditing methodology based on area and p-area taxonomies is an efficient tool for detecting errors in terminologies satisfying systematic inheritance of roles, and thus facilitates their maintenance. This methodology concentrates a domain expert’s manual review on portions of the concepts with a high likelihood of errors.

Footnotes

  • * A capitalized italic font is used for concepts. Role names will be italicized and start with a lowercase letter.

  • † Corrected by Nicole Thomas, an NCIT editor, following our report.

  • †† Nicole Thomas, personal communication.


The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.