JAMIA 2005;12:486-494 doi:10.1197/jamia.M1767
  • Original Investigation
  • Research Paper

Integrating SNOMED CT into the UMLS: An Exploration of Different Views of Synonymy and Quality of Editing

  1. Kin Wah Fung,
  2. William T Hole,
  3. Stuart J Nelson,
  4. Suresh Srinivasan,
  5. Tammy Powell,
  6. Laura Roth
  • Affiliations of the authors: National Library of Medicine, Bethesda, MD (KWF, WTH, SJN, TP); and MSD Inc., Vienna, VA (SS, LR)
  • Correspondence and reprints: Kin Wah Fung, MD, Building 38A, Room 9N904, MS54, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894; e-mail: <kwfung@nlm.nih.gov>
  • Received 10 December 2004
  • Accepted 15 February 2005

Abstract

Objective: The integration of SNOMED CT into the Unified Medical Language System (UMLS) involved the alignment of two views of synonymy that were different because the two vocabulary systems have different intended purposes and editing principles. The UMLS is organized according to one view of synonymy, but its structure also represents all the individual views of synonymy present in its source vocabularies. Despite progress in knowledge-based automation of development and maintenance of vocabularies, manual curation is still the main method of determining synonymy. The aim of this study was to investigate the quality of human judgment of synonymy.

Design: Sixty pairs of potentially controversial SNOMED CT synonyms were reviewed by 11 domain vocabulary experts (six UMLS editors and five noneditors), and scores were assigned according to the degree of synonymy.

Measurements: The synonymy scores of each subject were compared with the gold standard (the overall mean synonymy score of all subjects) to assess accuracy. Agreement between UMLS editors and noneditors was measured by comparing the mean synonymy scores of editors with those of noneditors.

Results: Average accuracy was 71% for UMLS editors and 75% for noneditors (difference not statistically significant). Mean scores of editors and noneditors showed significant positive correlation (Spearman's rank correlation coefficient 0.654, two-tailed p < 0.01) with a concurrence rate of 75% and an interrater agreement kappa of 0.43.

Conclusion: The accuracy in the judgment of synonymy was comparable for UMLS editors and nonediting domain experts. There was reasonable agreement between the two groups.
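For readers unfamiliar with the agreement statistics named in the abstract, the following is a minimal sketch of how a Spearman rank correlation, a concurrence rate, and a kappa statistic can be computed for two groups of raters. It is not the authors' analysis code. The score values, the 1-to-5 scale, the 3.0 cut-off used to dichotomize mean scores into "synonymous" and "not synonymous", and all function names are assumptions made for illustration; the abstract does not specify them. Kappa is computed here as Cohen's kappa, which is one common choice.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def concurrence_rate(a, b):
    """Fraction of items on which the two label sequences agree."""
    return sum(p == q for p, q in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = concurrence_rate(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical mean synonymy scores per term pair (assumed 1-5 scale).
    editor_means = [4.2, 1.5, 3.8, 2.0, 4.9, 2.7]
    noneditor_means = [3.9, 2.1, 4.4, 1.2, 4.6, 3.3]

    # Hypothetical cut-off: a mean of 3.0 or more counts as "synonymous".
    editor_labels = [score >= 3.0 for score in editor_means]
    noneditor_labels = [score >= 3.0 for score in noneditor_means]

    print("Spearman rho:", spearman(editor_means, noneditor_means))
    print("Concurrence:", concurrence_rate(editor_labels, noneditor_labels))
    print("Cohen's kappa:", cohen_kappa(editor_labels, noneditor_labels))
```

The correlation is computed on the continuous mean scores, while the concurrence rate and kappa need categorical judgments, which is why the sketch dichotomizes the scores first. Kappa is usually lower than the raw concurrence rate because it discounts the agreement expected by chance.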

Footnotes

  • Supported in part by an appointment to the NLM Research Participation Program sponsored by the National Library of Medicine and administered by the Oak Ridge Institute for Science and Education.

  • The authors thank Betsy Humphreys, Olivier Bodenreider, and James Cimino for their advice and suggestions in the preparation of the manuscript. They also thank Olivier Bodenreider, James Cimino, Alexander Yu, and the UMLS clinical editors for completing the synonymy questionnaire.

The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.