Phase II Evaluation of Clinical Coding Schemes
Completeness, Taxonomy, Mapping, Definitions, and Clarity
- James R Campbell,
- Paul Carpenter,
- Charles Sneiderman,
- Simon Cohn,
- Christopher G Chute,
- Judith Warren, for the CPRI Work Group on Codes and Structures
- Affiliations of the authors: Departments of Internal Medicine and Information Technology Services (JRC) and College of Nursing and University Hospital (JW), University of Nebraska Medical Center, Omaha, NE; Section of Medical Information Resources, Mayo Foundation, Rochester, MN (PC, CGC); Cognitive Science Branch, National Library of Medicine, Bethesda, MD (CS); Kaiser Permanente Medical Care Program, Oakland, CA (SC)
- Correspondence and reprints: James R. Campbell MD, Department of Internal Medicine, University of Nebraska MC, 600 South 42nd Street, Omaha, NE 68198-3331. E-mail:
- Received 1 November 1996
- Accepted 21 January 1997
Objective To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate).
Methods The authors assembled 1929 source concept records from a variety of clinical information gathered at four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored for acceptability by an independent panel of clinicians. Codes were checked for definitions provided with the scheme. For a random sample of source records, an investigator analyzed the assigned codes for “parent” and “child” codes within the scheme. Parent and child pairs were scored for clinical acceptability by an independent panel of medical informatics specialists. Administrative and billing code mappings from each published scheme were reviewed for all coded records and analyzed for accuracy by independent reviewers. The investigator for each scheme exhaustively searched a sample of coded records for duplications.
Results SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70%; READ 57%; UMLS 50%; *p < .00001). SNOMED also had a richer clinical taxonomy, as judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14; *p < .005). Only the UMLS provided any definitions; these were found for 49% of records that had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6%; UMLS* 36.1%; SNOMED 20.7%; *p < .00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0%; UMLS 4.2%; SNOMED* 13.9%; *p < .004), associated with a loss of clarity.
Conclusion No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature, and has a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and from evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), but it is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies, and it provides definitions for many of its terms. However, because its source schemes vary in granularity and purpose, it has limitations for representing clinical concepts within a computer-based patient record.