JAMIA 2008;15:496-505 doi:10.1197/jamia.M2599
  • Focus on Consumer Health Informatics
  • Research Paper

Consumer Health Concepts That Do Not Map to the UMLS: Where Do They Fit?

  1. Alla Keselman (a,b),
  2. Catherine Arnott Smith (c),
  3. Guy Divita (a,d),
  4. Hyeoneui Kim (e),
  5. Allen C Browne (a),
  6. Gondy Leroy (f),
  7. Qing Zeng-Treitler (e)
  1. (a) Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD
  2. (b) Aquilent, Inc., Laurel, MD
  3. (c) School of Library and Information Studies, University of Wisconsin, Madison, WI
  4. (d) Lockheed Martin, Inc., Bethesda, MD
  5. (e) Decision Systems Group, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
  6. (f) School of Information Systems and Technology, Claremont Graduate University, Claremont, CA
  1. Correspondence: Alla Keselman, PhD, MA, 7S713E, LHNCBC, National Library of Medicine, NIH, 8600 Rockville Pike, Bethesda, MD; e-mail: <keselmana{at}mail.nih.gov>
  • Received 20 August 2007
  • Accepted 8 February 2008

Abstract

Objective This study has two objectives: first, to identify and characterize consumer health terms not found in the Unified Medical Language System (UMLS) Metathesaurus (2007 AB); second, to describe the procedure for creating new concepts in the process of building a consumer health vocabulary. How do the unmapped consumer health concepts relate to the existing UMLS concepts? What is the place of these new concepts in professional medical discourse?

Design The consumer health terms were extracted from two large corpora derived in the process of Open Access Collaboratory Consumer Health Vocabulary (OAC CHV) building. Terms that could not be mapped to existing UMLS concepts via machine and manual methods prompted creation of new concepts, which were then ascribed semantic types, related to existing UMLS concepts, and coded according to specified criteria.

Results This approach identified 64 unmapped concepts, 17 of which were labeled as uniquely “lay” and not feasible for inclusion in professional health terminologies. The remaining terms constituted potential candidates for inclusion in professional vocabularies, or could be constructed by post-coordinating existing UMLS terms. The relationship between new and existing concepts differed depending on the corpora from which they were extracted.

Conclusion Non-mapping concepts constitute a small proportion of consumer health terms, but a proportion that is likely to affect the process of consumer health vocabulary building. We present a novel approach for identifying such concepts.

Footnotes

  • This project was supported by National Institutes of Health (NIH) grant R01 LM07222, the Intramural Research Program of the NIH, NLM, and the 2003 Donald A.B. Lindberg Research Fellowship sponsored by the Medical Library Association. The authors thank the National Library of Medicine (NLM) for sharing the MedlinePlus® query log data. The authors thank Sergey Goryachev for his technical help with the project.

  • * In the case of Set A, the process required a preliminary step of extracting a larger pool of high-frequency n-grams and manually reviewing them for “termhood”, since many high-frequency machine-extracted n-grams were not deemed true health terms (e.g., “treated with”).
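
The preliminary step described in the footnote above (pooling high-frequency n-grams, then manually screening them for "termhood") might be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the regex tokenizer, the `max_n` and `min_count` thresholds, and the `rejected` set standing in for human review are all hypothetical and are not taken from the study, which does not specify its tooling here.

```python
import re
from collections import Counter


def extract_ngrams(text, n):
    """Return all word n-grams in text as lowercase, space-joined strings."""
    # Simplistic tokenizer (assumption): runs of letters, digits, apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def high_frequency_ngrams(documents, max_n=3, min_count=2):
    """Count 1..max_n-grams over all documents; keep those seen >= min_count times."""
    counts = Counter()
    for doc in documents:
        for n in range(1, max_n + 1):
            counts.update(extract_ngrams(doc, n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


def apply_manual_review(candidates, rejected):
    """Drop candidates that a human reviewer judged not to be true health terms.

    `rejected` is a hypothetical stand-in for the manual "termhood" review.
    """
    return {gram: c for gram, c in candidates.items() if gram not in rejected}


if __name__ == "__main__":
    docs = [
        "She was treated with antibiotics for a sinus infection.",
        "He was treated with rest after a sinus infection.",
    ]
    pool = high_frequency_ngrams(docs, max_n=2, min_count=2)
    kept = apply_manual_review(pool, rejected={"treated with", "was treated"})
    print(sorted(kept))
```

In this toy input, a phrase such as "treated with" is frequent enough to enter the candidate pool, which is why a separate review step is needed before any candidate is treated as a health term.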

The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.