J Am Med Inform Assoc 8:80-91 doi:10.1136/jamia.2001.0080080
  • Original Investigation

UMLS Concept Indexing for Production Databases

Table 1

The Results of Concept Matching

                                  Training Set (100 documents)                   Test Set (24 documents)
Match Type                        No. of Matches  Percentage  Distinct Concepts  No. of Matches  Percentage
True positive                              7,227       82.6%              2,268           1,298       76.3%
Redundant UMLS concept                       490        5.6%                209             119        7.0%
Homonym                                      481        5.5%                127              45        2.6%
UMLS general concept missing                 158        1.8%                 86              38        2.2%
Concept not in UMLS                          127        1.5%                 31              42        2.5%
FP, acronym/abbreviation                      83        0.9%                 51              15        0.9%
FN, variant not in UMLS                       41        0.5%                 16              44        2.6%
FN, inferable by context/expert               38        0.4%                 12              31        1.8%
FN, acronym/abbreviation/elision              29        0.3%                  6              37        2.2%
Concept not useful for indexing               25        0.3%                  7               6        0.4%
Too many non-stop-words                       25        0.3%                 25               7        0.4%
FN, spelling/grammar error                     8        0.1%                  8               0        0.0%
FN/FP, proper name                            10        0.1%                 10              19        1.1%
FP, spelling/grammar error                     3        0.0%                  3               0        0.0%
Totals                                     8,745                          2,859           1,701
  • NOTES: The three columns indicate category of match, the number of matches for each category, and the number of distinct concepts matched. FN indicates false negative; FP, false positive. The number of negated concepts in the test set was 110.
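
The percentages in Table 1 appear to be each category's share of all matches in its set. A minimal sketch of that arithmetic follows, assuming this reading is correct; the names TABLE_1, column_total, and percentage are illustrative and not taken from the article, and the counts are transcribed from the table.

```python
# Match counts transcribed from Table 1 as (training set, test set).
# The dictionary name and helper functions are illustrative assumptions.
TABLE_1 = {
    "True positive": (7227, 1298),
    "Redundant UMLS concept": (490, 119),
    "Homonym": (481, 45),
    "UMLS general concept missing": (158, 38),
    "Concept not in UMLS": (127, 42),
    "FP, acronym/abbreviation": (83, 15),
    "FN, variant not in UMLS": (41, 44),
    "FN, inferable by context/expert": (38, 31),
    "FN, acronym/abbreviation/elision": (29, 37),
    "Concept not useful for indexing": (25, 6),
    "Too many non-stop-words": (25, 7),
    "FN, spelling/grammar error": (8, 0),
    "FN/FP, proper name": (10, 19),
    "FP, spelling/grammar error": (3, 0),
}


def column_total(table, column):
    """Sum the match counts of one set (0 = training, 1 = test)."""
    return sum(row[column] for row in table.values())


def percentage(table, category, column):
    """Share of one category among all matches in a set, to one decimal."""
    total = column_total(table, column)
    return round(100 * table[category][column] / total, 1)


if __name__ == "__main__":
    for column, name in enumerate(("training", "test")):
        share = percentage(TABLE_1, "True positive", column)
        print(name, column_total(TABLE_1, column), share)
```

Under this reading, the column totals reproduce the table's totals row, and the true-positive share works out to the reported percentage for each set.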
