J Am Med Inform Assoc 2000;7:42-54 doi:10.1136/jamia.2000.0070042
  • Original Investigation
  • Research Paper

Exploring the Degree of Concordance of Coded and Textual Data in Answering Clinical Queries from a Clinical Data Repository

  1. H David Stein,
  2. Prakash Nadkarni,
  3. Joseph Erdos,
  4. Perry L Miller
  1. Affiliation of the authors: Yale University School of Medicine, New Haven, Connecticut
  1. Correspondence and reprints: H. David Stein, MD, Yale University School of Medicine, Center for Medical Informatics, 333 Cedar Street, P.O. Box 208009, New Haven, CT 06520-8009; e-mail: 〈hdavid.stein{at}yale.edu〉
  • Received 31 March 1999
  • Accepted 16 August 1999

Abstract

Objective To query a clinical data repository (CDR) for answers to clinical questions, to determine whether different types of fields (coded and free text) yield confirmatory, complementary, or conflicting information, and to discuss the issues that produce discrepancies between the fields.

Methods The appropriate data fields in a subset of a CDR (5,135 patient records) were searched for the answers to three questions related to surgical procedures. Each search included at least one coded data field and at least one free-text field. The identified free-text records were then searched manually to ensure correct interpretation. The fields were then compared to determine whether they agreed with each other, were supportive of each other, contained no entry (absence of data), or were contradictory.
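
The four-way comparison described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `compare_fields`, the `NONSPECIFIC` marker set, and the sample values are all hypothetical, and the free-text value is assumed to have already been reduced to a single finding by manual review. Treating a nonspecific entry as "supportive" is likewise an assumption made only for this illustration.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class Concordance(Enum):
    AGREE = "agree"
    SUPPORTIVE = "supportive"
    NO_ENTRY = "no entry"
    CONTRADICTORY = "contradictory"


# Hypothetical markers for an entry that is present but not specific
# enough to confirm or contradict the other field.
NONSPECIFIC = {"unspecified"}


def _normalize(value: Optional[str]) -> str:
    """Lower-case and trim a field value; missing values become ''."""
    return value.strip().lower() if value else ""


def compare_fields(coded: Optional[str], text: Optional[str]) -> Concordance:
    """Classify one coded value against one reviewed free-text finding."""
    c, t = _normalize(coded), _normalize(text)
    if not c or not t:
        # Absence of data in either field.
        return Concordance.NO_ENTRY
    if c == t:
        return Concordance.AGREE
    if c in NONSPECIFIC or t in NONSPECIFIC:
        # Compatible with the other field, but not confirmatory.
        return Concordance.SUPPORTIVE
    return Concordance.CONTRADICTORY


# Made-up (coded, free-text) pairs for the side of a hernia.
pairs = [
    ("left", "left"),
    ("left", "unspecified"),
    (None, "right"),
    ("left", "right"),
]
tally = Counter(compare_fields(c, t) for c, t in pairs)
```

Here `tally` counts how many pairs fall into each category. In practice the `NO_ENTRY` and `SUPPORTIVE` cases are the ones that still call for human judgment, since a blank field may mean either incomplete data or a negative answer.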

Results The degree of concordance varied greatly according to the field and the question asked. Some fields were not granular enough to answer the question. The free-text fields often gave an answer that was not definitive. Absence of data was most logically interpreted in some cases as lack of completion of data and in others as a negative answer. Even with a question as specific as which side a hernia was on, contradictory data were found in 5 to 8 percent of the records.

Conclusions Using the data in the CDR to answer clinical questions can yield significantly disparate results depending on the question and on which data fields are searched. A database cannot simply be queried in an automated fashion and the results reported without review. Both coded and textual fields must be searched to obtain the fullest assessment, and the resulting information may be confirmatory, complementary, or conflicting. To yield the most accurate information possible, final answers to questions require human judgment and may require the gathering of additional information.

Footnotes

  • This work was supported in part by NIH grants T15-LM07056 and G08-LM05583 from the National Library of Medicine.
