J Am Med Inform Assoc 17:383-388 doi:10.1136/jamia.2010.004804
  • Application of information technology

Extracting timing and status descriptors for colonoscopy testing from electronic medical records

  • Neeraja B Peterson (2)
  1. Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA
  2. Division of General Internal Medicine and Public Health, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  3. Veterans Administration, Tennessee Valley Geriatric Research Education Clinical Center (GRECC), Nashville, Tennessee, USA
  • Correspondence to Dr Joshua C Denny, Eskind Biomedical Library, Room 442, 2209 Garland Ave, Nashville, TN 37232, USA; josh.denny{at}
  • Received 12 August 2009
  • Accepted 30 April 2010


Colorectal cancer (CRC) screening rates are low despite confirmed benefits. The authors investigated the use of natural language processing (NLP) to identify previous colonoscopy screening in electronic records from a random sample of 200 patients at least 50 years old. The authors developed algorithms to recognize temporal expressions and ‘status indicators’, such as ‘patient refused’ or ‘test scheduled’. The new methods were added to the existing KnowledgeMap concept identifier system, and the resulting system was used to parse electronic medical records (EMR) to detect completed colonoscopies. Using expert physicians' manual review of EMR notes as the ‘gold standard’, the system identified timing references with a recall of 0.91 and precision of 0.95, colonoscopy status indicators with a recall of 0.82 and precision of 0.95, and references to completed colonoscopies with a recall of 0.93 and precision of 0.95. The system was superior to using colonoscopy billing codes alone. Health services researchers and clinicians may find NLP a useful adjunct to traditional methods of detecting CRC screening status. Further investigation is needed to validate the extension of NLP approaches to other types of CRC screening.
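
As a rough illustration of the kind of processing the abstract describes, the following Python sketch tags a sentence that mentions colonoscopy with a status indicator and a temporal expression, and computes precision and recall against a manually reviewed gold standard. It is not the KnowledgeMap implementation: the regular expressions, the default status of "completed", and the names `classify_sentence` and `precision_recall` are all assumptions made for this example.

```python
import re

# Hypothetical status phrases; the study's actual lexicon is not given in the abstract.
STATUS_PATTERNS = {
    "refused": re.compile(r"\b(?:refused|declined)\b", re.I),
    "scheduled": re.compile(r"\b(?:scheduled|will schedule)\b", re.I),
}
# Hypothetical temporal patterns: a four-digit year, or "N years/months ago".
TIME_PATTERN = re.compile(r"\b((?:19|20)\d{2}|\d+\s+(?:year|month)s?\s+ago)\b", re.I)
TEST_PATTERN = re.compile(r"\bcolonoscop(?:y|ies)\b", re.I)


def classify_sentence(sentence):
    """Return status and timing for a colonoscopy mention, or None if there is none."""
    if not TEST_PATTERN.search(sentence):
        return None
    status = "completed"  # naive default (assumption): no status phrase means completed
    for label, pattern in STATUS_PATTERNS.items():
        if pattern.search(sentence):
            status = label
            break
    match = TIME_PATTERN.search(sentence)
    return {"status": status, "time": match.group(1) if match else None}


def precision_recall(predicted, gold):
    """Compute precision and recall for two sets of item identifiers."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

For instance, `classify_sentence("Patient refused colonoscopy.")` yields a "refused" status with no timing, while a sentence that does not mention colonoscopy yields `None`. A real system would also need to handle negation, section context, and a much wider range of temporal phrasing than these two patterns cover.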


  • Funding This study was funded by National Cancer Institute grant R21 CA116573 and National Library of Medicine grant R01 LM007995.

  • Competing interests None

  • Ethics approval This study was conducted with the approval of Vanderbilt University.

  • Provenance and peer review Not commissioned; externally peer reviewed.
