J Am Med Inform Assoc 18:540-543 doi:10.1136/amiajnl-2011-000465
  • Editorial

Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions

  1. Ozlem Uzuner8
  1. Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
  2. Yale University, New Haven, Connecticut, USA
  3. The MITRE Corporation, Bedford, Massachusetts, USA
  4. Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, Massachusetts, USA
  5. Harvard Medical School, Division of Aging, Boston, Massachusetts, USA
  6. Center for Surgery and Public Health, Brigham and Women's Hospital, Boston, Massachusetts, USA
  7. Children's Hospital Boston Informatics Program, Harvard Medical School, Boston, Massachusetts, USA
  8. University at Albany-SUNY, Albany, New York, USA
  1. Correspondence to Dr Wendy W Chapman, Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Dr, Bldg 2 #0728, La Jolla, California, USA; wwchapman{at}
  • Accepted 6 July 2011

This issue of JAMIA focuses on natural language processing (NLP) techniques for clinical-text information extraction. Several articles are offshoots of the yearly ‘Informatics for Integrating Biology and the Bedside’ (i2b2) NLP shared-task challenge, introduced by Uzuner et al (see page 552)1 and co-sponsored by the Veterans Administration for the last 2 years. This shared task follows long-running challenge evaluations in other fields, such as the Message Understanding Conference (MUC) for information extraction,2 TREC3 for text information retrieval, and CASP4 for protein structure prediction. Shared tasks in the clinical domain are recent and include the annual i2b2 Challenges that began in 2006, a challenge for multi-label classification of radiology reports sponsored by Cincinnati Children's Hospital in 2007,5 a 2011 Cincinnati Children's Hospital challenge on suicide notes,6 and the 2011 TREC information retrieval shared task involving retrieval of clinical cases from narrative records.7

Although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications for clinical text has been slow and lags behind progress in the general NLP domain. There are several barriers to NLP development in the clinical domain, and shared tasks like the i2b2/VA Challenge address some of them. Nevertheless, many barriers remain, and unless the community takes a more active role in developing novel approaches to address them, advancement and innovation will continue to be slow.

Barriers to NLP development in the clinical domain

Historically, there have been substantial barriers to NLP development in the clinical domain. These barriers are not unique to the clinical domain: they also occur in the fields of software engineering and general NLP.

Lack of access to shared data

Because of concerns regarding patient privacy and worry about revealing unfavorable institutional practices, hospitals and clinics have been extremely reluctant to allow access to clinical data for researchers from outside the …
