JAMIA 2007;14:674-683 doi:10.1197/jamia.M2275
  • Original Investigation
  • Model Formulation

PROTEMPA: A Method for Specifying and Identifying Temporal Sequences in Retrospective Data for Patient Selection

  1. Andrew R Post
  2. James H Harrison Jr
  • Affiliation of authors: Division of Clinical Informatics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA
  • Correspondence and reprints: Andrew R. Post, MD, PhD, Department of Public Health Sciences, University of Virginia, Suite 3181 West Complex, P.O. Box 800717, Charlottesville, VA 22908-0717; email: <arp4m{at}virginia.edu>
  • Received 13 September 2006
  • Accepted 11 June 2007

Abstract

Objective To specify and identify disease and patient care processes represented by temporal patterns in clinical events and observations, and retrieve patient populations containing those patterns from clinical data repositories, in support of clinical research, outcomes studies, and quality assurance.

Design A data processing method called PROTEMPA (Process-oriented Temporal Analysis) was developed for defining and detecting clinically relevant temporal and mathematical patterns in retrospective data. PROTEMPA provides for portability across data sources, “pluggable” data processing environments, and the creation of libraries of pattern definitions and data processing algorithms.

Measurements A proof-of-concept implementation of PROTEMPA in Java was evaluated against standard SQL queries for its ability to identify patients from a large clinical data repository who show the features of HELLP syndrome, and categorize those patients by disease severity and progression based on time sequence characteristics in their clinical laboratory test results. Results were verified by manual case review.

Results The proof-of-concept implementation identified patients with HELLP more accurately than SQL and correctly assigned severity and disease progression categories, which was not possible using SQL alone.

Conclusions PROTEMPA supports the identification and categorization of patients with complex disease based on the characteristics of, and relationships between, time sequences in multiple data types. Identifying patient populations that share these types of patterns may be useful when patient features of interest do not have standard codes, are poorly expressed in coding schemes, may be inaccurately or incompletely coded, or are not represented explicitly as data values.

Footnotes

  • This work was supported in part by National Library of Medicine grant R01 LM008192 and National Library of Medicine Training Grant T15 LM007059.

  • The authors thank Vanathi Gopalakrishnan for valuable advice.

The Journal of the American Medical Informatics Association is published for the American Medical Informatics Association by BMJ Publishing Group Ltd.