PROTEMPA: A Method for Specifying and Identifying Temporal Sequences in Retrospective Data for Patient Selection
- Affiliation of authors: Division of Clinical Informatics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA
- Correspondence and reprints: Andrew R. Post, MD, PhD, Department of Public Health Sciences, University of Virginia, Suite 3181 West Complex, P.O. Box 800717, Charlottesville, VA 22908-0717; email: <arp4m{at}virginia.edu>
- Received 13 September 2006
- Accepted 11 June 2007
Abstract
Objective To specify and identify disease and patient care processes represented by temporal patterns in clinical events and observations, and retrieve patient populations containing those patterns from clinical data repositories, in support of clinical research, outcomes studies, and quality assurance.
Design A data processing method called PROTEMPA (Process-oriented Temporal Analysis) was developed for defining and detecting clinically relevant temporal and mathematical patterns in retrospective data. PROTEMPA provides for portability across data sources, “pluggable” data processing environments, and the creation of libraries of pattern definitions and data processing algorithms.
Measurements A proof-of-concept implementation of PROTEMPA in Java was evaluated against standard SQL queries for its ability to identify patients from a large clinical data repository who show the features of HELLP syndrome, and categorize those patients by disease severity and progression based on time sequence characteristics in their clinical laboratory test results. Results were verified by manual case review.
Results The proof-of-concept implementation was more accurate than SQL in identifying patients with HELLP, and it correctly assigned severity and disease progression categories, a categorization that was not possible using SQL alone.
Conclusions PROTEMPA supports the identification and categorization of patients with complex disease based on the characteristics of, and relationships between, time sequences in multiple data types. Identifying patient populations who share these types of patterns may be useful when patient features of interest do not have standard codes, are poorly expressed in coding schemes, may be inaccurately or incompletely coded, or are not represented explicitly as data values.
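As a rough illustration of the kind of temporal pattern the abstract refers to, the following Java sketch checks whether a time-ordered series of laboratory results stays below a threshold for at least a minimum span of time. It is not PROTEMPA's API or algorithm: the class `TemporalPatternSketch`, the `Observation` type, the method `hasSustainedLow`, and any threshold or duration passed to it are hypothetical names and values chosen only for this example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical sketch; none of these names come from PROTEMPA itself.
public class TemporalPatternSketch {

    /** One time-stamped numeric observation, e.g. a single lab result. */
    public static final class Observation {
        final Instant time;
        final double value;

        public Observation(Instant time, double value) {
            this.time = time;
            this.value = value;
        }
    }

    /**
     * Returns true if the series (assumed sorted by ascending time) contains a run
     * of consecutive observations that are all below the threshold and whose first
     * and last timestamps are at least minSpan apart.
     */
    public static boolean hasSustainedLow(List<Observation> series,
                                          double threshold,
                                          Duration minSpan) {
        Instant runStart = null; // start time of the current below-threshold run
        for (Observation o : series) {
            if (o.value < threshold) {
                if (runStart == null) {
                    runStart = o.time;
                }
                if (Duration.between(runStart, o.time).compareTo(minSpan) >= 0) {
                    return true;
                }
            } else {
                runStart = null; // a value at or above the threshold ends the run
            }
        }
        return false;
    }
}
```

A pattern of this shape cannot be expressed by a simple filter on individual values, because it depends on the ordering and spacing of several observations; that is the general class of query the abstract contrasts with standard SQL.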
Footnotes
- This work was supported in part by National Library of Medicine grant R01 LM008192 and National Library of Medicine Training Grant T15 LM007059.
- The authors thank Vanathi Gopalakrishnan for valuable advice.








