Use of population health data to refine diagnostic decision-making for pertussis
- Andrew M Fine1,
- Ben Y Reis1,2,
- Lise E Nigrovic1,
- Donald A Goldmann3,
- Tracy N LaPorte4,
- Karen L Olson1,2,
- Kenneth D Mandl1,2,5
- 1Division of Emergency Medicine, Children's Hospital Boston and Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- 2Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Boston, Massachusetts, USA
- 3Division of Infectious Diseases, Children's Hospital Boston and Harvard Medical School, Boston, Massachusetts, USA
- 4Massachusetts Department of Public Health, Jamaica Plain, Massachusetts, USA
- 5The Manton Center for Orphan Disease Research, Children's Hospital Boston, Boston, Massachusetts, USA
- Correspondence to Dr A M Fine, Division of Emergency Medicine—Main 1, Children's Hospital Boston, 300 Longwood Avenue, Boston, MA 02115, USA;
Contributors All authors made substantial contributions to conception, design, analysis, and interpretation of data. AMF and KDM drafted the manuscript, and all authors were involved in revising it critically for important intellectual content and final approval. AMF is guarantor.
- Received 12 May 2008
- Accepted 23 August 2009
Objective To improve identification of pertussis cases by developing a decision model that incorporates recent, local, population-level disease incidence.
Design Retrospective cohort analysis of 443 infants tested for pertussis (2003–7).
Measurements Three models (based on clinical data only, local disease incidence only, and a combination of clinical data and local disease incidence) to predict pertussis positivity were created with demographic, historical, physical exam, and state-wide pertussis data. Models were compared using sensitivity, specificity, area under the receiver-operating characteristics (ROC) curve (AUC), and related metrics.
Results The model using only clinical data included cyanosis, cough for 1 week, and absence of fever, and was 89% sensitive (95% CI 79 to 99), 27% specific (95% CI 22 to 32) with an area under the ROC curve of 0.80. The model using only local incidence data performed best when the proportion positive of pertussis cultures in the region exceeded 10% in the 8–14 days prior to the infant's associated visit, achieving 13% sensitivity, 53% specificity, and AUC 0.65. The combined model, built with patient-derived variables and local incidence data, included cyanosis, cough for 1 week, and the variable indicating that the proportion positive of pertussis cultures in the region exceeded 10% 8–14 days prior to the infant's associated visit. This model was 100% sensitive (p<0.04, 95% CI 92 to 100), 38% specific (p<0.001, 95% CI 33 to 43), with AUC 0.82.
Conclusions Incorporating recent, local population-level disease incidence improved the ability of a decision model to correctly identify infants with pertussis. Our findings support fostering bidirectional exchange between public health and clinical practice, and validate a method for integrating large-scale public health datasets with rich clinical data to improve decision-making and public health.
Bordetella pertussis outbreaks can infect hundreds of people across all age groups, though the infection is most dangerous for young infants.1 2 3 Pertussis is difficult to diagnose, especially in its early stages, and definitive test results are not available for several days. Timely administration of antibiotics decreases transmissibility of the disease.4 Most patients with cough do not have pertussis, but a missed case of the contagious disease is likely to have important consequences for the patient, her contacts, and the public health.5 A patient's risk of exposure to infection varies by local disease burden,6 7 though clinicians rarely have ready access to information about epidemiologic context8—the recent regional incidence of an infectious disease—when making management decisions. The proliferation of real-time infectious disease surveillance systems9 10 and electronic laboratory reporting systems11 creates an opportunity to open a key communications channel between public health agencies and point-of-care providers. Currently there are no clinical decision-support systems that integrate public health incidence data into management algorithms in real time.12
Because of the temporal and geographic variability of pertussis outbreaks, the delay in diagnostic test results, and the personal and public health ramifications13 of incorrect management decisions at the point of care, pertussis is a prototypical disease for which real-time public health incidence data might inform, guide, and improve clinical decision-making. The purpose of this study is to quantify the value of recent, local disease incidence, derived from public health sources, in improving management of pertussis in the clinical setting.
Design, setting, and subjects
A retrospective review was conducted of charts for infants tested for pertussis by culture, presenting to the pediatric emergency department (ED) of a large urban tertiary care US hospital from 1 January 2003 to 31 December 2007. The ED volume exceeds 50 000 patients per year. The study received institutional review board approval.
Inclusion and exclusion criteria
Subjects included all infants tested for pertussis by culture from 2003 to 2007. If a patient had multiple pertussis cultures from 2003 to 2007, only the first test was included.
An infant was defined as pertussis-positive or pertussis-negative based on culture result, which is widely regarded as the gold standard.14 15 Alternative tests such as PCR, serology, and direct fluorescent antibody (DFA) were not used in the case definition. Positive culture from a nasopharyngeal specimen is 100% specific for pertussis.4 16 Sensitivity, however, may be limited for several reasons including the organism's fastidious nature, specimen collection technique, the timing of testing in the course of the illness, and prior or concurrent use of antibiotics.16 17 While PCR may have better sensitivity, we did not rely on it because there is no FDA-approved test kit available, because test characteristics vary widely by laboratory, and because outbreaks have recently been attributed to PCR false positives.4 18 PCR may, in fact, be oversensitive, and requires correlation with at least 2 weeks of cough and paroxysm, whoop or post-tussive emesis,4 which are difficult to assess accurately in a retrospective review. Serology is not recommended for infants, and DFA is not widely available.19
Clinical data collection
Demographics, signs, and symptoms commonly associated with infant pertussis, local disease incidence data and outcomes were collected for each patient.4 20 21 Demographics included visit date, gender, and age (months). Signs and symptoms included cough duration (days), fever duration (days), history of apnea, post-tussive emesis, cyanosis, seizure, and contact with a person with known pertussis. If the record did not contain information about these symptoms, they were coded as absent. Cough descriptors like paroxysm, staccato, and “whoop” were not included because they could not be measured accurately by chart review. Outcome data including antibiotic use, hospitalization, and mortality were collected to help describe the study population.
In the initial review, the pertussis culture result for each patient was obtained from the hospital laboratory information system. Subsequently, the chart abstractor (an attending physician specializing in pediatric emergency medicine) responsible for collecting and entering patient data into structured forms was blinded to the culture result. The culture result was accessible through a unique laboratory link to a PDF file from the external laboratory that performed the culture. These results were kept separate from the portion of the electronic chart used for the ED clinical encounter. No linkage between the culture result and the clinical portion of the chart was made until after all clinical charts had been reviewed. Historical and physical exam features were based on the electronic medical record (EMR) generated during the ED encounter. Outcome data were collected from the ED EMR, inpatient discharge summaries, and outpatient follow-up visits. To assess inter-rater reliability, a second abstractor (also an attending physician specializing in pediatric emergency medicine) reviewed a random sample of 7% of charts.22 23 The two chart abstractors had over 90% agreement (range 91–97%) and κ24 ranging from 0.52 to 0.87 for all candidate predictors.
Local disease-incidence data collection
A query of the State Laboratory of the Massachusetts Department of Public Health database yielded 19 907 pertussis culture results from patients of all ages over the study period (2003–7). These data were obtained through a limited data sharing agreement. State data about cultures included date sent and culture result, but not demographics, clinical findings or outcomes.
Aggregate disease incidence variables were created for the number of pertussis cultures performed, the number of positives and the proportion positive at the state laboratory. Each of these variables was tabulated over a range of different timescales: 1–7, 8–14, 15–21, and 22–28 days prior to each visit date. Based on date of presentation, the corresponding public health incidence variables (number of cultures performed, positive, and proportion positive in the prior and cumulative 1–4 weeks) were assigned to each infant.
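The windowed tabulation described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name `window_incidence`, the record layout (date sent, culture result), and the sample records are hypothetical, and the handling of a window with no cultures (proportion reported as `None`) is an assumption.

```python
from datetime import date, timedelta

def window_incidence(cultures, visit_date, start_day, end_day):
    """Tabulate cultures sent start_day..end_day days before visit_date (inclusive).

    cultures: iterable of (date_sent, is_positive) pairs (hypothetical layout).
    Returns (number performed, number positive, proportion positive or None).
    """
    earliest = visit_date - timedelta(days=end_day)
    latest = visit_date - timedelta(days=start_day)
    in_window = [positive for sent, positive in cultures
                 if earliest <= sent <= latest]
    performed = len(in_window)
    positives = sum(in_window)
    # Assumption: an empty window yields no proportion rather than zero.
    proportion = positives / performed if performed else None
    return performed, positives, proportion

# Hypothetical state laboratory records, for illustration only.
cultures = [
    (date(2005, 3, 1), False),
    (date(2005, 3, 2), True),
    (date(2005, 3, 5), False),
    (date(2005, 3, 9), True),
    (date(2005, 3, 12), False),
]
visit = date(2005, 3, 15)

# The four weekly windows named in the text.
WINDOWS = [(1, 7), (8, 14), (15, 21), (22, 28)]
incidence = {w: window_incidence(cultures, visit, *w) for w in WINDOWS}
```

Each infant would receive one such set of values keyed to the date of presentation; cumulative windows (for example 1–14 days) follow from calling the same function with a wider range.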
Building the decision models
The same sequence of steps was used to build three decision models: (1) “clinical only” model—candidate predictors included only clinical data based on demographics, history, and physical exam; (2) “local disease incidence” model—candidate predictors included only public health incidence data; and (3) “contextualized” model—all clinical and public health predictors were considered.
Variable discretization and selection
Dichotomous variables (history of apnea, post-tussive emesis, cyanosis, and seizure) associated with positive pertussis culture in the clinical data set were identified. Significance of association was tested with a χ² goodness-of-fit test (p<0.05). Continuous variables (duration of cough, duration of fever, and local disease incidence variables) were dichotomized at categorical cut-offs considered by the clinical investigators to be clinically useful and easy to remember (eg, cough at least 1 week, presence of fever, and proportion positive past 21 days >0.10).
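As an illustration of this univariate step, the sketch below dichotomizes a continuous variable at a cut-off and tests a 2×2 table for association. It assumes a Pearson χ² statistic without continuity correction on one degree of freedom, which may differ from the exact test the authors ran; the function names and the counts are hypothetical.

```python
import math

def dichotomize(value, cutoff):
    """True when a continuous value meets the cut-off (eg, cough >= 7 days)."""
    return value >= cutoff

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, no continuity correction (assumption).

    a: predictor present, culture positive   b: predictor present, culture negative
    c: predictor absent,  culture positive   d: predictor absent,  culture negative
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, not taken from the study tables.
chi2, p_value = chi_square_2x2(a=20, b=80, c=10, d=190)
significant = p_value < 0.05

cough_one_week = dichotomize(9, cutoff=7)   # 9 days of cough -> True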
In the multivariate analysis, candidate variables were entered into a forward stepwise logistic regression to identify independent predictors of infants testing pertussis positive. Cut-offs for entry and departure for the logistic regression model were 0.25 and 0.10.
For the local disease incidence model, each variable was considered for entry into the model as an independent predictor. Because of the interdependence of these variables, it was established a priori that no more than one candidate incidence variable would be contained in the final model. Thresholds were defined for the numbers of tests performed, positive, and proportion positive over 1–4 weeks. For proportion positive, thresholds were tested from 0.01 to 0.20 in increments of 0.01.
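The threshold search for proportion positive can be sketched as follows. This is an illustration under stated assumptions: the data are invented, the rule flags a visit when the proportion strictly exceeds the threshold, and the AUC of a single dichotomous predictor is computed as (sensitivity + specificity) / 2, which is the area under its one-point ROC curve. The study does not state how its AUC values were computed.

```python
def sens_spec(flags, labels):
    """Sensitivity and specificity of boolean flags against boolean labels."""
    tp = sum(f and y for f, y in zip(flags, labels))
    fn = sum((not f) and y for f, y in zip(flags, labels))
    tn = sum((not f) and (not y) for f, y in zip(flags, labels))
    fp = sum(f and (not y) for f, y in zip(flags, labels))
    return tp / (tp + fn), tn / (tn + fp)

def sweep_thresholds(proportions, labels, thresholds):
    """Evaluate the rule 'proportion positive > threshold' at each threshold."""
    results = []
    for t in thresholds:
        flags = [p > t for p in proportions]
        sens, spec = sens_spec(flags, labels)
        # One-point ROC curve: AUC equals the mean of sensitivity and specificity.
        results.append((t, sens, spec, (sens + spec) / 2))
    best = max(results, key=lambda r: r[3])
    return best, results

# Thresholds from 0.01 to 0.20 in increments of 0.01, as in the text.
thresholds = [round(i / 100, 2) for i in range(1, 21)]

# Hypothetical per-visit proportion positive and culture results.
proportions = [0.02, 0.04, 0.05, 0.07, 0.12, 0.15, 0.03, 0.11, 0.18, 0.06, 0.05]
labels = [False, False, False, False, True, True, False, False, True, False, True]

best, results = sweep_thresholds(proportions, labels, thresholds)
```

In this toy data set the search settles on a threshold of 0.11; the value has no bearing on the 0.10 cut-off reported in the study.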
For the contextualized model, each clinical and local disease incidence variable was considered for entry into the multivariate model. Variables not included in the final clinical only or final local disease incidence model were still considered for inclusion into the contextualized model.
After selecting the best final model for each analysis (clinical, local disease incidence, or contextualized), a bootstrap validation was performed. Predictors that were selected in over 50% of the 1000 bootstrap samples were retained in the final model.25 26 27
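The bootstrap retention rule can be sketched as below. The resampling loop and the 50% retention criterion follow the description above, but the selection step is a deliberately simple stand-in (a risk-difference filter named `toy_select`) rather than the stepwise logistic regression used in the study; all names and data are hypothetical.

```python
import random

def bootstrap_selection_frequency(rows, candidates, select, n_boot=1000, seed=0):
    """Fraction of bootstrap samples in which each candidate is selected."""
    rng = random.Random(seed)
    counts = {name: 0 for name in candidates}
    for _ in range(n_boot):
        # Resample the cohort with replacement, keeping the original size.
        sample = [rng.choice(rows) for _ in rows]
        for name in select(sample, candidates):
            counts[name] += 1
    return {name: counts[name] / n_boot for name in candidates}

def toy_select(sample, candidates, min_diff=0.2):
    """Stand-in selector (assumption): keep predictors whose outcome rate
    differs by more than min_diff between exposed and unexposed rows."""
    chosen = []
    for name in candidates:
        exposed = [r["outcome"] for r in sample if r[name]]
        unexposed = [r["outcome"] for r in sample if not r[name]]
        if not exposed or not unexposed:
            continue
        diff = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)
        if abs(diff) > min_diff:
            chosen.append(name)
    return chosen

# Hypothetical cohort: "cyanosis" tracks the outcome, "noise" is unrelated.
rows = [{"cyanosis": c, "noise": n, "outcome": c}
        for c in (0, 1) for n in (0, 1) for _ in range(50)]

freq = bootstrap_selection_frequency(rows, ["cyanosis", "noise"], toy_select)
# Retain predictors selected in over 50% of the bootstrap samples.
retained = [name for name, f in freq.items() if f > 0.5]
```

Replacing `toy_select` with a function that refits the full stepwise model on each resample would give the procedure described in the text.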
Measurement of model performance
Sensitivity, specificity, positive and negative predictive values, area under the receiver-operating characteristics (ROC) curve (AUC), and percent correct classification were used to compare performance. The best model was defined as that with the greatest specificity among those with highest sensitivity, in order to minimize missed pertussis cases, and also minimize misclassification of those without pertussis.
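These metrics and the selection rule can be made concrete with a short sketch. The confusion-matrix counts and scores are hypothetical, the rank-based (Mann-Whitney) AUC is one standard way to compute the area and is an assumption about method, and the sensitivity and specificity pairs in `reported` are the values reported for the three models in this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "percent_correct": (tp + tn) / total,
    }

def rank_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative case,
    counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical counts and scores, for illustration only.
example = classification_metrics(tp=30, fp=60, fn=10, tn=100)
auc = rank_auc([0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.4], [1, 1, 1, 0, 0, 0, 0])

# Reported (sensitivity, specificity) for the three models.
reported = {
    "clinical only": (0.89, 0.27),
    "local disease incidence": (0.13, 0.53),
    "contextualized": (1.00, 0.38),
}
# Highest sensitivity first, then greatest specificity among ties.
best = max(reported, key=lambda name: reported[name])
```

Comparing the tuples lexicographically encodes the stated priority: sensitivity decides, and specificity only breaks ties.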
Comparing clinician performance with decision models
Clinicians' actual performance was compared with the clinical, local disease incidence, and contextualized models by measuring correct classification. Correct classification by clinicians was judged by whether antibiotics were given or withheld during the clinical encounter. The clinical actions taken, as determined by chart review, were compared with what would have been recommended by each of the three decision models.
Four hundred and forty-three infants had a pertussis culture sent from 2003 to 2007, and 38 (8.4%) tested positive. Nineteen thousand nine hundred and seven cultures were performed at the State Laboratory Institute of the Massachusetts Department of Public Health during the study; 1103 (5.5%) tested positive. For these 19 907 cultures, the weekly proportion positive ranged from 0 to 32% (mean 6.8%, median 5.6%, interquartile range (IQR) 3.3 to 8.8%). A mean of 4.2 cultures tested positive each week (median 4, range 0 to 20, IQR 2 to 6). Weekly and monthly (figure 1A,B) proportion positive at the state laboratory demonstrate that the timing, height, width, and total number of pertussis peaks vary annually.
Development of “clinical only” decision model
Infants testing positive for pertussis were younger, more likely to have a history of apnea or cyanosis, or cough for at least 1 week and were less likely to have fever than those who tested negative (table 1). There were no significant differences between those testing positive and negative for gender, history of post-tussive emesis or seizure, or exposure to a contact with known pertussis.
History of cyanosis was the best predictor of pertussis, followed by history of cough for at least 1 week and absence of fever (table 2). Gender, history of post-tussive emesis or seizure, exposure to pertussis contact, and age in months were not included in the final clinical only model.
Development of “local disease incidence” decision model
Selection of incidence variables
In the 1–4 weeks prior to each visit date, the ranges of mean numbers of cultures performed (102 to 110), positive (4.7 to 4.9) and proportion positive (0.058 to 0.060) showed little variation. Because means, medians, ranges, and SD correlated closely for 1–7, 8–14, 15–21, and 22–28 days, metrics from a single time interval were chosen to represent “local disease incidence.” Due to the time required to definitively process a pertussis culture,4 8–14 days (2 weeks) was chosen as the best metric available for a clinical application. A range of thresholds was examined to determine the cut-off that would optimize specificity for maximum sensitivity. The area under the ROC curve for these variables was maximized when the proportion positive from 8 to 14 days prior exceeded 0.10 (p<0.0001).
Development of contextualized decision model
Among clinical variables, presence of cyanosis and cough for at least 1 week met criteria for selection into the final logistic regression model (table 3). The proportion positive 8–14 days prior to the test date also met criteria for selection into the final logistic regression model. Proportion positive thresholds ranging from 0.01 to 0.20 in increments of 0.01 were tested. In conjunction with the clinical variables above, the maximal area under the ROC curve occurred with a cut-off of 0.10. The best contextualized model contained three variables—history of cyanosis, cough at least 1 week, and proportion positive >0.10 eight to 14 days earlier. The incidence variable was a stronger predictor than any clinical factor considered, except for history of cyanosis.
All predictors from the multivariate analyses were validated by the bootstrap method and retained in the final models. Cyanosis was selected in over 99%, public health pertussis proportion positive >0.10 eight to 14 days prior in over 90%, and cough for at least 1 week in over 80% of 1000 bootstrap samples.
Measurement of performance of decision models
The best model derived in the clinical only analysis (history of cyanosis, cough for at least 1 week and absence of fever) generated an area under the ROC of 0.80, with 89% sensitivity and 27% specificity (table 4). Addition of variables not significant in the univariate analysis (gender, history of post-tussive emesis, history of apnea, history of seizure, exposure to known pertussis case, and age under 3 months) did not improve the area under the ROC. The best local disease incidence model achieved only 13% sensitivity and 53% specificity. In the best contextualized model (history of cyanosis, proportion positive 8–14 days earlier >0.10, and cough for at least 1 week) the area under the ROC was 0.82 with 100% sensitivity and 38% specificity.
The contextualized model outperformed the clinical and local disease incidence models across all metrics (table 4). Compared with the clinical model, the contextualized model achieved superior sensitivity (100% vs 89%), specificity (38% vs 27%), PPV (15% vs 12%), NPV (100% vs 96%), and area under the ROC (0.82 vs 0.80).
Comparing clinician performance with decision models
The percent of positives treated with antibiotics and percent of negatives not treated with antibiotics were compared with hypothetical outcomes generated by the three decision models (table 5). The contextualized model missed no patients with pertussis. Among models that did not miss any cases (100% sensitivity), the contextualized model misclassified the fewest patients (62%, 95% CI 57% to 67%) without pertussis, which would have resulted in the most judicious antibiotic use and correct categorization of infants. While clinicians did not do as well as the contextualized model, clinicians outperformed the clinical only model, misclassifying about the same number of patients with pertussis (11% vs 13%) but misclassifying fewer patients without the disease (60% vs 73%) (table 5).
Clinicians make critical decisions in the face of uncertainty, and typically rely on individual clinical experience and discussion with close colleagues when making decisions about diagnosis and treatment.28 29 30 Previously, we showed that local disease incidence information about meningitis from a single hospital provides valuable epidemiologic context and enhances a decision model for distinguishing aseptic from bacterial meningitis.8 Here, we demonstrate for the first time how an external public health surveillance source improves a clinical decision model, by incorporating state-wide “epidemiologic context.” Previous prediction models, derived from small numbers of patients, have identified clinical predictors of infantile pertussis like cyanosis and cough, and some models have considered seasonality, but none have incorporated local disease incidence.20 21 Seasonality is not a substitute for accurate real-time information about pertussis incidence as pertussis outbreaks are sporadic and do not follow consistent seasonal or geographic patterns.15 31 32 33 34 In our analysis, epidemiologic context was stronger than all but one clinical predictor (cyanosis). This finding underscores the importance of “situational awareness” in the clinical setting. Understanding the epidemiologic context in which a patient presents may provide critical information about the etiology of the patient's problem, but currently, this type of information is not formally processed, considered, or utilized in clinical decision-making.
Our findings support a general approach of estimating clinical risk of disease at the point of care, accounting for local disease incidence. This approach uses epidemiologic context in the clinical decision-making process rather than relying solely on history, physical exam, heuristics, and preliminary diagnostic test results.10 29 35 36 It is becoming increasingly feasible to deliver public health information to clinicians at the point of care. The emergence of robust, real-time surveillance systems, automated reporting to public health, and widespread adoption of electronic health records present opportunities for bidirectional communication between clinical practice and public health. Our study also promotes the value of disease reporting and surveillance at the state level.
We demonstrate a useful synergy between clinical and public health information in the generation and refinement of clinical decision rules. Public health data have not previously been used to generate decision models because, while they contain detailed information about those who test positive, they contain limited information about those testing negative. Public health efforts focus on tracking, interviewing, and following patients with reportable diseases, so public health data sets contain far greater detail about individuals who test positive. Data about those testing negative are limited even further by patient privacy laws, which prohibit collection of detailed information about people without the reportable disease. This unbalanced data stream creates a unique challenge to the use of public health datasets for the creation of decision models, which rely on rich information about patients both with and without the disease.37 In an effort to use available high-quality data, we approached this problem by integrating a large statewide public health dataset with a detailed hospital-based clinical dataset to develop a decision model for a disease with major public health importance—pertussis.
The design of this study was retrospective, so further validation would be necessary prior to integration into a clinical setting.38 The retrospective nature of the study also required basing the clinical models on patients who had pertussis tests ordered, potentially biasing the sample toward subjects whom clinicians already suspected of having pertussis. While the contextualized model outperformed the clinical only model, the clinical only model is most limited by its retrospective nature. First, a prospectively derived clinical model in which testing was based on symptoms and structured data were acquired systematically might improve the performance of a clinical model. Second, while the study was carried out at a single site, this site provides care for 75% of the children who live in and around this large metropolitan area. Third, the incidence data are state-wide, while the patients are from a single, large metropolitan area. Fourth, we relied on the most conservative method for evaluating pertussis (culture) because it is widely regarded as the gold standard.14 15 As delineated in the case definition section of the methods, PCR may be oversensitive, and requires correlation with at least 2 weeks of cough and paroxysm, whoop, or post-tussive emesis,4 which are difficult to assess accurately in a retrospective review. Serology is not recommended for infants, and DFA is not widely available.19 Fifth, the study was limited by a lack of immunization data on the subjects because primary care records were not accessible for these ED patients. Sixth, most patients did not have blood tests performed as part of the evaluation, and so lymphocytosis could not be included in the models; however, while lymphocytosis is classically associated with pertussis, it has been shown to be neither sensitive nor specific.39
This study validates a scientific method for integrating incidence data into a clinical decision model and suggests that “epidemiologic context” could be an important component of future clinical decision-support systems. A software application integrated with an electronic health record might display data to physicians about ambient public health conditions and prompt appropriate management, treatment and reporting processes based on a calculation that considered patient factors in a specific epidemiologic situation. This important refinement of clinical decision-making requires communication between public health and clinical settings, and programs to enable integration of public health data with clinical environments.
Funding This work was supported by grants K01HK000055 and 1 P01 HK000088 from the Centers for Disease Control and Prevention and by G08LM009778 and R01 LM007677 from the National Library of Medicine.
Competing interests None.
Ethics approval The Committee on Clinical Investigation of Children's Hospital Boston approved the study.
Provenance and peer review Not commissioned; externally peer reviewed.