A Bayesian spatiotemporal method for disease outbreak detection
 Correspondence to Dr Xia Jiang, 183M Parkvale Building, 200 Meyran Avenue, Pittsburgh, PA 15260, USA; xij6{at}pitt.edu
 Received 4 May 2009
 Accepted 27 April 2010
Abstract
A system that monitors a region for a disease outbreak is called a disease outbreak surveillance system. A spatial surveillance system searches for patterns of disease outbreak in spatial subregions of the monitored region. A temporal surveillance system looks for emerging patterns of outbreak disease by analyzing how patterns have changed during recent periods of time. If a nonspatial, nontemporal system could be converted to a spatiotemporal one, the performance of the system might be improved in terms of early detection, accuracy, and reliability.
A Bayesian network framework is proposed for a class of spacetime surveillance systems called BNST. The framework is applied to a nonspatial, nontemporal disease outbreak detection system called PC in order to create the spatiotemporal system called PCTS.
Differences in the detection performance of PC and PCTS are examined. The results show that the spatiotemporal Bayesian approach performs well, relative to the nonspatial, nontemporal approach.
 Disease outbreak detection
 biosurveillance
 Bayesian networks
 spatial scan statistic
 spatiotemporal
 influenza
 cryptosporidiosis
Introduction
Early detection of disease outbreaks remains an important research topic. Even modest improvement in disease outbreak detection could have significant impact on public health in terms of lives saved and reduced economic cost. The induced economic costs were estimated to be as high 26.2 billion dollars per 100 000 persons exposed1 and 200 million dollars per hour2 for a disease outbreak caused by a largescale release of inhalational anthrax.
Disease outbreak surveillance, also called disease outbreak detection and biosurveillance, consists of monitoring a community in order to recognize the onset of a disease outbreak early. A system that does this is called a disease outbreak surveillance system. Such a system looks for patterns of data that are indicative of an outbreak.
If an outbreak is occurring in a small subregion of a county, for example, and the system only investigates a global change for the entire county, the outbreak may go undetected until it spreads to a larger subregion. A spatial disease outbreak surveillance system searches for patterns in spatial subregions of the monitored region. In this way, we may detect the changes that start in small subregions earlier than if we only monitored the entire region globally. Another purpose of a spatial surveillance system is to identify the subregion where the outbreak is occurring. If a system only monitors the entire region globally, we call it nonspatial. A temporal event surveillance system looks for emerging outbreaks by analyzing how patterns have changed during recent periods of time. If a system does not do this, we call it nontemporal.
Even though the number of new cases during an outbreak generally exhibits an increasing trend, the daily fluctuation in this number may be dramatic. A nontemporal system, especially one that only looks at the current day's data, might issue an alarm one day when the number of new cases exceeds the normal level, and then not issue an alarm the next day when the number of new cases drops back down. As an example, the outbreak detection system PANDACDCA (PC)3 does not use a temporal model of disease outbreaks. Cooper et al3 obtained results that they initially found surprising when evaluating the ability of PC to detect a laboratory validated outbreak of influenza in Allegheny County, Pennsylvania. Under a false alarm rate of zero, PC detected influenza approximately one day before the first positive viral cultures of influenza were taken. However, near the beginning of the outbreak, the posterior probability of an influenza outbreak fluctuated between very high and very low values. It is generally not the case that an outbreak is present today, gone the next day, and back the following day, and so on. Rather the daily number of new cases during an outbreak can fluctuate. Our model should reflect how an outbreak behaves, including expecting typical temporal patterns of disease outbreaks. Otherwise, a nonoutbreak assessment that is issued during an outbreak could confuse the user of the system, thereby delaying the correct decision that an outbreak is occurring. A temporal system that expects an outbreak to result in an emerging increase in the number of outbreak cases over time, while allowing for daily variation, may be able to maintain stable detection and reduce false negative signals without compromising initial detection performance.
We introduce a method for converting an existing nonspatial, nontemporal Bayesian outbreak detection system to a spatiotemporal one. We apply the method to convert PC to the spatiotemporal system PANDACDCATemporalSpatial (PCTS). We hypothesize that: (1) when an outbreak is occurring in a small subregion, PCTS will detect the outbreak earlier than PC; (2) once an outbreak is detected, PCTS will more consistently maintain a detection signal than will PC; (3) when an outbreak is occurring in a small subregion, PCTS can estimate that subregion accurately; and (4) PCTS will perform at least as well as SaTScan, which is an existing stateoftheart spatial detection system. We show results of evaluating these hypotheses.
Background
An outbreak detection system looks for patterns of data that are indicative of an outbreak.
Some detection systems do this by analyzing counts of an observable event. For example, the number of patients presenting to an emergency department (ED) with respiratoryrelated chief complaints could be used to detect an influenza outbreak. Nonspatial, nontemporal methods of this type consider counts from some recent period of time only, such as the previous day, and do not investigate spatial subregions. A simple control chart method for analyzing these daily counts first derives the mean μ and SD σ of the counts during a nonoutbreak period. The method then issues an alert if the daily count exceeds μ by kσ, where k is ordinarily 2 or 3. Wong and Moore4 discuss this method and its variations.
Nonspatial, temporal methods that analyze counts look for changes in counts over time that are indicative of an outbreak, but they do not investigate spatial subregions. Such methods include the Serfling method,5 6 the ARMA, ARIMA, and SARIMA models,7 8 univariate hidden Markov models,9 10 support vector machines,11 12 CUSUM,13 and Bayesian network models.14
Nontemporal, spatial methods that analyze counts detect spatial clusters based only on the most recent counts of an observable event, as for example counts for the current day. These methods do not look for patterns of counts over time. Kulldorff15 developed a nontemporal, spatial method called the spatial scan statistic, which was implemented in the SaTScan software package.16 A multivariate version of the spatial scan statistic that considers several counts has been developed.17 Neill et al18 developed a Bayesian spatial scan statistic. This statistic was extended to a multivariate Bayesian spatial scan statistic.19
Rather than analyzing data aggregated over the entire population, another approach is to model the effects of the outbreak on each individual patient in population, which is a patientspecific approach. Using this approach, we can often more readily model complex changes in behavior. As discussed above, PANDACDCA (PC) is a nonspatial, nontemporal, patientspecific method that models the CDC category A outbreak diseases20 and several other diseases. The PC consists of a large Bayesian network that contains a set of nodes for each individual in a region. Populationwide ANomaly Detection and Assessment (PANDA)21 is a predecessor to PC; it is a patientspecific system designed to detect a windborne release of anthrax. It has a simple temporal and spatial representation that is much more limited than the approach we describe in the next section.
Even though conceptually a patientspecific system models each patient individually, nevertheless, at the computational level we may be able to analyze it using an aggregated set of counts. More generally, however, patientspecific models may become sufficiently complex that it is no longer feasible to solve them using aggregated counts; in that situation, we can apply other inference methods. See online appendix A for a derivation of the counts analyzed by PC.
We previously showed a method for converting a nonspatial Bayesian detection system to a spatial one.22 We also developed a method for converting a nontemporal Bayesian detection system to a temporal one.23 If we can combine the spatial method with the temporal one appropriately, we arrive at a framework for converting a nonspatial, nontemporal Bayesian outbreak detection system to a spatiotemporal one. We describe such a framework next.
Model description
The BNST framework
Suppose we are investigating whether there is an event of interest (eg, a disease outbreak) in some region. Let E be a random variable whose value is yes if the event of interest is occurring, and whose value is no otherwise. Besides the variable E, we assume a set of attribute variables that represent properties of the event of interest, a set of intermediate variables which depend on the properties of the event of interest, and a set of observable variables that depend on the intermediate variables. These observable variables comprise our Data or evidence. Figure 1 presents a Bayesian network framework representing the relationships among these variables. We call this the Bayesian network spatiotemporal (BNST) modeling framework and any model represented by it a BNST model. Although the framework is suitable for developing systems that detect many types of events, in this paper we focus on modeling disease outbreaks.
Three variables, SUB, Y, and F, are always included in the set of attribute variables. Where SUB represents the subregion in which an ongoing outbreak is occurring. The value of SUB can either be a hypothesized outbreak subregion S or the value none if there is no outbreak being hypothesized. The precise definition of a subregion depends on the application. For example, Kulldorff15 uses a circular subregion and the center of such a subregion is moved over the region. The radius of the circle is also allowed to vary. Neill et al18 represent the entire region G by an m×n rectangular grid, where each grid element is a cell. A subregion S of G consists of one or more cells that form a rectangle. Note that G itself is a subregion. Note also that subregions can overlap. F represents the severity of the outbreak on the day of investigation, if there is an outbreak. Y represents the number of days into the outbreak if there is an outbreak.
For the intermediate and observable variables, there is a set of these variables for today (day 0) and for each day preceding today (day i denotes i days prior to the current day). The probability distributions of the intermediate variables are conditioned on the values of SUB, Y, and F. The nature of this dependence also depends on the application. The data used for detection consist of information about the observable variables. The data for day i is denoted Data(i). If each intermediate variable represents a property of an individual, we would have a patientspecific model. However, the theory does not require a patientspecific model.
New data are obtained each day (or at whatever the time unit may be) from the entire region being monitored. The Bayesian network is then used to compute the posterior probability of an outbreak along with the posterior probability that each subregion S contains an outbreak.
The conversion of a simple model to a spatiotemporal model
Figure 2A shows a simple example of a nonspatial, nontemporal Bayesian network model. Suppose that the variable E has value yes if there is currently an outbreak of influenza and value no otherwise. There is a variable I_{r} for each individual r in the entire region being monitored. The possible values of I_{r} are the manifestations m_{k} for the individual. In this example, suppose that they are the chief complaints with which an individual might present to the ED, where one value is noED, which means the individual did not visit the ED. Other possible chief complaints include cough and fever/chills. Note that I_{r}=m_{k} is an assignment of chief complaint m_{k} for individual I_{r}. The probability distribution of I_{r} depends on whether or not E=yes. The Data consists of the values of I_{r} for each individual r in the entire region.
Figure 2B is the spatiotemporal Bayesian network that results from applying the framework in figure 1 to the model in figure 2A. If there is an outbreak (E=yes), suppose we assume that the probability of a subregion S containing the event is 1/b, where b is the number of subregions (ie, a uniform distribution over the subregions is assumed), while if there is no outbreak (E=no), the probability that there is no event in any subregion is 1. For each individual r, there is a location variable Loc_{r}, whose value is known at runtime and which represents the individual's home location, such as a zip code. For each value of Loc_{r} and each value S of SUB, we need to know whether Loc_{r} is in S. The Data(i) consists of the values of I_{r} (i) for each individual r in the entire region being monitored. These are the data obtained i days prior to the current day. The probability distribution of I_{r} (i) depends on whether there is an outbreak in subregion S, whether r is located in S, the severity F of the outbreak today, and the number of days Y into the outbreak. We do not present the quantitative probabilistic details of this dependence here. They are similar to the ones shown below for a more complex model.
A conversion of PC to a spatiotemporal model
This section describes the conversion of PC to a spatiotemporal model.
The PC model
Figure 3 shows the structure of PC.3 It has two features that make it unique among outbreak detection systems. First, it models a broad set of diseases. Other Bayesian approaches to outbreak detection of which we are aware either model a single disease or do not model any disease in particular. Second, PC represents the effect of an outbreak disease on individuals in the population. The current version of PC only uses chief complaints as evidence. However, the PC architecture allows us to model more complex evidence, such as images and narrative text in clinical reports. Such modeling goes well beyond using population counts of observable events.
Table 1 shows the variables in PC and what they represent. Variable O represents which outbreak disease is hypothesized to be occurring, if an outbreak is hypothesized. There are nine possible outbreak diseases resulting in 13 values for O. The diseases include the six CDC category A diseases, which are anthrax (two stages/values), plague (two stages/values), smallpox, tularemia, botulism, and hemorrhagic fever (two stages/values). PC also models influenza, cryptosporidiosis, and hepatitis A. The 13th value of O is none, which represents that there is no outbreak. The states of variable F were discretized into 15 real values. This variable indicates the severity of the outbreak. There is a variable D_{r} for each individual r in the population. Its possible values include all the values of variable O and also the value other which means the individual arrived in the ED only with a nonoutbreak disease (eg, a broken arm) and the value noED which means the individual did not visit the ED. There is a variable I_{r} for each individual in the region. Its possible values include 54 chief complaints, one of which is other, which means the chief complaint was not one of the 53 chief complaints represented in the network. The 55th value of I_{r} is noED, which means the individual did not visit the ED. Appendix A provides more information about PC, including its inference algorithm.
Extending PC to be a spatiotemporal system (PCTS)
We converted PC to the spatiotemporal system PCTemporalSpatial (PCTS) according to the BNST framework in figure 1. The Bayesian network structure of PCTS appears in figure 4. The PCTS computes the posterior probability of each outbreak disease based on the most recent T days of data. Table 2 shows the variables in PCTS that are not in PC.
As in Neill et al,18 the entire region is covered by an m×n rectangular grid, and each rectangular subregion of the grid constitutes a subregion. These rectangles are the possible values of SUB. PCTS assumes that the prior distribution of Y is uniform over {1, 2,…T}.
The variables E, O, and F in PCTS are the same variables as in PC. D_{r}(i) is the ED disease state of the rth individual i days ago, where i=0 represents today. It has the same values as variable D_{r} in PC. Its conditional probability distributions are discussed below. I_{r}(i) represents the chief complaint of the rth individual i days ago. It has all the same properties as variable I_{r} in PC. Its conditional probability distributions are the ones in PC.
The Bayesian network structure in figure 4 entails that, given values of SUB, Y, O, and F, the ED disease states (values of D_{r}(i)) for an individual on different days are independent. Jiang23 provides a justification for this independence assumption.
The probability distribution of D_{r}(i) is conditioned on O, F, and Y only when SUB=S and individual r is located in S. For each such value of SUB and individual r, the conditional distributions of D_{r}(i) are derived in the same manner for day i. For individuals not in S, the conditional probability distribution of D_{r}(i) is the same as the one when O equals none, and there is no dependence on F or Y. These are the same conditional distributions as those in PC.
We now describe how the conditional probability distributions of D_{r}(i) are determined for an individual r located in subregion S. Recall that the value f of F is the probability of an individual both having the outbreak disease and going to the ED today, given that an outbreak is ongoing. Early in the outbreak, which is when we hope to detect it, we assume that the increase in cases can be approximated by a linear function. As discussed in the next section, we tested the robustness of this assumption by simulating outbreaks that did not show a linear increase. We assume that the outbreak is not apparent on the first day it begins, that y days later it reaches level f on the current day, and that the increase over that period of time is linear. Let F(i) denote the probability of an individual being afflicted with the outbreak disease and going to the ED i days ago where i≤y. Based on the previous discussion, which implies that
The conditional probability distributions for D_{r}(i) are as follows. Assume that SUB=S and that individual r lives in region S. First
where p_{other} is the probability that an individual visits the ED with only a nonoutbreak disease, when there is no outbreak. This probability is estimated using ED data from the previous year.
In appendix A we show for the PC nontemporal system with outbreak disease d that
Therefore, for PCTS we have that
The inference algorithm for PCTS appears in appendix A.
Evaluation
In experiment 1 we compared PCTS to PC and to the multivariate, space–time extension of SatScan,24 which we designate as SaTScanMT.i In experiment 2 we further compared PCTS and SaTScanMT by evaluating how well they detected outbreaks emerging in time and space.
The dataset for experiment 1
Real ED admission data that were collected from Allegheny County, Pennsylvania in the year 2004 were used as background data. This dataset contains all 110 zip codes in Allegheny County. It contains an average of about 580 ED visits per day. We injected simulated outbreak cases into this real background data to create semisynthetic data. The observed data consisted of chief complaints presented by patients in the ED.
We considered both simulated influenza and cryptosporidiosis outbreaks. The simulated number of new cases increased according to linear, quadratic, and cubic functions during an outbreak before the outbreak reached its peak. For each outbreak disease, we generated 40 outbreaks for each type of increase. Therefore a total of 120 influenza outbreaks and 120 cryptosporidiosis outbreaks were simulated. We chose 40 outbreaks in part based on the statistical rule of thumb25 that when the sample size is greater than 40, and without outliers, the sampling distribution of many common statistics is approximately normally distributed. We also chose 40 in part for computational tractability reasons.
We simulated outbreaks in subregions in Allegheny County. This county covers 730 square miles, which we modeled using a 16×16 grid. Each grid element is one cell. A zip code was mapped to a cell if the zip code's centroid was in the cell. All 110 zip codes in Allegheny County were mapped to cells in this manner. All the cases for that zip code were placed in its centroid cell. The subregion where an outbreak was simulated to occur was stochastically selected from 12 possibilities. These 12 simulated outbreak subregions consisted of four 2×1 cells, four 2×2 cells, and four 3×2 cells rectangles.
To control the severity of the outbreak, we determined the number of new daily cases based on the SD of the number of real background daily ED visits in the subregion in which the outbreak was simulated. As mentioned previously, a simple control chart method issues an alert if the daily count of some observable event exceeds μ by kσ, where k is ordinarily 2 or 3. Using this traditional practice as a guideline, we considered an outbreak moderate if the average daily number of ED visits exceeded the mean number of such visits by about 2σ. So for each cell in the outbreak subregion, we determined the mean and SD σ_{cell} of the number of real ED visits during the entire year 2004. The severity level of the outbreak was based on a multiple of σ_{cell}. If, for example, the severity level was based on 2σ_{cell}, the average daily number of new cases was 2σ_{cell}. The duration of all outbreaks was set equal to 30 days. The multiples of σ_{cell} used for each epidemic curve function were as follows. For linear increasing outbreaks, severity level 1 used a daily average of 1.5σ_{cell} and severity level 2 used 2σ_{cell}; for quadratic increasing outbreaks the values were respectively 2σ_{cell} and 2.5σ_{cell}; and for cubic increasing outbreaks the values were respectively 2.5σ_{cell} and 3σ_{cell}. We used larger multiples in the case of the nonlinear functions because otherwise it would have taken too long for the number of new cases to reach a detectable level.
We made the simplifying assumption that the daily number of new outbreak cases reaches its peak in the middle of the outbreak and then declines in the same manner in which it increased. So it was assumed that half of the new cases occurred during the first half of the outbreak. In the case of outbreaks that methodically exhibited a linear increase in outbreak cases, we assumed that Δ of them occur on day one of the outbreak, 2Δ occur on day 2, and so on. The value of Δ can therefore be determined by solving Similar formulas were used for outbreaks that showed quadratic and cubic increases.
To force daily fluctuations, we deviated from simply making the number of new cases on day t equal to tΔ (linear case), t^{2}Δ (quadratic case), or t^{3}Δ (cubic case). We set two daily fluctuation levels, 25% and 50%. Half of the outbreaks were randomly selected to have the 25% level, and the remaining outbreaks were given the 50% level. If the level was 25%, on even numbered days we would make the number of new cases 25% of the previous day's number of new cases. Similarly if the level was 50%, on even numbered days we would make the number of new cases 50% of the previous day's number of new cases. We imposed daily fluctuations so we could evaluate the stability of the systems' outbreak assessments in the face of such variation.
Both PC and PCTS contain a distribution P of the chief complaints given the outbreak disease. Simulating chief complaints using P would be biased in favor of PC and PCTS. Therefore, to generate (via stochastic simulation) the chief complaint of each case, we used a variety of different chiefcomplaint probability distributions that differed significantly from P. For each type of outbreak (influenza or cryptosporidiosis) and for each type of increase (linear, quadratic, or cubic), 10 different distributions were used to generate the data. Each of these 10 distributions was applied to generate data for four of the 40 outbreaks. Appendix A and Jiang23 discuss these distributions in detail.
The dataset for experiment 2
We generated a total of 120 influenza outbreaks and 120 cryptosporidiosis outbreaks in the same manner as those used in experiment 1 except that the simulated outbreaks were also made to emerge in space. We modeled space emergence by injecting cases into one cell on days 1 and 2 of the outbreak, into two cells on days 3 and 4, into three cells on days 5 and 6, and so forth until all cells in the outbreak subregion were receiving new cases. We chose the first cell to receive outbreak cases by randomly selecting one of the cells in the outbreak subregion; we chose the second cell by randomly selecting a cell from all the cells that touched the first cell, and so on.
Converting patientspecific data to counts
Our datasets consist of semisynthetic, patientspecific ED data. However, SaTScanMT looks at one or more cumulative counts of events that are indicative of the event of interest. So, it was necessary to convert the patientspecific data to counts in order to provide input for SaTScanMT. Without loss of generality, let us discuss the data concerning influenza outbreaks. There are various ways we could develop the counts: (1) provide individual counts of each chief complaint; (2) provide individual counts of only the 20 chief complaints that are manifestations of influenza; or (3) provide individual counts of the few best manifestations of influenza (according to the probability distribution in PC). We did some preliminary experiments with all three approaches, and in the third approach we attempted to improve the input data for SaTScanMT by trying both the seven best manifestations and the three best manifestations. In these experiments, SaTScanMT performed best when method (3) was used with the three best manifestations. In this paper we only include that configuration. Appendix A contains additional details.
Evaluation methodology
We used AMOC curves26 to evaluate the systems' outbreak detection capabilities. In such curves, the false alarm rate is plotted on the xaxis and the mean days to detection on the yaxis. If an outbreak is not detected, the days to detection is set to a penalty. In our experiments we set it to 30, the number of days in the simulated outbreak.
The detection performances at various false alarm rates were further compared using a Bayesian test of significance, which is the posterior probability that one system's average time to detection (at a given false alarm rate) is greater than that of another system's under the assumption of prior ignorance (see Jiang23 for details). We will present the results using Bayesian posterior probabilities instead of p values.
We are also interested in how well the systems maintained the detection of an outbreak. The AMOCM curves are used to evaluate this, where the M stands for maintain. An AMOCM curve23 is like an AMOC curve except that the yaxis plots the average of the time at which an outbreak signal is detected and maintained thereafter. For example, if the outbreakdetection threshold is 0.05, and the sequence of daily outbreak posterior probabilities from a system is (0.01, 0.02, 0.07, 0.03, 0.04, 0.08, 0.09, 0.06, 0.08, 0.07), then the time at which the signal is maintained is 6 because on the 6th day the probability is 0.08, which exceeds 0.05, and it stays at or above 0.05 for the remaining days in the sequence that we are considering.
The PCTS and SaTScanMT not only detect an outbreak, but also determine the subregion in which the outbreak is occurring. We compared their accuracy concerning the detection of the correct subregion using the overlap coefficient. Let S be the actual outbreak subregion, T be the hypothesize outbreak subregion, and # return the number of zip codes in a subregion. Then
where ∩ denotes set intersection. The overlap coefficient is 0 if and only if the two subregions do not intersect (overlap), while it is 1 if and only if they overlap exactly.
Results for experiment 1
We evaluated how well PCTemporalSpatial (PCTS) and PC detected the outbreak disease that was injected (rather than just detecting that any outbreak is occurring). As discussed above, SaTScanMT was configured to detect the simulated outbreak disease. The posterior probability of the outbreak disease was used as the detection signal for PC and PCTS, and the likelihood ratio of the most likely subregion was used as the signal for SaTScanMT.
The results shown here concern the performance over all 120 outbreaks for each of influenza and cryptosporidiosis. Our purpose is to compare the systems' overall performances relative to a variety of outbreak behaviors.
Let P(μ_{PCc}>μ_{PCTSc}) be the posterior probability that PCTS has a smaller mean day to detection than PC for cryptosporidiosis outbreaks, and P(μ_{PCf}>μ_{PCTSf}) be that probability for influenza outbreaks. Let P(ν_{PCc}>ν_{PCTSc}) be the posterior probability that PCTS has a smaller mean day until maintaining detection than PC for cryptosporidiosis outbreaks, and P(ν_{PCf} >ν_{PCTSf}) be that probability for influenza outbreaks. For false alarm rates (r) equal to 0, 5, 10, and 15, our results showed that P(μ_{PCc} >μ_{PCTSc}), P(μ_{PCf}>μ_{PCTSf}), P(ν_{PCc}>ν_{PCTSc}), and P(ν_{PCf} >ν_{PCTSf}) are all greater than 0.9999. Each of these probabilities was obtained using Bayesian statistics under an assumption of prior ignorance. Jiang23 shows under the assumptions being made that such a posterior probability is equal to 1 minus the corresponding p value. For example, if P(μ_{PCc} > μ_{PCTSc})=0.96 then we can reject the hypothesis that μ_{PCc} is less than or equal to μ_{PCTSc} at the 0.04 significance level. It seems more straightforward to state our results using posterior probabilities rather than p values. These results support hypothesis 1, namely that when an outbreak is occurring in a small subregion, PCTS will detect the outbreak earlier than PC. The results also support hypothesis 2, namely that PCTS is better at maintaining the detection signal.
Table 3 shows the posterior probability that PCTS has a smaller mean day to detection (μ) than SaTScanMT and the posterior probability that PCTS has a smaller mean day until maintaining detection (ν) than SaTScanMT at various false alarm rates (r). The PCTS detected cryptosporidiosis outbreaks significantly earlier than SaTScanMT at false alarm rates of 0, 5, 10, and 15. However, SaTScanMT detected influenza outbreaks significantly earlier than PCTS at a false alarm rate of 5, while results for rates of 0, 10, and 15 were not significant (in the sense that the probability must be less than 0.05 or greater than 0.95 to be considered significant). We see also from table 3 that PCTS maintained detection of cryptosporidiosis outbreaks significantly earlier than SaTScanMT at all false alarm rates, but SaTScanMT maintained detection of influenza outbreaks significantly earlier than PCTS at most false alarm rates.
Figure 5A,B shows AMOC curves comparing the outbreak detection performance, while Figure 5C,D shows AMOCM curves comparing the detection maintenance performance. These curves are consistent with the results discussed above.
Appendix A (tables A.1 and A.2) contains an analysis of sensitivity that shows the fraction of the outbreaks that are ever detected. The sensitivity of PCTS is greater than that of PC at all false alert rates. The sensitivity of PCTS and SaTScanMT are both 100 per cent for cryptosporidiosis at all false alert rates. For influenza detection, SaTScanMT tends to have a sensitivity that is 5–10% better than that of PCTS, depending on the false alert rate.
The second and third columns of table 4 shows the posterior probability that PCTS has a larger average overlap coefficient (O) than SaTScanMT at various values of the Day into the outbreak. The subregion S that maximized was considered to be the subregion detected by PCTS, and the subregion that maximized the likelihood ratio was considered to be the one detected by SaTScanMT. Figure 6 shows the average values of the overlap coefficient for PCTS and SaTScanMT plotted against the Day into the outbreak. Notice that values fluctuate up and down on a daily basis. This is consistent with the imposed daily fluctuations in the number of new outbreak cases. The PCTS outperformed SaTScanMT for both types of outbreaks. Furthermore, its average value was about 0.7 by the seventh day of both types of outbreaks. These results support the hypothesis that when an outbreak is occurring in a small subregion, PCTS can closely estimate that subregion.
The average running times for the three systems, taken over all 240 outbreaks, were as follows: PC: < 1 s.; PCTS: 350 s.; SaTScanMT: 21 s. The implementation of SaTScanMT used four processors, whereas that of PCTS used only one. So the running time of PCTS compares a little more favorably than the times indicate, but it is still somewhat slower. Not surprisingly, PC's running time was much faster, given that it is based on a simpler model. We also expected that PCTS would be slower than SaTScanMT because PCTS is a more complex system that does more computations. As discussed above, PCTS is a true temporal system in that it models emerging patterns in time while SaTScanMT does not. Furthermore, PCTS is a patientspecific system which is normally more computationally costly than a system that takes accumulated counts as input.
Results for experiment 2
In this experiment, we further compared PCTS and SaTScanMT by evaluating how well they detected outbreaks that were emerging in both time and space. Table 5 shows the posterior probability that PCTS has a smaller mean day to detection (μ) than SaTScanMT and the posterior probability that PCTS has a smaller mean day to maintaining detection (ν) than SaTScanMT at various false alarm rates (r). The results concerning detection are similar to those for nonemerging outbreaks, except that PCTS faired better with the influenza outbreaks. That is, in the case of cryptosporidiosis outbreaks, PCTS performed significantly better than SaTScanMT at both detection and detection maintenance for emerging outbreaks. In the case of influenza outbreaks, PCTS overall performed better at detection, although not statistically significantly so, and in most cases did not perform statistically significantly worse at detection maintenance.
Figure 7 shows AMOC curves comparing PCTS and SaTScanMT when detecting emerging outbreaks. These curves are consistent with the results just discussed.
The fourth and fifth columns of table 4 show the posterior probability that PCTS has a larger average overlap coefficient (O) than SaTScanMT at various values of the Day into the outbreak for emerging outbreaks. The results are similar to those for nonemerging outbreaks. That is, PCTS usually outperformed SaTScanMT for both types of outbreaks.
Discussion
We developed a method for converting a nonspatial, nontemporal Bayesian outbreakdetection system to a spatiotemporal one. Using this framework, we extended the PC system to create the spatiotemporal system PCTS. We hypothesized that: (1) when an outbreak is occurring in a small subregion, PCTS will detect the outbreak earlier than PC; (2) PCTS is better than PC at maintaining the detection signal; (3) PCTS can closely estimate the subregion in which an outbreak is occurring; and (4) PCTS will perform as well or better than an existing stateoftheart spatial diseaseoutbreak detection system.
The experimental results support that PCTS provides improved disease outbreak detection, relative to PC from which it was derived. Besides often detecting outbreaks earlier (at a given false alert rate), PCTS was better at maintaining a stable detection signal over time.
The PCTS is a patientspecific system that models each individual in a population. We also hypothesized that such a system might obtain better detection performance than one that uses a summary statistic such as daily counts. To test this hypothesis, we compared PCTS to SaTScanMT. In the case of cryptosporidiosis outbreaks, PCTS performed substantially better than SaTScanMT at outbreak detection, detection maintenance, and subregion detection.
In the case of influenza outbreaks, however, PCTS performed at a level comparable to SaTScanMT at outbreak detection and worse at outbreak detection maintenance. SaTScanMT had a modestly better sensitivity for detecting influenza. The PCTS substantially outperformed SaTScanMT at subregion detection. One reason for these mixed results may be the following. Influenza outbreaks are fairly difficult to detect because influenza has symptoms such as cough and fever, which are common to several other outbreak diseases and often even occur when there is no outbreak. The PCTS ordinarily detects that some outbreak is occurring better than it detects that a specific outbreak is occurring, and this should hold even more for a difficult disease such as influenza. The SaTScan MT does not simultaneously investigate different outbreaks; it only looks at counts of observable events. As discussed above and in more detail in appendix A, SaTScanMT performed poorly when we gave it counts of all 20 chief complaints that are manifestations of influenza, and performed much better when we only gave it the top three chief complaints. We only showed the results for its better performance. The PCTS simultaneously models 12 different outbreaks and reports both the overall probability of an outbreak and the probability of an outbreak of each outbreak disease. If we had compared PCTS's ability to detect any one outbreak to SaTScanMT or if we had optimized PCTS to only investigate an influenza outbreak by looking only at the top three chief complaints, PCTS may have faired better in the comparisons.
It is reasonable to ask whether SaTScanMT may have been at a disadvantage in the evaluation, due to it modeling circular outbreak regions. Two points attenuate concern. First, each outbreak we generated consisted of counts occurring in zip codes whose centroids were in a simulated outbreak rectangle. Thus, these outbreak subregions were not strictly rectangular in shape. Second, we used rectangles that were 2×1 cells, 2×2 cells, and 3×2 cells, which are not highly elongated shapes. These two points mean that the simulated outbreaks consisted of relatively compact subregions that were not restricted to be rectangular in shape.
Our comparisons were done using simulated outbreaks which showed methodical daily fluctuations. A more realistic comparison would be to evaluate the performance of the systems using real outbreaks. Unfortunately, there are not that many real outbreaks with available data.
Being Bayesian and allowing patientspecific modeling imparts several advantages to the BNST framework. Since it is Bayesian we can incorporate both expert knowledge and knowledge learned from data into a BNST system. A method such as SaTScan bases its analysis only on the current data and is not able to take advantage of prior knowledge. The Bayesian framework, however, has a price: to properly use the framework we need to devote considerable effort to obtaining and representing prior knowledge. This knowledge/belief could change over time and from one region to another, and thus be hard to manage and maintain. On the other hand, a datadriven system, such as SaTScan, can readily be applied in many contexts without modification. Thus, both approaches have advantages and disadvantages. Nonetheless, we believe it may be possible to significantly improve methods for acquiring, managing, and maintaining prior knowledge about disease outbreaks within a Bayesian framework, and we believe this topic to be a promising area in need of considerable additional research.
Footnotes

Funding This research was supported by grant IIS0325581 from the National Science Foundation.

Competing interests None.

Ethics approval This study was conducted with the approval of the University of Pittsburgh.

Provenance and peer review Not commissioned; externally peer reviewed.

↵i Classic SaTScan looks for clusters of an event in circular subregions of the monitored region. Its space–time extension looks for clusters in three dimensions, where the 3rd dimension is time. So it will investigate 1day cylinders, 2day cylinders, etc. This system would not be considered a temporal system according to our definition in the introduction because it does not investigate how patterns change over time. However, it is not a classical nontemporal system either because it does not simply look at one unit of time. It can be viewed as a system that simultaneously investigates clusters using various units of time (1day unit, 2day unit, etc).