SIR model

As described above, the SIR model is a compartmental model commonly used to describe infectious disease outbreaks. Because a large literature describes both the history and use of SIR models (e.g., see Keeling and Rohani [20]), we present a somewhat abbreviated description here.

People are assigned to three compartments based on their disease status at time t (see Fig. 1). The number of people in each compartment varies with time as the outbreak progresses, but the overall population in the model stays constant. Susceptibles (S) are those who are at risk of infection. Infected persons (I) are individuals experiencing the illness, and recovered persons (R) have completed infection and are now immune to the disease, or have died as a result of the infection. Movement between compartments is described by a system of three differential equations, equations (1) through (3).
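In its standard form, and assuming frequency-dependent transmission with a constant total population N = S + I + R (the normalization by N is an assumption here; some formulations absorb it into β), the system can be written as:

```latex
\begin{align}
\frac{dS}{dt} &= -\frac{\beta S I}{N} \tag{1}\\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I \tag{2}\\
\frac{dR}{dt} &= \gamma I \tag{3}
\end{align}
```

The three right-hand sides sum to zero, which is what keeps the total population constant.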

Transitions between compartments are described by two parameters, γ and β. γ is the reciprocal of ψ, the infectious period (see equation (5)). The infectious period is the interval of time during which an infected individual can transmit the disease. Time in our model is represented in days. Measles, for example, has an infectious period of approximately 8 days; individuals can transmit the disease from approximately 4 days before rash onset until 4 days after [21]. γ controls the transition from infectious to recovered (see equations (2) and (3)). β, the transmission parameter, is often loosely referred to as the “force of infection” because it describes how quickly a disease can move through a population. It is the product of γ and the reproductive number (R0), and controls the transition from susceptible to infectious (see equations (1) and (2)). The reproductive number, R0, is the number of secondary infections per primary infection (see equation (4)). Of note, R0 can be estimated in a number of ways. Obadia et al. [22] give an overview of several common methods, including exponential growth, maximum likelihood, sequential Bayesian and time-dependent methods. Each method makes slightly different assumptions and can yield different R0 estimates. Values should be calculated and interpreted with these caveats in mind [23].
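The relationships among ψ, γ, β and R0 can be illustrated with a minimal Python sketch. It assumes daily forward-Euler steps and frequency-dependent transmission; the function name, the population of 1,000, the single initial case and the R0 value of 15 in the usage line are illustrative assumptions, not values or code taken from this study.

```python
def simulate_sir(r0, infectious_period, population=1000,
                 initial_infected=1, days=365):
    """Daily-step SIR sketch; returns a list of (S, I, R) tuples, one per day."""
    gamma = 1.0 / infectious_period  # recovery rate: reciprocal of the infectious period
    beta = r0 * gamma                # transmission parameter: product of gamma and R0
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # Cap each flow so that no compartment can become negative.
        new_infections = min(s, beta * s * i / population)
        new_recoveries = min(i, gamma * i)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Usage: an 8-day infectious period (the measles figure quoted in the text)
# with an illustrative, assumed R0 of 15.
measles_like = simulate_sir(r0=15, infectious_period=8)
final_s, final_i, final_r = measles_like[-1]
print(f"final recovered: {final_r:.1f} of 1000")
```

A one-day Euler step is the simplest possible discretization; a production implementation would more likely use an ODE solver with adaptive step size.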

SIR augmentations

To expand the above model and introduce a scenario where a control measure is applied to an outbreak, we introduce two additional parameters:

λ, or control measure effectiveness, describes what fraction of individuals are removed from the susceptible population at each time point. For example, λ = 0 is a control that is completely ineffective and removes no individuals from the susceptible population during each time interval. Conversely, λ = 1 is a control measure that is 100% effective, or removes all susceptible individuals in one time interval. A more realistic value might be λ = 0.01, which would describe an intervention that removes 1% of the susceptible population during each time interval. A more detailed description of the appropriate interpretation of this parameter is included below.

τ, or control start, is the time unit (interpreted as days throughout this analysis) on which the control measure begins. For the purposes of the simulations presented here, it is assumed that the control is implemented on day τ and continued for the remainder of the outbreak.

To apply a control measure, equations (1) through (3) are modified such that when t ≥ τ, a control measure with effectiveness λ is applied at each time step (see equations (6), (7) and (8)). To denote the “controlled” environment, the subscript T is added.
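One way the controlled system might be written is sketched below. It assumes that the control acts as an extra outflow λS_T from the susceptible compartment once t ≥ τ, and that individuals removed by the control are not added to R_T, so that R_T counts only people who were actually infected. The indicator notation and the normalization by N are introduced here for illustration.

```latex
\begin{align}
\frac{dS_T}{dt} &= -\frac{\beta S_T I_T}{N} - \lambda\,\mathbf{1}_{\{t \ge \tau\}}\,S_T \tag{6}\\
\frac{dI_T}{dt} &= \frac{\beta S_T I_T}{N} - \gamma I_T \tag{7}\\
\frac{dR_T}{dt} &= \gamma I_T \tag{8}
\end{align}
```

Before day τ the indicator is zero and the system is identical to the uncontrolled model.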

Within this implementation, λ is the fraction of the susceptible population removed at each time point. Operationally, this describes a control measure that eliminates the possibility of infection. For example, λ = 0.1 and τ = 5 indicates a scenario where 10% of the remaining susceptible population is removed every day on and after the 5th day. This would describe an intervention like vaccination or quarantine. However, it does not describe control measures that reduce infectivity, that change human behavior in a way that affects population density (e.g., staying home from work/school), or that do not confer immunity to the infection (e.g., hand washing). Further, the control measure is applied continuously at each time point after initiation (e.g., 10% of the remaining susceptible population is vaccinated each day for the remainder of the outbreak). This assumes that the control is implemented with the same effectiveness in every time interval. These assumptions simplify the addition of control measures to the SIR model, but it would be relatively straightforward to modify this implementation to represent a more realistic scenario. These equations are also limited to describing one control measure. It would be comparatively straightforward to replace the single term λ with a vector of parameters in order to describe multiple control types and levels of effectiveness.
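The λ = 0.1, τ = 5 scenario described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's implementation: it uses daily steps, applies the control after that day's infections, numbers days from 1, and tracks protected individuals in a hypothetical bookkeeping compartment (`removed`) so that I + R still counts only people who were infected. The β and γ values in the usage lines are arbitrary.

```python
def simulate_controlled_sir(beta, gamma, lam, tau, population=1000,
                            initial_infected=1, days=365):
    """Daily-step SIR sketch in which a fraction `lam` of the remaining
    susceptibles is removed on every day t >= tau."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    removed = 0.0  # hypothetical compartment for vaccinated or quarantined people
    for t in range(1, days + 1):
        new_infections = min(s, beta * s * i / population)
        new_recoveries = min(i, gamma * i)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        if t >= tau:
            # Assumed ordering: the control acts after the day's infections.
            protected = lam * s
            s -= protected
            removed += protected
    return {"S": s, "I": i, "R": r, "removed": removed}


# Usage with arbitrary illustrative parameters (R0 = beta / gamma = 2).
uncontrolled = simulate_controlled_sir(beta=0.5, gamma=0.25, lam=0.0, tau=5)
controlled = simulate_controlled_sir(beta=0.5, gamma=0.25, lam=0.1, tau=5)
print(uncontrolled["I"] + uncontrolled["R"], controlled["I"] + controlled["R"])
```

Setting `lam=0.0` reproduces the uncontrolled outbreak, which gives a convenient counterfactual from the same function.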

Assumptions

There are a number of assumptions inherent in SIR models [24]. As a result, SIR models are scoped to diseases that meet the following criteria:

The disease is transmitted person-to-person. This means the disease is not transmitted via a vector or an environmental component such as water or food.

Disease transmission can be described via homogeneous mixing. This means that if a group of susceptible people interacts with an ill person, all susceptible persons are equally likely to acquire infection. Importantly, a majority of infections do not meet this assumption. For example, sexually transmitted infections are not transmitted with equal probability among the entire susceptible population. Even in the case of airborne infections like measles, this assumption ignores individuals’ specific immune responses (e.g., immunocompromised individuals are treated the same as healthy individuals).

The disease confers immunity. This means that once an individual has recovered, they cannot contract the illness again during the same outbreak. Diseases with very short-term (or no) immunity are commonly modeled with SI models.

The disease’s incubation period is relatively short. Diseases with long incubation periods should include the “exposed” category and can be modeled with a SEIR model [25].

The disease is an acute illness (i.e., infected individuals recover or die). This excludes chronic diseases like hepatitis.

κ - Comparing controlled and uncontrolled outbreaks

In order to compare a controlled outbreak to its counterfactual outbreak, we introduce the outcome measurement, κ. It describes the ratio of the cumulative number infected in a controlled outbreak to the cumulative infected in an uncontrolled outbreak. The cumulative number infected throughout the outbreak at time t is equal to the number recovered at time t plus the number infectious at that time (see equation (9)). At the end of an outbreak (t = end) there are no individuals left in the infected category and equation (9) reduces to equation (10).

κ = 1 means that the controlled and uncontrolled outbreaks are identical, or that the control measure had no effect. κ > 1 means that the controlled outbreak had more infected than the uncontrolled outbreak (or that the control measure had a detrimental effect). κ < 1 indicates that the control measure reduced the number of infected persons. For example, κ = 0.01 is interpreted to mean that the controlled outbreak was 1% as large as the uncontrolled outbreak. κ = 0 would describe a scenario where the control measure stopped the outbreak from occurring at all.
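As a small worked illustration of this ratio, the Python sketch below computes κ from the infectious and recovered counts of a controlled and an uncontrolled outbreak. The function and argument names are hypothetical, and the counts in the usage lines are made up for illustration.

```python
def cumulative_infected(infectious, recovered):
    """Cumulative number infected at a given time: infectious plus recovered."""
    return infectious + recovered


def kappa(i_controlled, r_controlled, i_uncontrolled, r_uncontrolled):
    """Ratio of cumulative infected in the controlled outbreak to the
    cumulative infected in its uncontrolled counterfactual."""
    denominator = cumulative_infected(i_uncontrolled, r_uncontrolled)
    if denominator == 0:
        raise ValueError("the uncontrolled outbreak has no infections")
    return cumulative_infected(i_controlled, r_controlled) / denominator


# At the end of an outbreak nobody is still infectious, so only the
# recovered counts matter. Made-up counts: 10 versus 1,000 cases.
print(kappa(0, 10, 0, 1000))
```

With these made-up counts the controlled outbreak is 1% as large as the uncontrolled one, matching the κ = 0.01 interpretation given above.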

Sensitivity analysis

To assess the effect of each parameter on the model, we performed a sensitivity analysis. We varied γ, β, λ and τ within specific ranges (see Table 1) at random for 10,000 trials. During each trial, we randomly picked the value of each input parameter from the specified range, ran the model using those values, and recorded the outcomes. Here, outcomes of interest are the number infected in a controlled scenario, the number infected in an uncontrolled scenario and the related κ. We then analyzed each parameter’s relative impact on these outcomes. Because γ and β vary together and can be described simultaneously using R0 (see equation (4)), we also analyzed how the change in R0 affects the outcomes. Because λ and τ only exist as parameters in controlled outbreaks, the controlled scenario is the only outcome considered for these two parameters.
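The sampling loop described above might look like the following Python sketch. It is not the study's code: the daily-step model, uniform sampling, the fixed seed, and the γ and β ranges are all assumptions made for illustration (Table 1 is not reproduced here); only the λ and τ ranges echo values mentioned elsewhere in the text.

```python
import random


def final_size(beta, gamma, lam=0.0, tau=None, population=1000, days=365):
    """Cumulative infected (I + R) after `days` daily steps. The control is
    applied only when `tau` is given."""
    s, i, r = population - 1.0, 1.0, 0.0
    for t in range(1, days + 1):
        new_inf = min(s, beta * s * i / population)
        new_rec = min(i, gamma * i)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        if tau is not None and t >= tau:
            s -= lam * s  # remove a fraction lam of remaining susceptibles
    return i + r


def run_trials(ranges, trials=10_000, seed=0):
    """Draw each parameter uniformly at random and record the outcomes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(trials):
        gamma = rng.uniform(*ranges["gamma"])
        beta = rng.uniform(*ranges["beta"])
        lam = rng.uniform(*ranges["lam"])
        tau = rng.randint(*ranges["tau"])  # whole days, inclusive bounds
        uncontrolled = final_size(beta, gamma)
        controlled = final_size(beta, gamma, lam, tau)
        rows.append({"gamma": gamma, "beta": beta, "lam": lam, "tau": tau,
                     "r0": beta / gamma, "uncontrolled": uncontrolled,
                     "controlled": controlled,
                     "kappa": controlled / uncontrolled})
    return rows


# Assumed, illustrative ranges; not the values of Table 1.
RANGES = {"gamma": (0.05, 0.5), "beta": (0.1, 2.0),
          "lam": (0.005, 0.3), "tau": (3, 30)}

rows = run_trials(RANGES, trials=200)
print(len(rows), min(row["kappa"] for row in rows))
```

Each row keeps the sampled inputs next to the outcomes, so the per-parameter comparisons can be made afterwards by plotting or correlating any input column against `controlled`, `uncontrolled` or `kappa`.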

Application to three diseases

We applied this model to measles, norovirus, and influenza. These diseases were selected because they are of public health interest, and because they meet the requirements described in the assumptions section above. Of note, norovirus can be transmitted both through food and from person to person. The outbreaks described here are solely person-to-person outbreaks.

Outbreaks were simulated using standard parameter ranges for the three diseases. These ranges were based on parameter values reported in the literature, identified via searches of Google Scholar and PubMed. Search terms included “[disease name] + infectious period”, “[disease name] + contact rate”, “[disease name] + force of infection”, and “[disease name] + reproductive number”. We were consistently unable to find reported literature values for β and instead used equation (5) to find the maximum and minimum β values for each disease, given their infectious period and R0 values.
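Since β is the product of γ and R0, and γ is the reciprocal of the infectious period, the bounds on β follow directly from the bounds on the other two quantities. The short Python sketch below shows the arithmetic; the function name and the numbers in the usage line are illustrative assumptions, not the values of Table 2.

```python
def beta_range(r0_range, infectious_period_range):
    """Minimum and maximum beta implied by ranges of R0 and of the
    infectious period (in days), using beta = R0 / infectious period."""
    r0_min, r0_max = r0_range
    period_min, period_max = infectious_period_range
    # The smallest beta pairs the smallest R0 with the longest infectious
    # period; the largest beta pairs the largest R0 with the shortest.
    return r0_min / period_max, r0_max / period_min


# Illustrative, assumed inputs: R0 between 2 and 4, infectious period 4 to 8 days.
print(beta_range((2, 4), (4, 8)))
```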

Rather than attempting to identify control parameters (λ and τ) based on literature values, we intentionally selected a broad range of possible parameters to observe the effects of a wide variety of controls. To account for the logistical work that precedes control initiation (identifying the outbreak, laboratory confirmation, mobilizing resources, etc.), we selected a minimum control start of 3 days. We then selected upper bounds based on typical outbreak progression for each disease. Measles and influenza can both result in outbreaks that are several weeks to months long. Thus, we selected one month (30 days) as the upper bound of τ. Conversely, norovirus outbreaks are typically much shorter due to their short infectious period. We thus limited the latest possible control start to 7 days. All λ values were varied between 0.005 and 0.3 (0.5% to 30%). Table 2 describes the ranges used for each parameter and disease.

Development of a web-based tool

To make this model available for decision making, we developed a web-based application that allows a user to enter parameter ranges for their disease, initial population variables and control information. It is a Django application [26] that uses HighCharts [27] for visualization. All code for the SIR model was written in Python 3.5 [28].

To make the application more user friendly, two small modifications were made so that γ and β could be expressed in terms of the infectious period (see equation (11)) and R0 (see equation (12)). These terms are more familiar to public health practitioners than γ and β, which are commonly used by modelers.

Sensitivity analysis

Figure 2 shows the results of the sensitivity analysis. Plots show each parameter with respect to the cumulative number infected (in either controlled or uncontrolled outbreaks), and are colored by the range of the associated κ score. γ shows a strong negative correlation with the cumulative number infected: larger γ values, which correspond to shorter infectious periods, result in smaller outbreaks. R0 shows a positive association with the number of persons affected: more quickly moving outbreaks infect more people. Both correspond to our intuition about outbreak progression. Diseases with short infectious periods infect fewer individuals because each case is infectious for less time and can thus spread the disease to fewer persons. Conversely, large R0 values correspond to situations where each host can infect many other people, resulting in much larger outbreaks.

Results further indicate that β, λ and τ affect overall outbreak size substantially less. There is a weak association between β and outbreak size in controlled outbreaks, as well as a possible association between β and κ, but essentially no association between λ and outbreak size or λ and κ.

Disease application

Figure 3 describes a number of outcomes for measles, norovirus and influenza outbreaks based on literature parameter values (see Table 2) and the resulting κ. Patterns are recognizable both within and across diseases. Within norovirus, for example, several parameter combinations produce no outbreak (here defined as fewer than 2 cases total; see gray dots). In particular, as γ takes larger values (>0.3), individuals progress from infected to recovered too quickly to pass the illness to others. This is consistent with many point source norovirus outbreaks where the number of secondary cases is generally quite small.

Conversely, the vast majority of measles outbreaks simulated are essentially unaffected by any control measure tested (see dark blue dots that indicate controlled and uncontrolled outbreaks are ≥95% similar). Within a given cross-section of outbreak parameters, the τ value (control measure start) affects the resulting κ more than λ does, indicating that, under this model, implementing a control measure early is more important than implementing the most effective control measure. This is a potentially important finding for decision support and is an intriguing path for further investigation. It is also consistent with our sensitivity analysis findings (see Fig. 2).

For each disease, we identify the latest possible control start and the least effective intervention that could still result in κ values of 0.1 and 0.01 (see Table 3). Interestingly, if control measures have λ values that are large enough (minimum 5%), or control starts that are early enough (6 to 30 days), they can consistently produce dramatic reductions in outbreak load. By examining these values in various parameter ranges, we can begin to see the effects of parameter ranges on κ results. In Table 3, we consider (1) the entire range, (2) the lower 50th percentile of both γ and β values and (3) the upper 50th percentile of both γ and β values. There is a strong distinction between outbreaks with large values (upper 50th percentile) and outbreaks with small values (lower 50th percentile). For example, in measles outbreaks, although it is possible to reduce the outbreak to 10% of the uncontrolled outbreak by beginning a control measure 29 days after outbreak onset, dividing the outbreaks into upper and lower 50th percentiles indicates this is only possible if both the β and γ parameters fall into the lower 50th percentile and the control is at least 19% effective. Similar but less dramatic trends are evident in norovirus and influenza.
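The idea of a latest workable control start and a least effective workable control can be illustrated with a simple exhaustive scan. The Python sketch below is a swapped-in illustration, not necessarily how Table 3 was produced: it fixes a single (β, γ) pair, uses an assumed daily-step model, and scans a grid of candidate τ or λ values. All names and numeric values are hypothetical.

```python
def final_size(beta, gamma, lam=0.0, tau=None, population=1000, days=365):
    """Cumulative infected (I + R) after `days` daily steps. The control is
    applied only when `tau` is given."""
    s, i, r = population - 1.0, 1.0, 0.0
    for t in range(1, days + 1):
        new_inf = min(s, beta * s * i / population)
        new_rec = min(i, gamma * i)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        if tau is not None and t >= tau:
            s -= lam * s
    return i + r


def kappa(beta, gamma, lam, tau):
    """Controlled outbreak size relative to the uncontrolled counterfactual."""
    return final_size(beta, gamma, lam, tau) / final_size(beta, gamma)


def latest_control_start(beta, gamma, lam, target, tau_values):
    """Largest candidate start day that still achieves kappa <= target."""
    feasible = [tau for tau in tau_values
                if kappa(beta, gamma, lam, tau) <= target]
    return max(feasible) if feasible else None


def least_effective_control(beta, gamma, tau, target, lam_values):
    """Smallest candidate effectiveness that still achieves kappa <= target."""
    feasible = [lam for lam in lam_values
                if kappa(beta, gamma, lam, tau) <= target]
    return min(feasible) if feasible else None


# Illustrative, assumed outbreak parameters (R0 = 2) and candidate grids.
print(latest_control_start(0.5, 0.25, lam=0.3, target=0.1,
                           tau_values=range(3, 31)))
print(least_effective_control(0.5, 0.25, tau=3, target=0.1,
                              lam_values=[0.005, 0.01, 0.05, 0.1, 0.2, 0.3]))
```

Both helpers return `None` when no candidate on the grid reaches the target, which distinguishes "not achievable within the ranges tested" from a numeric answer.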

These examples illustrate the possible use of models like this for decision support. By aggregating several models, it is possible to identify general trends that are relevant for intervention decisions.

User interface

To facilitate widespread use of the model, a user interface was developed. Figure 4 shows an example of user data and application output. Output includes the smallest and largest SIR curves possible based on user input, as well as the effect of user control measures on those curves. An additional three graphs describe how outputs (κ) change with changing R0, β and γ values, and describe the minimum control effectiveness required to reduce the outbreak tenfold. Visualizing the data in multiple ways allows the user to see different aspects of the same outbreak and supports decision making. In the example presented, the second graph (titled ‘Intervention analysis’) indicates that changing the control start date by a few days in either direction minimally impacts the resulting κ score, regardless of the R0 value. However, changing control effectiveness from 0.01 to 0.1 dramatically increases κ.

Discussion

We conducted this study to evaluate the feasibility of a simplified approach to decision support for control measure intervention. The larger goal is the development of methodologies that improve collaboration between public health and modeling communities, which in turn can facilitate optimal disease response during outbreaks.

Our results suggest that it is reasonable to simultaneously explore the impact of a variety of control measures on outbreak progression in a number of scenarios using simple SIR models. We do so while using a range of outbreak parameters, to understand the effects of both outbreak parameters and control efforts on outbreak progression. We show that, in this model, γ affects the outbreak outcomes most substantially. We further provide a way to measure the relative success of outbreak control using the κ value. We lastly present one possible method to promote adoption of models in the public health community by presenting a simple, web-based interface for the model.

Compartmental models are in many ways preferable to agent-based models because of their simplicity and small computational requirements. However, the use of SIR models necessitates adoption of several assumptions that rarely hold in real world outbreaks. Of particular concern is the assumption of homogeneous mixing. There are, nonetheless, numerous ways to improve upon the simple model described here. Other compartmental models (e.g., SEIR, SIS, SI, etc.) and methods exist to reduce or modify these assumptions and expand the breadth of applicable diseases. For example, Hethcote et al. describe a method to allow non-homogeneous mixing within compartmental models for sexually transmitted infections [29].

Another possibility is the addition of an underlying network to improve model behavior. For example, Meyers et al. [30] found that coupling a compartmental model with an underlying social network allowed them to explain aspects of real SARS epidemics (used for illustration purposes in the introduction) better than the compartmental models alone. Related possibilities include the addition of spatial networks alongside or instead of social networks [31]. It is possible that various networks are suited to particular diseases or disease scenarios. These subtleties offer opportunities for extensive further research. Importantly, many variations of these models continue to maintain comparatively low computational requirements, while allowing for a better representation of reality.

Another, related focus should be continued research on the impact of parameter selection on model outputs. Here, we describe an approach where parameters are assumed to be known (or estimable), and the range of possible outbreaks is treated as an outcome. In contrast, Wearing et al. estimate parameters by finding the best simulated outbreak fit to real data and identifying the parameters that give rise to that simulation [32]. Their results caution that model selection (e.g., the type of compartmental model used) can dramatically affect the resulting estimate of the reproductive ratio. Our results indicate that, in addition, variations in the reproductive ratio produce exceedingly different outbreaks. Meyers et al. [30] also note the large impact parameter selection and network structure can have on resulting simulated outbreaks.

One obvious possible improvement is in the continued production and extension/refinement of tools to utilize compartmental models and afford control measure simulation quickly and easily. The tool presented here, for example, might be enhanced by adding new compartmental models, refining control definitions, improving visualization, and investigating addition of network structures. Deployment of these systems as open-source code, or freely available web applications should be encouraged.

Overall, there is a clear need in the field to better understand outbreak parameters, model selection, underlying model assumptions, and the ways that these apply to real world scenarios. While SIR models have been used extensively for many years, there has been little work done on validating their output. We thus propose thoughtful validation of SIR models as an important next step. One method to accomplish this is to compare the outputs of validated agent-based models to outbreaks produced using compartmental models. Previously validated agent-based models simulating disease outbreak progression on a fine-tuned scale already exist (e.g., EpiSimS [33, 34]) and would provide good candidates for this research.

Such a validation would accomplish several things. It would (1) validate the counterfactual approach, (2) provide additional data to describe when compartmental models are appropriate approximations of real world outbreaks and (3) provide data to describe situations where the compartmental models do not match real world outbreaks and should not be used for decision support.

Additional Information

How to cite this article: Daughton, A. R. et al. An approach to and web-based tool for infectious disease outbreak intervention analysis. Sci. Rep. 7, 46076; doi: 10.1038/srep46076 (2017).
