Background

Middle East respiratory syndrome (MERS) is a viral infectious disease caused by the MERS associated coronavirus (MERS-CoV), which was first identified in September 2012. Infections with this virus have been reported from countries in the Middle East where dromedary camels have been found to be an important reservoir of the coronavirus. The case fatality ratio of MERS among confirmed cases has been estimated at about 40 %. Human-to-human transmission has been facilitated in healthcare settings with the contribution of hospital-based transmission of MERS estimated at about 80 % using an epidemic model. The pandemic risk that MERS poses is limited by the fact that R0 (the basic reproduction number, i.e., the average number of secondary cases produced by a single primary case) is estimated at below 1 and that MERS outbreaks have been greatly confined to the healthcare environment until now [4, 5]. Antibodies against MERS-CoV have been found among both dromedary camel populations and camel-exposed humans in Saudi Arabia and other countries of the Middle East [6, 7], which have often been the source of multiple MERS case importations around the world. As highlighted by the recent MERS outbreak in the Republic of Korea since May 2015 [8, 9] together with one case importation with no secondary cases into Thailand in June 2015, the risk for global spread of MERS is sufficiently serious to warrant the need to find ways to assess the risk of importing MERS case around the world, because those single importations could potentially escalate into regional outbreaks that could, in turn, lead to a serious damage to economic and public health activities.

To support risk assessment practice on potential MERS outbreaks, it could be helpful to pre-emptively characterize the risk of importing MERS case across countries. By estimating the risk of MERS importations, one can comparatively understand the country-specific risk, and such understanding may help raise situation awareness among the general public, facilitating preventive action planning. In this context, a metapopulation epidemic model that uses the airline transportation network has been employed for predicting the global spread of various emerging infectious diseases, including severe acute respiratory syndrome (SARS), the influenza pandemic H1N1-2009 and Ebola virus disease [12–14]. These transmission models have not only helped study epidemic scenarios but also estimated the effectiveness of specific countermeasures targeting travelers, e.g., entry and exit screenings and travel restrictions [15, 16]. While the metapopulation epidemic model is simple in its structure, epidemic forecasting by fitting the model to empirical data entails optimization procedures that are often computationally intensive. Moreover, while a few published studies have already investigated the risk of MERS associated with mass gatherings in Saudi Arabia [17, 18], the distribution of secondary cases per single primary case is highly over-dispersed [4, 11] with the overall R0 falling below 1. Hence, the metapopulation type models that have been used to evaluate epidemics of communicable diseases (e.g., R0 > 1) may not be directly and immediately applied to model the global spread of MERS.

While metapopulation models are useful for capturing various aspects of global transmission dynamics, a simple yet tractable prediction model to carry out fast risk assessment by public health authorities is needed. Such a tool might just quantify the basic information, e.g. (i) the percentage of importing MERS case at a point in time and (ii) when the importation of MERS case is expected. The present study aims to devise and apply a novel statistical model to predict the risk of MERS case importations by country. We assess the predictive performance of our approach and use it to identify countries at high risk of MERS importations.

Empirical datasets

In order to quantify our statistical model, we used data on imported laboratory-confirmed cases of MERS as of 26 June 2015 (the latest date on which our data analysis was conducted), especially focusing on the date on which the first diagnosed case arrived in each importing country. The date of entry of imported MERS cases is hereafter referred to as the arrival time and the corresponding information was retrieved from secondary data sources including the European Centre for Disease Prevention and Control (ECDC) and the World Health Organization. Since it was sometimes difficult to determine if a case was the result of an importation event instead of local transmission (e.g., spillover event or exposure to an undiagnosed case), original case reports were also tracked, especially among diagnosed cases in Middle East countries [21–23]. Qatar and the Kingdom of Saudi Arabia were excluded from countries at risk of importation, because indigenous cases with the history of exposure to dromedary camels have been recurrently reported after the identification of the first case in 2012.

In addition to the arrival time of MERS case importations; three pieces of further information were retrieved. First, weekly incident counts of MERS in Saudi Arabia were used to mirror the force of infection among travelers. Second, the number of flight routes between pairs of countries was obtained from the airline transportation network data. The total number of flight routes between each pair of countries has an approximate dimension of 3 times 4,600 (or with 230 nodes and 4,600 edges) and was obtained from the Global Flights Network derived from the OpenFlights database as on 10 November 2014. Third, dichotomous data to identify the major religion of each country which is in common with Saudi Arabia and Qatar was obtained from the literature: a country in which more than 30 % of the population is Muslim was defined as a Muslim majority country.

Building risk models

Here we describe the proposed model aimed to predict the risk of importation in each country. Let Fj,t be the cumulative distribution function representing the probability that MERS has already been imported to country j by discrete day t. The day t = 0 corresponds to the date of the illness onset of first identified MERS case, and throughout the present study, we set 3 September 2015 as day zero (i.e. the date on which the initially identified case experiences symptoms). Although a few earlier cases were confirmed as MERS by inspecting laboratory data a number of days after their dates of death, dates of illness onset among deceased cases were unavailable and had to be discarded when calculating the arrival time. Using Fj,t, the probability that a country j has not yet imported MERS by day t is1\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {S}_{j,t}=1-{F}_{j,t}. $$\end{document}Sj,t=1−Fj,t.

The daily risk of importing MERS in country j on day t is defined as2\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,t}=\frac{F_{j,t+1}-{F}_{j,t}}{S_{j,t}}. $$\end{document}λj,t=Fj,t+1−Fj,tSj,t.

Thus, we have3\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {S}_{j,t}={S}_{j,0}{\displaystyle {\prod}_{k=0}^{k=t-1}\left(1-{\lambda}_{j,k}\right)}. $$\end{document}Sj,t=Sj,0∏k=0k=t−11−λj,k.

We parameterized the daily risk λ by examining the statistical performance of different types of model parameterizations. In all models that we examined, we use the so-called “effective distance”, initially proposed by Brockmann and Helbing. The metric is derived from the airline transportation network, originally based on itinerary data, by using the transition matrix and length of paths between countries. The effective length of a path {n1, n1, ⋯, nL} is given by4\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ L- \log {\displaystyle {\prod}_{k=1}^{L-1}{P_{n_{k+1}}}_{n_k}}, $$\end{document}L−log∏k=1L−1Pnk+1nk,

where Pji denotes the conditional probability that an individual that left i moves to j. (Note that ∑jPji = 1). Assuming that the number of passengers is identical among all international flights, the transition matrix is calculated as \documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {P}_{ji}=\frac{m_{ji}}{{\displaystyle {\sum}_k}{m}_{ki}} $$\end{document}Pji=mji∑kmki, where mki is the number of direct flights from i to k per unit time derived from open source data. Finally, the effective distance mj of a country j from Saudi Arabia is calculated as the minimum of the effective lengths of all paths that go from Saudi Arabia to the country j. The effective distance, as calculated from the abovementioned process, has been known to exhibit strong linear correlation with the arrival time of SARS and H1N1-2009 across the world.

Assuming that the effective distance is a critical indicator of the risk of disease spread, the simplest model 1 that we examined was parameterized as5\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,t}=\frac{k}{m_j}, $$\end{document}λj,t=kmj,

where k is a constant. Namely, the hazard is an inverse function of the effective distance. As an alternative model, the information of Muslim majority countries is added, labeling corresponding countries at greater risk as compared with other countries, because countries sharing the religion with Saudi Arabia may be at greater risk of exposure to cases (e.g. through Hajj). As a consequence, a linear weight α is given on Muslim majority countries, i.e.,6\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,t}=\left\{\begin{array}{c}\hfill \frac{\upalpha k}{m_j}\kern1.5em \mathrm{if}\ j\ \mathrm{is}\ \mathrm{Muslim}\ \mathrm{country},\hfill \\ {}\hfill \frac{k}{m_j}\kern1.5em \mathrm{otherwise}.\kern5.25em \hfill \end{array}\right. $$\end{document}λj,t=αkmjifjisMuslimcountry,kmjotherwise.

In models 3 and 4, we additionally use the incidence data of MERS in Saudi Arabia over time. It is natural to assume that the daily risk of importation is proportional to the force of infection in Saudi Arabia, and thus, the incidence of MERS, i.e.,7\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,t}=\frac{k}{m_j}{I}_t, $$\end{document}λj,t=kmjIt,

as the model 3, where It is the incidence of MERS in Saudi Arabia on day t. Since the original incidence data were recorded weekly (while our model is written on the daily basis), the weekly incidence was transformed to the daily data assuming uniform distribution of incidences within each week. Model 4 incorporates both of abovementioned factors into the model, i.e.,8\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,t}=\left\{\begin{array}{c}\hfill \frac{\upalpha k{I}_t}{m_j}\kern1.5em \mathrm{if}\ j\ \mathrm{is}\ \mathrm{Muslim}\ \mathrm{country},\hfill \\ {}\hfill \frac{k{I}_t}{m_j}\kern1.5em \mathrm{otherwise}.\kern5.25em \hfill \end{array}\right. $$\end{document}λj,t=αkItmjifjisMuslimcountry,kItmjotherwise.

Statistical estimation and assessment

To estimate model parameters, a maximum likelihood method was employed. For the countries which have already imported MERS by 26 June 2015, we used the arrival time tj to fit the probability mass function of time at which the first importation event occurs, given by the product of \documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {\lambda}_{j,{t}_j} $$\end{document}λj,tj and \documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ {S}_{j,{t}_j} $$\end{document}Sj,tj. Countries that have not imported MERS cases were dealt with as the censored observation. The total likelihood was\documentclass[12pt]{minimal}

\usepackage{amsmath}

\usepackage{wasysym}

\usepackage{amsfonts}

\usepackage{amssymb}

\usepackage{amsbsy}

\usepackage{mathrsfs}

\usepackage{upgreek}

\setlength{\oddsidemargin}{-69pt}

\begin{document}$$ L\left(\boldsymbol{\uptheta}; {\mathbf{t}}_a\right)={\displaystyle {\prod}_{j\in I}{\lambda}_{j,{t}_j}{S}_{j,{t}_j}{\displaystyle {\prod}_{j\in U}{S}_{j,{t}_m},}} $$\end{document}Lθta=∏j∈Iλj,tjSj,tj∏j∈USj,tm,

where I is the set of index of countries which imported MERS at arrival time tj and U is the set of index of countries which are MERS-free by the date of analysis of 26 June 2015 (tm = 1039 days after MERS onset). Assuming that the dichotomous information of Muslim majority country was always available, the penalized likelihood is comparable between models 1 and 2 and also between models 3 and 4. We calculate the Akaike Information Criterion (AIC) for these comparisons.

Once the risk model was quantified, we assessed the diagnostic performance of our model in predicting the risk of importation by employing the receiver-operating characteristic (ROC) curve and measuring the Area under the curve (AUC). For each model, the optimal cut-off value of estimated risk was calculated in predicting the importation as on tm =1039 days using the Youden index, and sensitivity and specificity were estimated. In addition, a prediction of the risk of importation across countries for 3 years since the emergence (t = 1095) was computed for illustration.

Ethical considerations

The present study reanalyzed the publicly available data from ECDC and WHO the latter of which collected the notification data from member state countries which have obtained ethical approval and written consent from patients adhering to the International Health Regulations. The secondary data were de-identified by these organizations in advance of our access. As such, the datasets that we handled do not involve any patients’ data and the present study has been exempted from the ethical approval.

Results

Table 1 lists the countries and their corresponding timing of MERS case importations by 26 June 2015. In total, 24 countries have experienced at least one MERS laboratory-confirmed case importation, with the arrival time ranging from 8 to 1015 days, with a mean of 537 days and standard deviation of 271 days. Figure 1 shows the global distribution of the effective distance from Saudi Arabia. Middle East countries, United States and South Korea appeared to belong to a quartile with the shortest distance from Saudi Arabia.

Based on the arrival time information, all four models were quantified (Table 2). The models 1 and 3 yielded the best predictive power as measured by the AUC. In model 2, the hazard of Muslim majority countries was estimated to be 2.5 times (95 % confidence interval (CI): 1.1, 5.7) greater than that of other countries. Model fit was improved by including the information of Muslim majority countries (because of lower AIC compared to the similar model without including religion information), but AUC values of models with religion were smaller than models that did not incorporate religious majority data. Moreover, inclusion of the MERS incidence data in Saudi Arabia did not improve AUC values compared with models that do not incorporate the incidence data. Sensitivity of all models was 100 % (95 % CI: 88.3, 100.0), while specificity was 79.6 % (95 % CI: 74.0, 85.2) for models without religion and 69.2 % (95 % CI: 62.8, 75.5) for models with religion.

Figure 2 shows the predicted risks of MERS importation using two best models, i.e., the simplest model with the largest AUC value (model 1) and the most detailed model with good fit as informed by AIC (model 4). The distribution of predicted risk was skewed to the right, facilitated the visual identification of countries at high risk of experiencing MERS importation in the right tail (Fig. 2a and b). For models 1 and 4, optimal threshold probability to detect countries with importation was estimated at 11.2 and 6.9 %, respectively. When comparing the ROC curves of the two models, AUC of model 1 was greater than that of model 4, indicating that the diagnostic performance of model 4 was inferior at several cut-off values.

Figure 3 lists the 30 countries at the highest risk of importing MERS cases by 3 September 2015 (i.e., 3 years since the first detection) using the abovementioned two selected models. Among top-listed 30 countries at highest risk, 17 (56.7 %) have already imported at least one MERS case in model 1 and 12 (40.0 %) have experienced importation in model 4. The difference between two models is understood by comparing countries in panels A and B; panel B includes many Muslim majority countries that have not imported MERS cases. While inclusion of the religion has improved the overall goodness of fit (Table 2), AUC values and Fig. 2 demonstrate that the inclusion did not improve the diagnostic performance of the risk model.

Figure 4 shows the results of real-time predictions. There has been no particular time-dependent change in the predictive performance as measured by AUC, but the uncertainty bound (95 % confidence intervals) of AUC has been reduced as a function of calendar time.

Discussion

The present study devised a novel prediction model to quantify the risk of experiencing a MERS case importation for countries around the world by using data on the arrival time of MERS cases, the global airline transportation network, religion, and the incidence data in Saudi Arabia. We have shown that the proposed model can calculate the country-specific probability of importing MERS cases, using the effective distance with or without accounting for information on religion to classify Muslim majority countries and MERS incidence data. The model is fairly simple compared with more sophisticated simulation approaches [14, 17, 18]. However, all of the analyzed models yielded the sensitivity value of 100 % using cut-off value informed by the Youden index. The simplest model (model 1) with effective distance alone exhibited good predictive performance. The data of Muslim majority countries improved goodness-of-fit and the corresponding countries appeared to have been significantly at an elevated risk of importation, but the predictive performance of models 2 and 4 was lowered compared with models 1 and 3, perhaps elevating the risk of several Muslim majority countries that might actually be very distant from Saudi Arabia. Interestingly, incorporating MERS incidence data into models did not improve their diagnostic performance, although the incidence was added to some of our models to increase the precision of predicted probability of importation.

An important message of the present study is that it is feasible to generate estimates of the risk of importation of MERS using a simplistic and tractable approach using the so-called effective distance as part of the hazard function. As the International Health Regulations (IHR) have never recommended the application of any travel or trade restrictions and screening at points of entry [32–34], countries at risk have had to confront the uncertainty associated with the risk of experiencing a case importation, and we believe that the devised model could facilitate prompt risk assessment practice across countries without the need for complex computational approaches. Especially, since R0 for MERS is below 1, which is unable to generate large-scale epidemics in the community, with the majority of reported cases likely arising from exposure with an animal reservoir (i.e. dromedary camels) in Middle East countries, our relatively simple approach may act as a good alternative of models that rest on rich assumptions of the transmission dynamics that require more computational resources. The effective distance fully rests on the airline transportation network data and can offer a simplistic approach to this task of prediction. As a consequence, the model 1 correctly included 17 countries (56.6 %) within the top 30 countries at highest risk of importations and that accounts for 70.8 % of the total 24 countries that have already documented MERS case importations. Our model showed fairly good predictability despite its simplicity. The only weakness that has to be noted was the moderate value of specificity, i.e., the proportion true negatives among the total of condition negatives. Identifying one or more predictor variables that could help further discriminate between true positives and false positives could be the subject of future work. As was shown in the relative hazard estimate of Muslim majority countries, one can also examine if a specific group of countries is at significantly high risk of importation compared with other countries.

Using the proposed model, public health authorities across countries could determine where their own country stands in terms of risk relative to other countries. While the present study used the date of analysis (26 June 2015) and 3 years since emergence (3 September 2015) as the dates of prediction, varying the time interval of prediction can be easily attained. Considering that only 24 among 225 countries (10.7 %) have experienced at least one MERS importation, the risk estimate should be updated as further countries may report MERS importations in the future. As the absolute risk estimates are variable by model parameters (as seen in Fig. 2), the risk assessment practice may focus on relative risks and model performance. Risk estimates for individual countries are available upon request.

Due to the simplicity of our approach, several limitations should be explicitly mentioned. First, the probabilistic model essentially assumes random sampling of infected travelers from the population of Saudi Arabia, regardless of nationality and camel contact behavior. While the heterogeneous factors at individual level are highly influential in determining the risk of importation, all mathematical models have so far not attempted to characterize differences between traveler and non-traveler populations. Second, our airline transportation data was based on the country-to-country number of flights and not the itinerary of individual travelers. Therefore, the precision of the effective distance value may also be limited. Nevertheless, the present study gave more weight on the importance of devising simple yet tractable models with the use of publicly available data. Third, to illustrate our approach, we always used Saudi Arabia as the country of origin, but it is worth mentioning that several importation events were associated with surrounding Middle East countries including Qatar, United Arab Emirates, Jordan, Bahrain and Oman. Fourth, the model was country-specific, while the risk of infection is highly heterogeneous within each country [35–37]. Further refinements should be considered in the future.

Despite the number of limitations noted above, it should be stressed out that the proposed model is reproducible by a number of countries, properly detecting all countries that have already imported MERS. We have demonstrated that it is feasible to develop the risk model that predicts specific countries to experience importation with high sensitivity, and the task was achieved by employing an attractive metric, the effective distance, embedded on the hazard model. We are committed to update the model, incorporating additional useful predictors and helping countries across the world to improve on their risk assessment toolkit [38–42].

Conclusions

We devised a novel statistical model that predicts the risk of importation of MERS case in each country. We collected the arrival time of MERS to each country, dealing with it as a dependent variable to predict. Openly accessible data including the airline transportation network data are used to parameterize a hazard-based risk prediction model. The hazard was assumed as an inverse function of the effective distance calculated from the airline transportation data, and religion and the incidence data of MERS in Saudi Arabia were supplemented to improve the model prediction.

Among the examined models, a model that uses the effective distance only was shown to yield the best predictive performance. Model fit was improved by including country-specific religion (i.e. Muslim majority country), but predictive performance measured by AUC was not improved by accounting for the religion. Our relatively simple statistical model based on the effective distance was found to help predicting the risk of importing MERS at the country level.

Abbreviations

AIC, Akaike Information Criterion; AUC, Area under the curve; CI, Confidence interval; ECDC, European Centers for Disease Prevention and Control; MERS, Middle East respiratory syndrome; MERS-CoV, Middle East respiratory syndrome associated coronavirus; SARS, Severe acute respiratory syndrome; WHO, World Health Organization