menu ☰
menu ˟

Predicting the hand, foot, and mouth disease incidence using search engine query data and climate variables: an ecological study in Guangdong, China

07 Oct 2017


Hand, foot, and mouth disease (HFMD) has caused a substantial burden in China, especially in Guangdong Province. Based on the enhanced surveillance system, we aimed to explore whether the addition of temperate and search engine query data improves the risk prediction of HFMD.


Ecological study.

Setting and participants

Information on the confirmed cases of HFMD, climate parameters and search engine query logs was collected. A total of 1.36 million HFMD cases were identified from the surveillance system during 2011–2014. Analyses were conducted at aggregate level and no confidential information was involved.

Outcome measures

A seasonal autoregressive integrated moving average (ARIMA) model with external variables (ARIMAX) was used to predict the HFMD incidence from 2011 to 2014, taking into account temperature and search engine query data (Baidu Index, BDI). Statistics of goodness-of-fit and precision of prediction were used to compare models (1) based on surveillance data only, and with the addition of (2) temperature, (3) BDI, and (4) both temperature and BDI.


A high correlation between HFMD incidence and BDI (r=0.794, p<0.001) or temperature (r=0.657, p<0.001) was observed using both time series plot and correlation matrix. A linear effect of BDI (without lag) and non-linear effect of temperature (1 week lag) on HFMD incidence were found in a distributed lag non-linear model. Compared with the model based on surveillance data only, the ARIMAX model including BDI reached the best goodness-of-fit with an Akaike information criterion (AIC) value of –345.332, whereas the model including both BDI and temperature had the most accurate prediction in terms of the mean absolute percentage error (MAPE) of 101.745%.


An ARIMAX model incorporating search engine query data significantly improved the prediction of HFMD. Further studies are warranted to examine whether including search engine query data also improves the prediction of other infectious diseases in other settings.

Click here to view the full article which appeared in BMJ Open