Modeling dengue vector population with earth observation data and a generalized linear model
Journal contribution posted on 08.01.2021, 22:11 by Oladimeji Mudele, Alejandro Frery, Lucas FR Zanandrez, Alvaro E Eiras, Paolo Gamba
Mosquitoes propagate many human diseases, some of them widespread and without vaccines. The Ae. aegypti mosquito vector transmits Zika, Chikungunya, and Dengue viruses. Effective public health interventions to control the spread of these diseases and protect the population require models that explain the core environmental drivers of the vector population. Field campaigns are expensive, and data from meteorological sites that feed models with the required environmental data often lack detail. We therefore explore temporal modeling of the Ae. aegypti population as a function of environmental conditions (temperature, moisture, precipitation, and vegetation) that have been shown to have significant effects on it. We use earth observation (EO) data to estimate these biotic and abiotic environmental variables through proxy features, namely the normalized difference vegetation index (NDVI), the normalized difference water index (NDWI), precipitation, and land surface temperature (LST). Our response variable is the mosquito population measured weekly in the field with 791 mosquito traps in Vila Velha city, Brazil, for 36 weeks in 2017 and 40 weeks in 2018. Recent similar studies have used machine learning (ML) techniques for this task. However, these techniques are neither intuitive nor explainable from an operational point of view. We therefore use a Generalized Linear Model (GLM) to model this relationship because it suits count response variables, is interpretable, and allows confidence intervals to be visualized for all inferences. To improve the model, we use the Akaike Information Criterion (AIC) to select the most informative environmental features. Finally, we show how weighting the GLM improves model quality. The resulting weighted GLM compares well in quality with two ML techniques, Random Forest and Support Vector Machines.
These results advance qualitative and explainable epidemiological risk modeling in urban environments.
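
The workflow summarized above (a count-response GLM, AIC-based feature selection, and observation weighting) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes a Poisson family with a log link, which the abstract does not specify. The weekly data are synthetic and the feature names `ndvi`, `ndwi`, `precipitation`, and `lst` are placeholders for the four EO proxies. The weights are arbitrary illustrative values, since the abstract does not describe the weighting scheme.

```python
import itertools
import math
import random

FEATURES = ["ndvi", "ndwi", "precipitation", "lst"]  # placeholder names


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_poisson_glm(X, y, weights=None, max_iter=50, tol=1e-10):
    """Fit a log-link Poisson GLM by Newton's method; return (coefs, AIC).

    Column 0 of X must be the intercept. With non-uniform weights the AIC is
    based on a weighted log-likelihood, so it is only comparable between
    models fitted with the same weights.
    """
    n, p = len(X), len(X[0])
    w = weights if weights is not None else [1.0] * n
    # Start from the intercept-only solution: log of the weighted mean count.
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    beta = [math.log(mean_y)] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        grad = [sum(w[i] * (y[i] - mu[i]) * X[i][j] for i in range(n))
                for j in range(p)]
        hess = [[sum(w[i] * mu[i] * X[i][j] * X[i][k] for i in range(n))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = sum(w[i] * (y[i] * eta[i] - math.exp(eta[i]) - math.lgamma(y[i] + 1))
                 for i in range(n))
    return beta, 2 * p - 2 * loglik


def poisson_sample(lam, rng):
    """Draw one Poisson variate (Knuth's method, fine for moderate lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


# Synthetic weekly data: standardized features; counts depend only on
# land surface temperature and precipitation (an assumption of this demo).
rng = random.Random(0)
n_weeks = 76
data = {name: [rng.gauss(0, 1) for _ in range(n_weeks)] for name in FEATURES}
counts = [poisson_sample(math.exp(2.0 + 0.5 * data["lst"][i]
                                  + 0.3 * data["precipitation"][i]), rng)
          for i in range(n_weeks)]


def design(names):
    """Build a design matrix with an intercept column followed by `names`."""
    return [[1.0] + [data[f][i] for f in names] for i in range(n_weeks)]


# Exhaustive AIC-based selection over all 16 feature subsets.
results = {}
for k in range(len(FEATURES) + 1):
    for subset in itertools.combinations(FEATURES, k):
        results[subset] = fit_poisson_glm(design(subset), counts)
best_features = min(results, key=lambda s: results[s][1])
best_coefs, best_aic = results[best_features]

# Weighted refit with arbitrary illustrative weights (hypothetical scheme).
weights = [1.0] * 36 + [2.0] * 40
weighted_coefs, _ = fit_poisson_glm(design(best_features), counts, weights)

print("selected features:", best_features)
print("coefficients:", [round(b, 3) for b in best_coefs])
print("weighted coefficients:", [round(b, 3) for b in weighted_coefs])
```

Because the model is linear on the log scale, each fitted coefficient reads directly as the change in log expected count per unit change in the corresponding feature, which is the kind of interpretability the abstract contrasts with ML techniques. In practice a statistics library would normally replace the hand-written fitting routine.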