Mosquito control: targeting breeding sites using street view images
In addition to personal protection against mosquito bites and the chemical control of mosquitoes and their larvae, the removal of common breeding sites is one of the most important and effective steps in controlling mosquitoes and the diseases they transmit. Recent research has sought to use geotagged images obtained through Google Street View to map the most common types of open containers, in order to speed up the detection of these containers and to provide a decision support tool. In this Infectious Thoughts interview, we speak to Prof. Peter Haddawy of Mahidol University's Faculty of ICT and the University of Bremen's Spatial Cognition Center about the benefits and cost-effectiveness of developing such tools as part of a broader vector- and disease-control effort, the potential to develop the model by improving the integration of data streams, and the applications of this approach to other diseases or issues where the environment plays an important role.
What have been some of the main advantages of using Google Street View images to detect likely mosquito breeding sites?
Using this approach, we are able to provide information on outdoor container counts on a scale not possible through manual surveys and at very low cost. There is a minimal cost for obtaining the Google Street View images. For the processing, a high-end PC with a fast graphics card and a good amount of storage is the only requirement.
Your approach relies on the detection of eight of the most common containers by convolutional neural network transfer learning - how does this approach ensure that containers are accurately identified and classified?
We specifically evaluated the detection accuracy in our study. This is the first thing you need to look at because if the detection accuracy is not good, you can stop right there. For the binary problem of determining whether an image contains a breeding site or not, the F-score is 0.91. For the problem of detecting and then classifying into one of the eight container types, the F-scores range from a low of 0.37 for bins to a high of 0.92 for old tires. Other than bins, the F-scores are all above 0.8. The value for bins is a bit low because they get confused with buckets.
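For readers unfamiliar with the metric, the following is a minimal sketch of how a per-category F-score can be computed from detection counts. It assumes the F-score quoted above is the standard F1 (the harmonic mean of precision and recall); the function name and all counts are hypothetical and are not taken from the study.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp: correct detections, fp: spurious detections, fn: missed objects.
    """
    if tp == 0:
        return 0.0  # no correct detections, so the score is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts for two container types.
counts = {"tire": (90, 8, 10), "bin": (30, 45, 55)}
scores = {name: f_score(*c) for name, c in counts.items()}

for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

A category that is often confused with another one, as described for bins and buckets, accumulates both false positives and false negatives, which pulls its F-score down on both sides.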
How easily could this approach be scaled up to different countries or continents?
Scaling up to larger regions is partly a matter of computation, which scales linearly with the number of images. The more significant issue is deciding on the types of containers to detect. The prevalence and importance of different types of containers as potential breeding sites varies from country to country and even among different regions in a country. So one would need to first identify which are the most important containers in the area to be studied. If one is lucky, the containers are already in the COCO dataset. If not, one would need to train the neural network to recognize these containers. This involves collecting and labeling enough images of the containers. We were able to use only about 500 training examples per container category because of the use of transfer learning on the network originally trained on the COCO dataset. I would expect this to be the case for a large range of container types.
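The linear scaling mentioned above can be illustrated with a back-of-the-envelope estimate. This is only a sketch: the function name and the throughput figure are assumptions chosen for illustration, not measurements from the study.

```python
def estimated_hours(n_images: int, images_per_second: float) -> float:
    """Estimate processing time, assuming a fixed per-image cost."""
    if images_per_second <= 0:
        raise ValueError("images_per_second must be positive")
    return n_images / images_per_second / 3600


# Assumed throughput of 10 images per second on a single GPU (hypothetical).
small_region = estimated_hours(360_000, 10)
large_region = estimated_hours(720_000, 10)

print(f"360,000 images: {small_region:.1f} h")
print(f"720,000 images: {large_region:.1f} h")
```

Under this assumption, doubling the number of images doubles the processing time, so covering a larger region is a matter of adding time or hardware rather than changing the method.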
The container counts from the object recognition algorithm in your research are well aligned with the counts from manual surveys - what are some of the remaining gaps in your approach?
It was very difficult to find the data to validate our approach. It would be nice if we had more manual count data to compare to, as well as more data from manual larval surveys. Beyond this, we are currently working on using the container counts to predict prevalence of dengue by combining them with other relevant factors. We are also looking at how the counts can be combined with other data in order to predict vector counts. The pipeline we have developed can be applied to any geotagged images. So we would like to see how well the approach would work with drone images, as well as images from other street view apps like Mapillary and OpenStreetCam.
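Because the pipeline only needs a location and a detected label per object, the aggregation step is independent of the image source. The sketch below shows one way to turn geotagged detections into per-area container counts. It is an illustration under stated assumptions: the square latitude/longitude grid, the cell size, the function names, and the sample detections are all hypothetical, and the study itself reports counts for administrative areas such as sub-districts.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat: float, lon: float, cell_deg: float = 0.01) -> tuple[int, int]:
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_by_cell(detections, cell_deg: float = 0.01):
    """Tally detected container labels per grid cell.

    detections: iterable of (lat, lon, label) tuples from any image source.
    """
    counts = defaultdict(Counter)
    for lat, lon, label in detections:
        counts[cell_of(lat, lon, cell_deg)][label] += 1
    return counts


# Hypothetical detections; the coordinates and labels are made up.
detections = [
    (13.7563, 100.5018, "tire"),
    (13.7569, 100.5011, "bucket"),
    (13.7561, 100.5015, "tire"),
    (13.7701, 100.5102, "bin"),
]
counts = count_by_cell(detections)

for cell, tally in sorted(counts.items()):
    print(cell, dict(tally))
```

The same function would accept detections from drone imagery or another street-level source, as long as each detection carries a coordinate.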
Are there specific technologies or partnerships which you would like to see developed to further advance your work?
Partnerships that would give us access to more recent geotagged images would be very helpful, as would access to more data on manual container and larval counts. We believe that our approach could be of great value in helping direct dengue control efforts. This would require developing a capability to integrate data streams in real time as well as developing a robust decision support dashboard. For this, a suitable industrial partner would be ideal.
How easily could this work be integrated in outbreak prediction models?
As I mentioned above, we are currently working on this and starting to obtain some encouraging results. Specifically, we are working on quantifying the added value of the container counts in predicting dengue prevalence.
Overall, what could be some of the limits of this approach? Would you envisage any issues such as data privacy in future?
Coverage of the images is one issue. Google Street View images cover regions along roads and have better coverage of urban areas and larger roads. This means we don’t cover private areas, empty lots, and indoor areas. Despite this limitation, our results show that the approach does provide useful information. Another limitation is the freshness of the data. The Google Street View images are 2-3 years old. This is ideal for aligning with manual survey data for evaluation of the approach. It is likely that container counts in a region such as a sub-district don’t change much from year to year, but this should be empirically evaluated. It would, of course, also be useful to have access to up-to-date data, such as drone images.
I don’t see privacy as an issue with the street view data but it could be with the use of drone data.
Would you envisage the development of this approach to entirely different sectors or research?
Yes, certainly. High-resolution imagery like Google Street View provides a wealth of information about the environment. So it could be used to provide information about the risk of diseases in which the environment plays an important role, whether infectious or not. In fact, we currently have a project in which we are tracking people’s movement and combining this with information about the risk of the regions they move through in order to quantify individuals' risk of exposure to malaria. Our approach could also be used for applications outside of health, like poverty mapping. The inspiration for this current work actually came when I heard my colleague Johannes Schöning give a talk on analyzing data from Google Street View to create maps so that autonomous vehicles can find scenic routes for people.
For further information:
Large scale detailed mapping of dengue vector breeding sites using street view images
Peter Haddawy, Poom Wettayakorn, Boonpakorn Nonthaleerak, Myat Su Yin, Anuwat Wiratsudakul, Johannes Schöning, Yongjua Laosiritaworn, Klestia Balla, Sirinut Euaungkanakul, Papichaya Quengdaeng, Kittipop Choknitipakin, Siripong Traivijitkhun, Benyarut Erawan, Thansuda Kraisang
Published: July 29, 2019
Professor Haddawy received a BA in Mathematics from Pomona College in 1981 and MSc and PhD degrees in Computer Science from the University of Illinois-Urbana in 1986 and 1991, respectively. He was a tenured Associate Professor in the Department of Electrical Engineering and Computer Science at the University of Wisconsin-Milwaukee, and Director of the Decision Systems and Artificial Intelligence Laboratory there through 2002. Subsequently, he served as Professor of Computer Science and Information Management at the Asian Institute of Technology (AIT) through 2010 and as Vice President for Academic Affairs from 2005 to 2010. He served in the United Nations as Director of UNU-IIST from 2010 through 2013. Professor Haddawy has been a Fulbright Fellow, Hanse-Wissenschaftskolleg Fellow, Avery Brundage Scholar, and Shell Oil Company Fellow. His research falls broadly in the areas of Artificial Intelligence, Medical Informatics, and Scientometrics, and he has published over 130 refereed papers, with his work widely cited. His research in Artificial Intelligence has concentrated on the use of decision-theoretic principles to build intelligent systems, and he has conducted seminal work in the areas of decision-theoretic planning and probability logic. His current work focuses on intelligent medical training systems and the application of AI techniques to the modeling of vector-borne disease. In the area of Scientometrics, Prof. Haddawy has focused on the development of novel analytical techniques motivated by and applied to practical policy issues. He currently holds a professorship in the Faculty of ICT at Mahidol University in Thailand, where he is Director of the Mahidol-Bremen Medical Informatics Research Unit and Deputy Dean for Research. He also holds an Honorary Professorship at the University of Bremen in Germany.