Disease searches on Wikipedia could aid in forecasts of public health threats

The Los Angeles Times

Can public health experts tell that an infectious disease outbreak is imminent simply by looking at what people are searching for on Wikipedia? Yes, at least in some cases.

Researchers from Los Alamos National Laboratory were able to make extremely accurate forecasts about the spread of dengue fever in Brazil and flu in the United States, Japan, Poland and Thailand by examining three years’ worth of Wikipedia search data. They also came up with moderately successful predictions of tuberculosis outbreaks in Thailand and China, and of dengue fever’s spread in Thailand.

However, their efforts to anticipate cases of cholera, Ebola, HIV and plague by extrapolating from search data left much to be desired, according to a report published Thursday in the journal PLOS Computational Biology. Even so, the researchers believe their general approach could work if they use more sophisticated statistics and a more inclusive data set.

Accurate data on the spread of infectious diseases can be culled from a variety of sources. Government agencies typically get it from patient interviews and laboratory test results. Other data sources include calls to 911 lines, emergency room admissions and absences from work or school.

The problem with these methods is that they can be time-consuming and costly. By the time the numbers are crunched, an outbreak may be in full swing.

If you want to stop an outbreak before it starts — and if you want to save lives and money, you certainly do — what you need is a forecast that is accurate and timely. And so the Los Alamos researchers turned to the treasure trove that is Wikipedia.

In addition to hosting about 30 million articles on topics ranging from quantum foam to the First English Civil War to Kim Kardashian, Wikipedia collects data on the approximately 850 million search requests it gets each day. In previous studies, researchers have used this publicly available data to predict ticket sales for new movies and the movement of stock prices.
