Searching for Outbreaks
Public Health In A Digital Age
SOURCE: Google Flu Trends
Digital technologies are changing the world of public health, and officials are just now exploring the best ways to incorporate these new tools into older systems of disease detection and medical research. Looking ahead, the nationwide switch to digital health records has enormous implications for public health, but not just for the reasons most people are talking about.
Over the past 60 years, the Centers for Disease Control and Prevention has developed some of the most reliable systems for tracking disease outbreaks in the world. In addition to labs and other sources, the CDC relies on an extensive network of 2,500 participating doctors around the country to provide information to its headquarters in Atlanta, Georgia. Until the mid-1990s, this information was mailed in on postcards.
Since then, of course, the digital age has revolutionized the way medical information is collected, analyzed, and shared. Today, data is submitted to the CDC electronically, compiled in databases and published on the Internet. Medical information is so widely accessible on the web that a majority of Americans have now become active participants in their own medical care. And that’s all just the beginning.
Using the Internet to Predict Flu Outbreaks
Months before the recent outbreak of H1N1 influenza (“swine flu”), developers at Google revealed that they had created an ingenious method to predict flu outbreaks by tracking the frequency of search keywords. As it turns out, an increased frequency of searches for a family of words like “flu symptoms” and “runny nose” in a particular region is a reliable indicator that the flu is spreading there. Brilliant.
When this information is graphed by region and compared to data compiled by the CDC, it doesn’t just correlate; it matches almost exactly (see above). The difference is that the method the CDC uses to compile data from clinics and physicians takes much longer. Once people start to feel sick, it takes time to make an appointment to see the doctor, and for a physician to then report that information to the CDC. Using search data reveals trends in real time, allowing epidemiologists to identify flu outbreaks two to six weeks faster than by using any other method.
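To make the idea of a "lead time" concrete, here is a minimal sketch in Python of how one might check whether one weekly series runs ahead of another. It is not Google's or the CDC's actual method. The function names (`pearson`, `best_lead`) and the numbers are invented for illustration, and the sample data is deliberately built so that the search series leads by two weeks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_lead(search, reported, max_lead):
    """Return (lead_in_weeks, correlation) for the shift that fits best.

    A lead of k compares search activity in week t with clinic
    reports in week t + k.
    """
    best = (0, pearson(search, reported))
    for lead in range(1, max_lead + 1):
        r = pearson(search[:-lead], reported[lead:])
        if r > best[1]:
            best = (lead, r)
    return best


# Invented weekly values, NOT real Google or CDC data.
search_share = [1, 2, 4, 8, 6, 3, 2, 1, 1, 1]
# Same curve, arriving two weeks later (hypothetical clinic reports).
clinic_reports = [1, 1, 1, 2, 4, 8, 6, 3, 2, 1]

lead, r = best_lead(search_share, clinic_reports, max_lead=4)
print(f"search data leads by {lead} week(s), correlation {r:.2f}")
```

With real data the match would be noisier, and a strong correlation alone would not show that searches reflect actual illness rather than, say, news coverage.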
Public health organizations like the CDC and the World Health Organization now use Google Flu Trends, where this information is publicly reported, as an additional tool for disease surveillance here in the United States and around the world. “Information like this can be extremely helpful in early detection,” according to Lynnette Brammer, an epidemiologist in the Influenza Division of the CDC. “We’ll look at Google Flu Trends particularly at the beginning of an outbreak to give us an indication or heads up.”
For public health officials this additional lead-time is incredibly useful, Brammer adds. Hospitals can be notified for “surge planning,” allowing them to be adequately staffed and stocked with medical supplies. This ability to plan ahead for surge capacity is particularly important for hospitals and medical centers aiming to spend limited dollars efficiently in an economic downturn. Early detection also helps tremendously in alerting the public and isolating outbreaks as much as possible.
But if this search data can be used to predict flu outbreaks, what else can it tell us? Actually, all kinds of things. “We’re just at the beginning of incredible new insights,” says R.J. Pittman, Director of Project Management at Google. “It’s a wide open field.” Using this data to predict flu outbreaks may be just the tip of the iceberg.
When we type a keyword or phrase into Google, we get a list of relevant links with the information we’re looking for. But what most of us don’t realize is that this entire time Google has been recording each of these search words along with the IP addresses of the computers doing the search, forming a colossal cache of data, a byproduct of Internet searches. Google uses this highly confidential information primarily to improve its search engine, but other valuable uses are just now being discovered.
Google Flu Trends developed out of Google Trends, launched in 2006, which allows anyone to search for the frequency of keywords and phrases entered into Google since the beginning of 2004. Pittman, who oversees new development projects at Google Labs where Google Trends was designed, explains, “It was born out of engineering tinkering in the labs. There’s a lot of rapid prototyping.” But it was only when programmers built a simple user interface and a data visualization tool to create graphs of the frequency of the searches over time that they started to understand what they had uncovered. “That’s what led to some ‘Aha!’ moments,” says Pittman.
People can now see what other people in the United States and around the world are searching for. Pittman describes it as “Internet searches turned inside out. We give back to the users the ability to search the searches.” In August 2008, without fanfare, Google pulled the curtain back further on its archived search data with the launch of Google Insights for Search, a much more sophisticated application targeted at a marketing audience, though available free for anyone to use. Now people can mine this data with much more precision and see much more detailed results, broken down by region on a shaded heat map. Only with this improvement did it become possible to use the technology to accurately track flu outbreaks.
For anyone worried about privacy infringement, Google posts its privacy policy on the Google Insights page and takes care to point out that this data is anonymized and aggregated. The company won’t reveal the search history from any individual IP address, though that data is archived for nine months before being anonymized.
What makes Google search data so valuable is that so many of us search online before we do something, as David Leonhardt described in “The Internet Knows What You’ll Do Next” in the New York Times. We often search online before we buy a car, apply for college, watch movies, buy stocks, vote, go on vacation, or, significantly in this case, before we go to the doctor. Search data is most valuable for things like predicting flu outbreaks because there’s typically a longer gap of time between search indicators and any other method of detection.
But while Internet search data may provide the earliest indicators of an outbreak, there are serious limitations. For one, it’s heavily influenced by media reports. When reporters began writing stories about “swine flu,” Internet searches spiked, first in Latin America, then in the United States and around the world. Even Internet searches for specific symptoms typically spike as people rush to learn more about the disease, whether or not they actually have any symptoms. And other infections, like HIV, don’t have an associated family of symptom keywords that can be easily used to detect an outbreak. In addition, Internet search data doesn’t provide any additional information about the severity of the disease (how drug-resistant or fatal it is) or specific demographics involved.
“It’s sort of like a big puzzle, and of course, you want to have all the pieces,” says Brammer. So while Internet search data provides a new tool that helps with early detection, “It’s not going to replace the traditional methods. One system alone is not going to do the job,” she explains. Currently this data is used alongside detailed virologic data from the labs, outpatient surveys, and mortality reports to give public health officials the most complete picture of what’s really going on.
Mapping the Hot Zones
Other new digital technologies are also helping public health officials identify outbreaks around the world. Unlike Google Flu Trends, which uses Internet search data as an indicator, the HealthMap project collects and aggregates information from news reports, blogs, Twitter feeds, and mailing lists. Public health officials used to monitor news reports manually, collecting clippings and trying to detect outbreaks. Now, a web crawler searches the Internet every hour and collects this data on a global map with a heat index that indicates the most current updates on emerging health threats.
“HealthMap collects information from the news media, which are typically among the fastest to report this data,” says Clark Freifeld, who developed HealthMap with John Brownstein in MIT’s New Media Medicine Department. It currently searches the web in six languages (English, Spanish, French, Portuguese, Russian, and Chinese) and there are plans to include more soon. Additionally, people can fill in any gaps by adding links to news feeds or sources HealthMap’s web crawler may have missed. As with Google Flu Trends, the main purpose of HealthMap is to provide early warning information.
“We’ve gotten a lot of interest and excitement from CDC and WHO. They use it a lot, and are some of the most frequent visitors to HealthMap,” Freifeld explains. But local public health officials might find it useful as well. A local outbreak of chicken pox would not be of interest to the WHO, but it would be very interesting for nearby health clinics or school districts, allowing them to cancel school activities and advise parents as soon as possible. These kinds of local reports are easy to miss, especially when everyone is being barraged with information. HealthMap allows users to customize their results by disease and by region, so “Instead of wading through a vast stream of data, it filters out the information of interest to a particular region,” Freifeld says.
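The kind of filtering Freifeld describes can be sketched in a few lines of Python. This is an illustration only, not HealthMap's code. The `Report` structure, the function names, and the sample reports are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One collected item. The field names are assumptions for this sketch."""
    disease: str
    region: str
    source: str


def filter_reports(reports, disease=None, region=None):
    """Keep only the reports matching the requested disease and/or region."""
    return [
        r for r in reports
        if (disease is None or r.disease == disease)
        and (region is None or r.region == region)
    ]


def counts_by_region(reports):
    """Count reports per region, a rough stand-in for a map's heat index."""
    return Counter(r.region for r in reports)


# Invented sample reports, not real HealthMap data.
reports = [
    Report("chicken pox", "Springfield", "local newspaper"),
    Report("influenza", "Springfield", "blog"),
    Report("influenza", "Mexico City", "news wire"),
    Report("influenza", "Mexico City", "mailing list"),
]

local_flu = filter_reports(reports, disease="influenza", region="Springfield")
print(len(local_flu), "local influenza report(s)")
print(counts_by_region(filter_reports(reports, disease="influenza")))
```

A school district in the hypothetical "Springfield" would see one local influenza report and one chicken pox report instead of the whole global stream.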
While these new developments offer powerful tools for public health officials, each one has its strengths and weaknesses. In this case, the information on HealthMap comes from different sources, many of which are not verified. The trick with many of these new digital technologies is to crosscheck them with other sources and incorporate them into existing systems. If used properly, they have vast potential to improve public health.
Can Digital Health Records Improve Public Health?
Another significant technological development on the horizon with public health implications is the nationwide switch to electronic health records, but not simply for the reasons most people are talking about.
The Obama administration is encouraging health care providers to change over to a system of electronic health records that promises to save money, improve patient care, and reduce medical errors by reducing paperwork and making records available to other doctors and specialists. The American Recovery and Reinvestment Act includes $19.5 billion for health information technology, intended to incentivize this transition to electronic health records. While advocates commonly cite the cost savings and quality improvements from EHRs, using these systems has the potential for other surprising benefits.
These benefits can be seen on three levels, according to Dr. David Blumenthal, National Coordinator for Health Information Technology (also known as Obama’s Health IT Czar) at the Department of Health and Human Services, who will oversee this transition nationally. “On an individual level, it makes doctors better doctors,” he says. For example, the EHR systems can include pre-formatted treatment programs for specific conditions to help doctors make sure each patient is receiving the best care. It can also help doctors identify necessary follow-up tests and avoid dangerous drug interactions.
On a “middle level,” EHR systems allow hospitals and clinics to optimize care for groups of patients, allowing them to target certain diseases. “Where these systems are already in use, rates of compliance for diabetes and preventive services have gone up dramatically,” Blumenthal says. As a result, these hospitals are seeing lower mortality rates and lower costs.
Such a system could also help select patients for clinical trials. Patients who choose to be considered for clinical trials would be able to easily submit their health records. Doctors organizing the studies would then much more easily be able to select patients who meet certain criteria.
But other extraordinary benefits are also possible on a larger “population level.” In the same way that collecting Internet search data has created an incredibly valuable source of information, electronic health records would create another unprecedented resource for public health officials and medical researchers. Information that doctors and clinics already share with public health officials can be collected much more quickly and efficiently. “Once this data is collected, it’s relatively easy to assemble into a research database that is very powerful,” Blumenthal says. “With electronic health records, you learn a lot more a lot faster from manmade and natural events.” These systems could, for example, allow doctors to monitor new drugs and vaccines after they’ve been released, and more quickly identify which ones prove dangerous. If these EHR systems are properly designed, additional insights and discoveries would be possible that would ultimately expand our medical knowledge and improve public health.
Currently, the CDC collects information from thousands of participating doctors around the country by asking them to fill out surveys about new cases they’ve seen of influenza-like illnesses, for example. While this system does work well, it’s inefficient and provides only limited information. Participating doctors are volunteers being asked to complete additional work. As a result, the CDC is careful not to overburden its sources and requests only as much information as is absolutely necessary. With EHRs, this information would be much easier to transmit and analyze.
With nationwide electronic health records, these limited sets of data could, in theory, also be augmented with anonymous, or what Blumenthal emphasizes is “de-identified,” information that patients choose to share from their records for clinical research purposes. This would allow researchers to identify patterns using basic demographic data, for example. There’s still a long way to go before EHRs are the norm, but it’s important to consider now the possibilities they would afford for protecting public health, and to build systems that will simultaneously reduce cost, improve care, and safeguard personal information.
To do this, developers, of course, must first address privacy concerns, making sure that patients’ confidential medical information is secure. “We’re going to try to communicate early and often with patients to assure them that no stone is being left unturned in protecting the privacy of this information,” Blumenthal explains.
The Health Insurance Portability and Accountability Act, or HIPAA, which established privacy regulations for medical information, places no restrictions on using de-identified data for research purposes. However, for electronic records to play a role in public health, the systems will have to include a mechanism to securely de-identify these records. Currently, HHS is creating guidelines to help private and public health care providers de-identify electronic health records to protect patients’ confidential medical information.
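As a rough illustration of what de-identifying a record involves, the Python sketch below drops fields that name a patient directly and coarsens others that could help single someone out. It is a toy example, not the HHS guidelines and not a procedure that would satisfy HIPAA on its own. Every field name and the sample record are invented.

```python
# Fields that identify a person directly. This list is an assumption made
# for the example and is far shorter than any real rule set would be.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "record_number"}


def deidentify(record):
    """Return a copy of the record with identifying detail removed or coarsened."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the year of birth (dates are assumed to be "YYYY-MM-DD" strings).
    if "birth_date" in cleaned:
        cleaned["birth_year"] = cleaned.pop("birth_date")[:4]

    # Keep only the first three digits of the ZIP code.
    if "zip_code" in cleaned:
        cleaned["zip3"] = cleaned.pop("zip_code")[:3]

    return cleaned


# Invented patient record, not real data.
patient = {
    "name": "Jane Example",
    "street_address": "1 Hypothetical Way",
    "phone": "555-0100",
    "record_number": "A-0001",
    "birth_date": "1970-05-17",
    "zip_code": "30333",
    "diagnosis": "influenza-like illness",
}

shared = deidentify(patient)
print(shared)
```

The diagnosis and the coarse demographic fields survive, which is what would let researchers look for patterns without knowing whose record they are reading. Real de-identification also has to deal with combinations of fields that identify someone indirectly, which this sketch ignores.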
Unlike Google’s database of Internet search data, the electronic health records won’t be centralized in one enormous database. Furthermore, the records likely won’t all be in the same format; different hospitals, clinics and health care organizations will end up with different systems. Despite some misconceptions, EHR will not be a nationalized system and the government will not control records, though the goal is to set some uniform standards so individual records can easily be transferred between different EHR systems. This compatibility is critical so that when patients move to another state, see a specialist, or end up in an emergency room that uses a different system, authorized doctors will be able to pull up their health records.
Switching to electronic health records won’t happen overnight, and nationwide, it’s expected to take roughly seven years for most private and public health care providers to make the change. Thankfully, that gives people some time to examine different systems and choose ones that allow individual patients, doctors, hospitals and the public at large to take full advantage of these new technologies.
With more than $19 billion from the Recovery Act supporting the switch to electronic health records, the public should get as much return on its investment as possible. The cost savings they will enable in preventive care, coordination, and disease management are the first steps. Developing systems that allow researchers and public health officials to use de-identified health information promises to expand our medical knowledge and ultimately improve our public health, which should be the central goal of using all this fancy new technology anyway.
Bryce Hall is a Media Producer based in Washington, D.C.
Comments on this article



Wow! This is so interesting and exciting! I’m not in the scientific community, but I think that these advancements are a huge step forward for our health care system. It was nice to read an article that made it seem so understandable and pertinent to everyday life.
July 28th, 2009 at 9:39 am

Thanks!
In clear, understandable prose appealing both to the scientific and lay communities, Hall gives me the information I need to make more informed decisions as a health care consumer, citizen, and voter.
July 29th, 2009 at 1:17 pm

i think internet (and so google) guides our life.. we search on it and make decisions. what if given information is fake or wrong?
November 19th, 2009 at 8:12 am

Unfortunately, digital records will only facilitate health insurance carriers’ ability to deny coverage for pre-existing conditions. My share of my company’s health coverage got so expensive last year I had to switch to a bare bones policy. After 2 months it went up 18%. And I can’t make a medical claim on it for 2 years or risk them denying coverage as “pre-existing.” So, even though I pay nearly $400 a month for health insurance, I have to live with a painful frozen shoulder problem for the next two years. And if I do have to see a doctor, I’ll have to do it on a “John Doe” basis, paying out of my own pocket. Ironically, my ad agency does a lot of work for a major hospital and sports medicine center. While I work with these people every day and I earn nearly $60K/year, I can’t afford the services they offer. When the middle class cannot afford basic health care, the industry needs to take a cold, hard look at itself in the mirror. The next generation of workers will be even poorer than my generation. It only gets worse from here.
February 8th, 2010 at 1:05 am