Augmenting official flu reports from the Centers for Disease Control and Prevention (CDC) with data harvested from the internet is another step in our online evolution. According to a 2012 Pew Research Center study, about 184 million Americans (more than half the nation’s residents) use the Web to find health-related information. These searches are like tips to a crime hotline, enabling researchers to identify suspected flu cases.

In 2006, Gunther Eysenbach, associate professor of public health at the University of Toronto, found that searches for the terms “flu” or “flu symptoms” spiked a week before a jump in doctor visits. “The internet has made measurable what was previously immeasurable,” he wrote, christening the new field “infodemiology.” In 2008, Google rolled out Flu Trends, harnessing its own big data to look for worldwide flu surges and hot spots through symptom searches in 29 countries.

Google scrapped the program in 2014 because of at least one factor that researchers hadn’t counted on. Your search history, it turns out, can be misleading. It’s impossible for data collectors to know whether you were looking up “headache and fever” for yourself, or because you heard your co-worker complaining about their kid’s symptoms. In 2007, Americans suddenly started Googling “cholera.” Had a new epidemic taken hold? Nope. Oprah Winfrey had just recommended Love in the Time of Cholera for her book club. “You should have seen what happened when Brad Pitt had viral meningitis,” says Lone Simonsen, professor of epidemiology at Roskilde University.

After culling search data from public resources, researchers run them through complex algorithms. These formulas reveal patterns that investigators can then compare with whatever the CDC or other health agencies report about the sickness. If a computer-generated prediction matches reality, we know the experts are onto something.

Search queries aren’t the only vein of data that researchers mine for flu clues.
Svitlana Volkova, a data scientist at the Pacific Northwest National Laboratory, looks for gems of information on Twitter. She recently verified a new deep-learning method that probes tweets for signs of the flu. In an analysis of more than 170 million tweets posted over three years, Volkova and her colleagues found their model could accurately produce three-day forecasts of flu-like illnesses at a local level. That’s much quicker than waiting for flu reports from the CDC, which lag up to two weeks behind what’s happening in the world. (Facebook says it’s not in the flu-predicting business, so for now, your sick emoji doesn’t serve a greater good.)

Social media adds more data for researchers to work with, but it still has limitations. Annoyingly, the image we present online doesn’t always match the mucus-plagued person we are at home. Michael Paul, an information scientist at the University of Colorado at Boulder, recently found that people rarely tweet about their flu-like symptoms. In fact, he and his colleagues found that people tweet less when they’re ill. So the next time your favorite Twitter personality seems oddly quiet, it could be because they’re sick of Twitter, but it might just be that they’re sick. Paul also investigated Instagram and found that acute illness is the least-common health topic for photo posting. Not surprisingly, flu-ridden people don’t love taking selfies.

Disease detectives, including Simonsen, hope that electronic health records could augment data from our tweets and posts. Insurance-claim forms, which list ailments and how they were treated, are particularly crucial. But people are typically reluctant to share private health data with researchers. Epidemiologists would like to calm those privacy worries. They want only the numbers, never the names. But the final call ultimately lies with individuals.
The public, Simonsen says, must weigh the balance: “Privacy on one side and the need to know more on the other.” That deliberation is even more pertinent since the EU implemented the General Data Protection Regulation this year, giving people more say in how their information is used.

Adding information from an app used to log health status, just as we do with fitness trackers or diet programs, could make big data-based flu forecasts even more accurate, Simonsen says. And private companies might come around: UNICEF is working with several, including IBM, to gather data in order to improve responses to global illnesses.

Ultimately, the potential for big data to predict the next flu pandemic might depend on all of us around the globe oversharing our illnesses. The more we tweet about our #flu symptoms, the more data we generate. The more we allow companies to share that data with researchers, the more accurate they can make their predictions. And all that sharing, Volkova says, “will help the world.”

This article was originally published in the Winter 2018 Danger issue of Popular Science.