Social Monitoring for Public Health. Michael J. Paul
working successfully in policy development or assurance. Rather, we find plenty of interesting problems in assessment, and we’re sure you will too!
2.1.1 PUBLIC HEALTH SURVEILLANCE
Public health surveillance concerns the “continuous, systematic collection, analysis and interpretation of health data.”4 This includes monitoring for existing identified health concerns as well as discovering new issues. You may also hear the term syndromic surveillance, which is surveillance of a specific syndrome (a set of related symptoms).
Consider infectious disease surveillance, which is one of the largest and most widespread examples of public health surveillance. The United States has a fairly robust national surveillance system for infectious diseases. Perhaps the largest surveillance system is FluView,5 the Centers for Disease Control and Prevention’s (CDC) national influenza monitoring system. FluView encompasses several sources of data, including ILINet, a network of thousands of clinics throughout the United States that report weekly statistics on patients presenting with influenza-like illness. These reports, along with virology reports and other sources, make up a weekly CDC report that tracks the rate of influenza infection. A similar process is replicated at the state and local levels in many jurisdictions, and many U.S. states produce regular flu reports. Because influenza surveillance is a popular application of social media data, we’ll discuss it in detail in Section 5.1.1.
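As a rough illustration of how weekly clinic reports like those in ILINet could be rolled up into a single rate, the sketch below sums visits across clinics and reports the percentage attributed to influenza-like illness for each week. The record layout, the function name `weekly_ili_percent`, and the numbers are assumptions made for this example. It is a simplified, unweighted calculation, not a description of how the CDC actually produces its report.

```python
from collections import defaultdict


def weekly_ili_percent(reports):
    """Aggregate clinic reports into a weekly ILI percentage.

    `reports` is an iterable of (week, ili_visits, total_visits) tuples,
    one per clinic per week. This layout is hypothetical.
    """
    ili = defaultdict(int)
    total = defaultdict(int)
    for week, ili_visits, total_visits in reports:
        ili[week] += ili_visits
        total[week] += total_visits
    # Skip weeks with no recorded visits to avoid dividing by zero.
    return {
        week: 100.0 * ili[week] / total[week]
        for week in sorted(total)
        if total[week] > 0
    }


# Made-up reports from three clinic submissions over two weeks.
reports = [
    ("2015-W01", 12, 400),
    ("2015-W01", 8, 100),
    ("2015-W02", 30, 500),
]
rates = weekly_ili_percent(reports)
for week, pct in rates.items():
    print(f"{week}: {pct:.1f}% of visits were for influenza-like illness")
```

Because the counts are pooled before dividing, a large clinic contributes more to the weekly figure than a small one, which is one of several reasonable design choices for this kind of summary.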
National infectious disease surveillance extends to other notifiable infectious diseases, illnesses for which a physician is required to notify public health authorities of an infection. Examples include measles, Ebola, and dengue.6 Surveillance also extends to discovering new illnesses. This is how AIDS was originally identified: the CDC proactively investigated unexplained infections [Curran et al., 2011].
Surveillance goes well beyond infectious diseases. Surveillance can identify novel tobacco products [Ayers et al., 2011a, Stanfill et al., 2011] and adverse reactions to medications [Budnitz et al., 2006], both of which have been achieved using social media (see Sections 5.2.2 and 5.4.2, respectively).
2.2 SOURCES OF DATA
Public health depends on data about populations to support its goals. Traditionally, public health draws data from two main sources.
The first is surveys, in particular telephone surveys, which have long been the backbone of public health. Several large-scale surveys are run on a regular basis (typically annually) and provide a steady supply of public health data. Examples include the Behavioral Risk Factor Surveillance System (BRFSS) and the National Immunization Survey. BRFSS is run annually and collects detailed data from more than 300,000 Americans on a wide variety of public health topics, including access to medical care, mental health, exercise, and tobacco use. Some large-scale surveys rely on in-person interviews, such as the annual U.S. National Survey on Drug Use and Health (NSDUH). Beyond these repeated surveys, many researchers commission one-time telephone or in-person surveys. These can include focus groups, which provide more free-flowing sources of information on public beliefs and attitudes. Online surveys are also growing in popularity due to their low cost, though numerous quality challenges remain [Cook et al., 2000, Dredze et al., 2015, Eysenbach and Wyatt, 2002]. Finally, many private polling companies also conduct health-related phone surveys. For example, Gallup uses phone surveys to measure the well-being of Americans.7 A growing thread of work with social media data considers methods for enhancing or replacing traditional survey mechanisms [Benton et al., 2016b].
The second primary source of data comes from clinical encounters. The influenza surveillance network described above, ILINet, is the largest such example. Large-scale surveillance networks require significant coordination as they rely on active reporting from clinics. More recently, researchers have turned to automated methods run on electronic medical records that enable scalability and reduce the strain on manual reporters.8
While these are the most common data sources for public health, the field has a tradition of seeking new and creative sources of data suited to specific analyses. These include monitoring drug sales and pharmacy records [Heffernan et al., 2004, Magruder et al., 2004] to track gastrointestinal illness [Edge et al., 2004] and use of nicotine replacement therapies [Metzger et al., 2005]. Others have used insurance company billing records to track mammographies [Smith-Bindman et al., 2006] and cardiovascular disease [Lentine et al., 2009]. Some unusual data sources include counting cigarette butt waste in cities [Marah and Novotny, 2011] and estimating community drug abuse by wastewater analysis [Irvine et al., 2011, van Nuijs et al., 2011, Zuccato et al., 2008].
2.2.1 LIMITATIONS OF TRADITIONAL DATA
Monitoring practices that rely on traditional data sources have their advantages and limitations. In general, these methods are well understood and are viewed as reliable, provided they are properly analyzed with biases corrected. Furthermore, many of these data sources go back many years (e.g., annual survey questions), allowing for comparisons over time.
However, we wouldn’t be writing this book if there weren’t disadvantages to traditional methods and thus opportunities for social media data to make improvements. Telephone surveys are becoming less accurate over time, as fewer people use landline phones and response rates drop [Kempf and Remington, 2007]. This biases survey results against low-income and young adults in particular [Blumberg and Luke, 2007]. Surveys are also expensive to conduct, especially if the survey size is very large or requires in-person interviews [Iannacchione, 2011]. The NSDUH survey mentioned above takes nine months to complete each year.
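To make the idea of correcting for coverage bias concrete, here is a minimal sketch of post-stratification, one common reweighting approach. It is our illustration rather than a method taken from the studies cited above. The function name `poststratified_mean`, the two age groups, the population shares, and the responses are all hypothetical.

```python
def poststratified_mean(sample, population_share):
    """Estimate a population mean by reweighting group means.

    `sample` is a list of (group, value) pairs from survey respondents.
    `population_share` maps each group to its assumed share of the
    target population (shares should sum to 1).
    """
    sums, counts = {}, {}
    for group, value in sample:
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    missing = set(population_share) - set(counts)
    if missing:
        # A group with no respondents cannot be reweighted.
        raise ValueError(f"no respondents in groups: {sorted(missing)}")
    # Weight each group's observed mean by its population share.
    return sum(
        share * sums[group] / counts[group]
        for group, share in population_share.items()
    )


# Made-up survey: 1 = reports the behavior, 0 = does not.
# Young adults are only 2 of 10 respondents here.
sample = [("young", 1), ("young", 0)] + [("older", 1)] * 2 + [("older", 0)] * 6
raw = sum(value for _, value in sample) / len(sample)
weighted = poststratified_mean(sample, {"young": 0.5, "older": 0.5})
print(f"raw estimate: {raw:.3f}, post-stratified estimate: {weighted:.3f}")
```

In this toy example the reweighted estimate is higher than the raw one because the underrepresented group reports the behavior more often. With very few respondents in a group, the group mean itself becomes unreliable, so reweighting cannot fully make up for poor coverage.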
Clinical records address some of these issues, but are still expensive and complex to set up. Many of the topics covered in surveys do not appear in clinical records, or if they do, they are in unstructured text and thus hard to analyze. Both of these methods can be slow. We cannot measure today’s influenza rate when we do not get clinical records or sentinel site reports more frequently than once a week. Finally, these methods can only cover certain topics, as discussed in Chapter 1. Many areas of public health are understudied because they lack sufficient data to support research.
2.2.2 OPPORTUNITIES FOR SOCIAL MONITORING
These limitations create opportunities for researchers and practitioners to use social media as a data source for learning about health and medicine [Grajales III et al., 2014]. Compared to traditional public health monitoring, social media-based monitoring is fast, cheap, covers a large population, and provides data on topics with little coverage from traditional sources.
One of the most popular social media platforms for health research has been Twitter [Williams et al., 2013], which provides real-time streams of public data, often for free. This type of data creates the potential for real-time health surveillance, which is generally unattainable with traditional methods.
Certainly, social media is not a panacea, and it will not replace traditional data sources. We’ll discuss some of its limitations in detail in Chapter 6. However, social media can play a complementary role to traditional monitoring. For example, social media analysis can be used for hypothesis generation [Parker et al., 2015]: rapidly testing out ideas that are not yet worth the time and effort of traditional data collection. The most promising ideas can be forwarded to a more in-depth phase of traditional investigation. Social media can also complement survey data with respect to its demographic coverage. Young adults are overrepresented on Twitter [Duggan et al., 2015] yet underrepresented in telephone surveys, an especially important characteristic for topics like electronic cigarettes and illicit drug use.
A growing chorus of researchers argue that social media will play an important role in public health and epidemiology [Brownstein et al., 2009, Dredze, 2012, Salathé et al., 2012, 2013b]. The U.S. government has taken notice and has started to consider how social media data can aid public health efforts. This has included hosting competitions for building social media-based systems for disease surveillance [Biggerstaff