Pride, Love, and Twitter Rants: Combining Machine Learning and Qualitative Techniques to Understand What Our Tweets Reveal about Race in the US

1 Department of Epidemiology & Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
2 Department of Health Science, Furman University, Greenville, SC 29613, USA
3 Divisions of Community Health Sciences and Epidemiology, University of California, Berkeley, CA 94704, USA
4 Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
5 Program of Public Health Science, University of Maryland School of Public Health, College Park, MD 20742, USA
6 Department of Health Sciences, College of Science and Health, DePaul University, Chicago, IL 60614, USA
7 Department of Epidemiology & Biostatistics, University of Maryland School of Public Health, College Park, MD 20742, USA
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2019, 16(10), 1766; https://doi.org/10.3390/ijerph16101766
Received: 13 February 2019 / Revised: 7 May 2019 / Accepted: 15 May 2019 / Published: 18 May 2019
Objective: Describe variation in sentiment of tweets using race-related terms and identify themes characterizing the social climate related to race.

Methods: We applied a Stochastic Gradient Descent Classifier to conduct sentiment analysis of 1,249,653 US tweets using race-related terms from 2015–2016. To evaluate accuracy, manual labels were compared against computer labels for a random subset of 6600 tweets. We conducted qualitative content analysis on a random sample of 2100 tweets.

Results: Agreement between computer labels and manual labels was 74%. Tweets referencing Middle Eastern groups (12.5%) or Blacks (13.8%) had the lowest positive sentiment compared to tweets referencing Asians (17.7%) and Hispanics (17.5%). Qualitative content analysis revealed most tweets were represented by the categories: negative sentiment (45%), positive sentiment such as pride in culture (25%), and navigating relationships (15%). While all tweets use one or more race-related terms, negative sentiment tweets which were not derogatory or whose central topic was not about race were common.

Conclusion: This study harnesses relatively untapped social media data to develop a novel area-level measure of social context (sentiment scores) and highlights some of the challenges in doing this work. New approaches to measuring the social environment may enhance research on social context and health.
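The agreement figure reported in the Results compares manual labels with computer labels on the same tweets. As a minimal sketch of how such a percent agreement could be computed (not the authors' code), the example below counts matching label pairs. The function name `percent_agreement`, the three label values, and the eight toy labels are all hypothetical and chosen only for illustration; they are not data from the study.

```python
def percent_agreement(manual_labels, computer_labels):
    """Return the percentage of items on which two label lists agree."""
    if len(manual_labels) != len(computer_labels):
        raise ValueError("label lists must have the same length")
    if not manual_labels:
        raise ValueError("label lists must not be empty")
    # Count positions where the manual and computer labels are identical.
    matches = sum(m == c for m, c in zip(manual_labels, computer_labels))
    return 100.0 * matches / len(manual_labels)


# Hypothetical toy labels (assumption: three sentiment classes).
manual = ["negative", "positive", "neutral", "negative",
          "positive", "neutral", "negative", "positive"]
computer = ["negative", "positive", "negative", "negative",
            "neutral", "neutral", "negative", "positive"]

agreement = percent_agreement(manual, computer)
print(f"Agreement: {agreement:.1f}%")  # 6 of 8 labels match -> 75.0%
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected statistic could be substituted if that distinction matters for the analysis.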
Keywords: social media; minority groups; discrimination; big data; content analysis
MDPI and ACS Style

Nguyen, T.T.; Criss, S.; Allen, A.M.; Glymour, M.M.; Phan, L.; Trevino, R.; Dasari, S.; Nguyen, Q.C. Pride, Love, and Twitter Rants: Combining Machine Learning and Qualitative Techniques to Understand What Our Tweets Reveal about Race in the US. Int. J. Environ. Res. Public Health 2019, 16, 1766. https://doi.org/10.3390/ijerph16101766