Public Engagement with Lung Cancer Screening Information: Topic Modeling of Lung Cancer-Related Reddit Posts
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Sources and Subreddit Selection
2.2. Data Collection
2.3. Data Preprocessing
2.4. Topic Modeling and Thematic Categorization
2.5. Keyword-Based Classification
3. Results
3.1. Dataset Description
3.2. Keyword-Based Thematic Distribution
3.3. Keyword-Based Temporal Trends
3.4. Topic Modeling Results Using LDA
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Category | Example Posts |
---|---|
Treatment | 1. It took me about 6/7 months after treatment ended to work up to walking a mile and 10 months to build back to walking 2 miles. I was 58 at that time……chemo regimen was much more difficult then. 2. Not sure if she will qualify since they usually do the chemo and immunotherapy first then surgery. 3. I just got my latest scan results back on Monday……Immunotherapy is working……I’m so happy……I’m continuing with treatment and see what the next round of scans says in July! |
Mental health | 1. This is a brutal disease and it breaks my heart that so many other people have to suffer through it. I’m so grateful to have this supportive group, because no one should have to go through this on their own. 2. Friends and family try their hardest but they don’t understand nor do I expect them to……I don’t want people to see me as just a sick girl. I don’t want all the memories I have with friends to be forever overwhelmed by them seeing me on my death bed. 3. One of my close friends lost his dad to cancer, the same kind that my dad is currently suffering from. My friend told me that what helped him was to not rush the process. |
Smoking | 1. In my experience the feeling does go away. It will take some time, but cravings will stop. Sometimes when I’m out and I smell a freshly lit cigarette I get a momentary craving, but that’s easily squashed with resolve and will power. 2. I still think about smoking most days, usually in the evening when I used to hang out in the garage and have a cig and a beer. I miss that quite a bit, and don’t want to be chained to that longing forever. 3. I should stop now before the addiction gets worse……dealing with withdrawals when I quit, I would close my eyes for a moment, do a deep breathing exercise, and tell myself "It’s going to pass whether you have a cigarette or not. Just ride it out." |
Screening | 1. I met with a pulmonologist a few weeks after I was initial diagnosed. He didn’t do much but say that we needed a biopsy. I got the biopsy, which confirmed it was cancer, and oncology took over. 2. Feeling very terrified right now. My mother had a CT scan which showed she has multiple lung nodules, and she has never been a smoker. She has asthma and uses an inhaler, and has had several spouts of coughing which has now gone away, and she has never coughed up blood before……I don’t know what the lung nodules mean or whether they mean cancer, or how common finding a non-benign nodule is. |
Category | Relevant Words | Number of Posts | Percentage |
---|---|---|---|
Treatment | effects, radiation, terminal, insurance, recovery, remove, nausea, therapy, palliative | 17,700 | 16.84% |
Mental health | advice, helpful, care, support, family, peace, panic, hugs, anxiety | 75,497 | 71.82% |
Smoking | quit, smoke, vaping, uncomfortable, progress, patches, withdrawal | 8,724 | 8.30% |
Screening | symptoms, early, doctor, risk, rare, scan, biopsy, test | 3,197 | 3.04% |
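
As a rough illustration of how the keyword lists and percentages in this table fit together, the following Python sketch assigns a post to a category and recomputes the percentage column from the post counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the table does not give the exact matching rule, so the code assumes case-insensitive whole-word matching, assigns each post to the single category with the most keyword hits, and breaks ties by listing order. The names `KEYWORDS`, `POST_COUNTS`, `classify_post`, and `percentages` are hypothetical.

```python
import re
from collections import Counter

# Keyword lists copied from the "Relevant Words" column of the table.
KEYWORDS = {
    "Treatment": ["effects", "radiation", "terminal", "insurance", "recovery",
                  "remove", "nausea", "therapy", "palliative"],
    "Mental health": ["advice", "helpful", "care", "support", "family",
                      "peace", "panic", "hugs", "anxiety"],
    "Smoking": ["quit", "smoke", "vaping", "uncomfortable", "progress",
                "patches", "withdrawal"],
    "Screening": ["symptoms", "early", "doctor", "risk", "rare", "scan",
                  "biopsy", "test"],
}

# Post counts copied from the table.
POST_COUNTS = {
    "Treatment": 17700,
    "Mental health": 75497,
    "Smoking": 8724,
    "Screening": 3197,
}


def classify_post(text):
    """Return the category with the most keyword hits, or None if nothing matches.

    Assumptions (not stated in the table): whole-word, case-insensitive
    matching; ties go to the category listed first in KEYWORDS.
    """
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    hits = {cat: sum(tokens[w] for w in words) for cat, words in KEYWORDS.items()}
    best = max(hits, key=hits.get)  # first maximum wins on ties
    return best if hits[best] > 0 else None


def percentages(counts):
    """Share of each category among all classified posts, rounded to 2 decimals."""
    total = sum(counts.values())
    return {cat: round(100 * n / total, 2) for cat, n in counts.items()}


if __name__ == "__main__":
    print(classify_post("My mother had a CT scan and a biopsy"))
    for category, share in percentages(POST_COUNTS).items():
        print(f"{category}: {share:.2f}%")
```

The four counts sum to 105,118 classified posts, and dividing each count by that total reproduces the percentage column. Plain whole-word matching does not handle inflected forms such as "smoking" or "quitting"; a stemming or lemmatization step would be one way to address that, though whether the article used one is not stated here.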
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jaiswal, A.; Amin, S.; Amin, S.M.S.; Lee, D.N.; Park, S.L.; Pokhrel, P. Public Engagement with Lung Cancer Screening Information: Topic Modeling of Lung Cancer-Related Reddit Posts. Curr. Oncol. 2025, 32, 529. https://doi.org/10.3390/curroncol32100529