Identifying electronic nicotine delivery system brands and flavors on Instagram: Natural language processing analysis

Rob Chew; Michael Frederick Wenger; Jamie Elizabeth Guillory; James M. Nonnemaker; Annice Kim

Chew, R., Wenger, M. F., Guillory, J. E., Nonnemaker, J. M., & Kim, A. (2022). Identifying electronic nicotine delivery system brands and flavors on Instagram: Natural language processing analysis. Journal of Medical Internet Research, 24(1), Article e30257. https://doi.org/10.2196/30257

Abstract

Background:
Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups.

Objective:
The aim of our study is to develop a named entity recognition (NER) model to identify potential emerging vaping brands and flavors from Instagram post text. NER is a natural language processing task for identifying specific types of words (entities) in text based on the characteristics of the entity and surrounding words.
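
To make the NER task concrete, the sketch below shows one common way token-level predictions are turned into entities. It assumes a BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity) and the label names BRAND and FLAVOR. The abstract does not state which tagging scheme or label names the authors used, and the example post and the flavor "cool mint" are made up for illustration.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse token-level BIO tags into (entity_text, entity_type) pairs.

    Assumption: tags look like "B-BRAND", "I-BRAND", "B-FLAVOR", "I-FLAVOR", "O".
    An I- tag that does not continue the current entity type is treated as "O".
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; flush any entity in progress first.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the post.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical tokenized post with hypothetical model output.
tokens = ["New", "JUUL", "pods", "in", "cool", "mint", "today"]
tags = ["O", "B-BRAND", "O", "O", "B-FLAVOR", "I-FLAVOR", "O"]
example_entities = bio_to_entities(tokens, tags)
```

Here `example_entities` holds one brand span and one two-token flavor span, which is the kind of output the three compared models would each need to produce before scoring.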

Methods:
NER models were trained on a labeled data set of 2272 Instagram posts coded for ENDS brands and flavors. We compared three types of NER models—conditional random fields, a residual convolutional neural network, and a fine-tuned distilled bidirectional encoder representations from transformers (FTDB) network—to identify brands and flavors in Instagram posts with key model outcomes of precision, recall, and F1 scores. We used data from Nielsen scanner sales and Wikipedia to create benchmark dictionaries to determine whether brands from established ENDS brand and flavor lists were mentioned in the Instagram posts in our sample. To prevent overfitting, we performed 5-fold cross-validation and reported the mean and SD of the model validation metrics across the folds.
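
The benchmark and evaluation steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the case-insensitive whole-word matching rule, the set-based scoring of a single post, and the brand names other than JUUL are all assumptions. The abstract also does not say whether the reported SD is a sample or population SD; the sketch uses the sample SD.

```python
import re
from statistics import mean, stdev
from typing import Iterable, List, Set, Tuple


def dictionary_lookup(text: str, dictionary: Iterable[str]) -> Set[str]:
    """Return dictionary terms found in the text as whole words or phrases.

    Matching is case-insensitive (an assumption about the benchmark).
    """
    lowered = text.lower()
    found: Set[str] = set()
    for term in dictionary:
        # Require non-word characters (or string edges) around the term.
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(term.lower())
    return found


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for predicted vs. annotated entities."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def summarize_folds(scores: List[float]) -> Tuple[float, float]:
    """Return the mean and sample SD of a metric across cross-validation folds."""
    return mean(scores), stdev(scores)


# Hypothetical benchmark list and post; "brandx" and "brandy" are invented names.
brand_dictionary = ["juul", "brandy"]
post = "Loving my new JUUL and my BrandX mod"
gold_brands = {"juul", "brandx"}  # what a human annotator would have labeled

predicted_brands = dictionary_lookup(post, brand_dictionary)
precision, recall, f1 = precision_recall_f1(predicted_brands, gold_brands)

# Hypothetical per-fold F1 scores from 5-fold cross-validation.
fold_mean, fold_sd = summarize_folds([0.7, 0.8, 0.9, 0.8, 0.8])
```

In this toy case the look-up finds only the brand that is already on the list, so precision is perfect but recall is halved. That is the failure mode a fixed dictionary has for emerging brands, and it is why the study compares the look-ups against trained NER models.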

Results:
For brands, the residual convolutional neural network exhibited the highest mean precision (0.797, SD 0.084), and the FTDB exhibited the highest mean recall (0.869, SD 0.103). For flavors, the FTDB exhibited both the highest mean precision (0.860, SD 0.055) and recall (0.801, SD 0.091). All NER models outperformed the benchmark brand and flavor dictionary look-ups on mean precision, recall, and F1. Comparing between the benchmark brand lists, the larger Wikipedia list outperformed the Nielsen list in both precision and recall.

Conclusions:
Our findings suggest that NER models correctly identified ENDS brands and flavors in Instagram posts at rates competitive with, or better than, others in the published literature. Brands identified during manual annotation showed little overlap with those in Nielsen scanner data, suggesting that NER models may capture emerging brands with limited sales and distribution. NER models address the challenges of manual brand identification and can be used to support future infodemiology and infoveillance studies. Brands identified on social media should be cross-validated with Nielsen and other data sources to differentiate emerging brands that have become established from those with limited sales and distribution.
