Determining an appropriate sample size for qualitative interviews to achieve true and near code saturation
Secondary analysis of data
Squire, C. M., Giombi, K. C., Rupert, D. J., Amoozegar, J., & Williams, P. (2024). Determining an appropriate sample size for qualitative interviews to achieve true and near code saturation: Secondary analysis of data. Journal of Medical Internet Research, 26(1), Article e52998. https://doi.org/10.2196/52998
Background: In-depth interviews are a common method of qualitative data collection, providing rich data on individuals' perceptions and behaviors that would be challenging to collect with quantitative methods. Researchers typically need to decide on sample size a priori. Although studies have assessed when saturation has been achieved, there is no agreement on the minimum number of interviews needed to achieve saturation. To date, most research on saturation has been based on in-person data collection. During the COVID-19 pandemic, web-based data collection became increasingly common, as traditional in-person data collection was not possible. Researchers have continued to use web-based data collection methods after the COVID-19 emergency, making it important to assess whether findings around saturation differ for in-person versus web-based interviews.

Objective: We aimed to identify the number of web-based interviews needed to achieve true code saturation or near code saturation.

Methods: The analyses for this study were based on data from 5 Food and Drug Administration-funded studies conducted through web-based platforms with patients with underlying medical conditions or with health care providers who provide primary or specialty care to patients. We extracted code- and interview-specific data and examined the data summaries to determine when true saturation or near saturation was reached.

Results: The sample size used in the 5 studies ranged from 30 to 70 interviews. True saturation was reached after 91% to 100% (n=30-67) of planned interviews, whereas near saturation was reached after 33% to 60% (n=15-23) of planned interviews. Studies that relied heavily on deductive coding and studies that had a more structured interview guide reached both true saturation and near saturation sooner. We also examined the types of codes applied after near saturation had been reached. In 4 of the 5 studies, most of these codes represented previously established core concepts or themes. Codes representing newly identified concepts, other or miscellaneous responses (eg, "in general"), uncertainty or confusion (eg, "don't know"), or categorization for analysis (eg, correct as compared with incorrect) were less commonly applied after near saturation had been reached.

Conclusions: This study supports the view that near saturation may be a sufficient target and that conducting additional interviews after that point may yield diminishing returns. Factors to consider in determining how many interviews to conduct include the structure and type of questions included in the interview guide, the coding structure, and the population under study. Studies with less structured interview guides, studies that rely heavily on inductive coding and analytic techniques, and studies that include populations that may be less knowledgeable about the topics discussed may require a larger sample size to reach an acceptable level of saturation. Our findings also build on previous studies of saturation for in-person data collection conducted at a small number of sites.
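As an illustration of the kind of analysis described in the Methods, the sketch below finds the interview at which a chosen share of all codes has been applied at least once. It is not the authors' procedure: the abstract does not define the thresholds, so treating true saturation as 100% of codes and near saturation as 90% of codes is an assumption made here, and the function name `interviews_to_saturation` and the sample data are hypothetical.

```python
from typing import Sequence, Set


def interviews_to_saturation(
    codes_per_interview: Sequence[Set[str]], threshold: float = 1.0
) -> int:
    """Return the 1-based position of the interview at which the share of
    all codes applied so far first reaches `threshold`.

    `threshold` is an assumed definition: 1.0 stands in for "true" code
    saturation and 0.9 for "near" code saturation.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")

    # Every code applied anywhere in the study (known only in hindsight).
    all_codes: Set[str] = set().union(*codes_per_interview)
    if not all_codes:
        raise ValueError("no codes were applied in any interview")

    seen: Set[str] = set()
    for position, codes in enumerate(codes_per_interview, start=1):
        seen |= codes  # accumulate codes in the order interviews were held
        if len(seen) / len(all_codes) >= threshold:
            return position

    # Not reachable: by the last interview, seen == all_codes.
    raise AssertionError("saturation threshold was never reached")


# Hypothetical data: the set of codes applied in each of 6 interviews.
interviews = [
    {"A", "B", "C", "D"},
    {"A", "E", "F"},
    {"B", "G", "H"},
    {"C", "I"},
    {"A", "D"},
    {"E", "J"},
]

near = interviews_to_saturation(interviews, threshold=0.9)
true = interviews_to_saturation(interviews, threshold=1.0)
print(f"near saturation after interview {near} of {len(interviews)}")
print(f"true saturation after interview {true} of {len(interviews)}")
```

Because the full code list is known only after all interviews are coded, a calculation like this is retrospective, which matches the secondary-analysis design: it describes when saturation was reached in completed studies and cannot by itself tell a researcher when to stop an ongoing one.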