

What Can AI Do for Education Survey Efficiency?


Conducting surveys can be expensive, and—with declining response rates—we’ve been on a mission to leverage solutions like administrative data and artificial intelligence (AI) to increase efficiencies, lower costs, and reduce data collection burden, without compromising quality.

In addition to using AI to reduce coding time, RTI International researchers are exploring new avenues for using this evolving technology to provide the best value to our federal, state, local, and nonprofit clients. 

Continue reading for examples of how we’re modernizing our approach to survey data analysis and collection with SMART.

Using the SMART AI Application to Reduce Manual Labor on Education Surveys

Our team is currently using AI to reduce burden and processing time on the National Science Foundation’s Survey of Earned Doctorates (SED), an annual survey of doctorate graduates at accredited U.S. institutions that tracks trends in doctoral education. Every year, the SED provides valuable insights into respondents’ fields of study, student demographics, post-graduation employment plans, and more.

Once the survey has been conducted, we begin the task of reviewing respondents’ verbatim responses and coding them into relevant response categories. Manually reviewing individual entries and applying relevant codes from a taxonomy can be tedious, time-consuming, and error-prone, so coding standards require two staff members to code the strings independently and a third to review cases where the initial coders did not agree. 

To address these challenges, the SED team began using SMART, an AI-assisted application developed by RTI in 2018. SMART uses modern natural language processing techniques to recommend relevant codes interactively as coders review and assign each text response. The application reads the text of each survey response and suggests the most likely categories based on its semantic distance from other text as well as the way similar strings have been coded in the past.
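The idea of suggesting codes from semantic similarity to previously coded strings can be illustrated with a minimal sketch. This is not SMART's actual model; the example responses, categories, and bag-of-words similarity measure below are all simplified stand-ins for the production NLP pipeline.

```python
from collections import Counter
from math import sqrt

# Hypothetical history of previously coded responses: (verbatim text, code).
CODED_HISTORY = [
    ("machine learning for image recognition", "Computer Science"),
    ("deep neural networks and vision", "Computer Science"),
    ("gene expression in fruit flies", "Biology"),
    ("protein folding and cell biology", "Biology"),
]

def vectorize(text):
    """Bag-of-words term counts, a crude stand-in for a semantic embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def suggest_codes(response, history=CODED_HISTORY, top_n=2):
    """Rank candidate categories by similarity to previously coded strings."""
    v = vectorize(response)
    scored = sorted(
        ((cosine(v, vectorize(text)), code) for text, code in history),
        reverse=True,
    )
    # Keep the first (highest-scoring) occurrence of each category.
    seen, ranked = set(), []
    for _, code in scored:
        if code not in seen:
            seen.add(code)
            ranked.append(code)
    return ranked[:top_n]
```

A coder would then see the ranked suggestions, e.g. `suggest_codes("neural networks for vision tasks")` puts "Computer Science" first, and make the final call.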

By automating coding and thematic analysis, AI enables researchers to gain actionable insights from survey data more quickly and efficiently. However, education and survey data collection will always require people; once SMART has identified the most likely categories, an RTI staff member still reviews the string and determines the best category, even if it is not one of those suggested by the AI, to ensure accuracy.

Using AI to assist with text coding has sped up data processing on the SED significantly. Between 2022 and 2024, SMART reduced the manual labor spent on the SED coding process by approximately 55%. That equates to 303 hours of time savings, freeing up our researchers to focus on analysis and interpretation. 

Ensuring Accuracy and Consistency in Survey Data Analysis with Intercoder Reliability

For SMART to be usable on our education surveys, our team needed to ensure that the tool could provide intercoder reliability scores, a measure of how closely the conclusions of two coders agree with one another, and enable an appropriate adjudication process in which a third expert coder reviews and codes the cases where the initial coders disagreed. Many of RTI's clients require intercoder reliability to ensure the accuracy and consistency of the survey coding process.

Since the application was first launched, SMART has been improved and updated so researchers can obtain intercoder statistics, compare results from the two coders, and then have a third coder conduct a review of the strings that were coded differently. This rigorous process helps with quality assurance and identifies areas where additional training may be warranted for the AI tool, the coders, or both.

Identifying Future Possibilities for AI in Education Surveys and Data Collection 

Although RTI survey scientists are already leveraging AI on large education surveys like the SED to meet clients’ research needs, this technology is rapidly evolving. 

Our experts continue to identify and evaluate future ways AI can increase the efficiency and effectiveness of our education surveys and data collection. Here are a few areas we are exploring:

  • Developing a Taxonomy of Disciplines from SED and ProQuest Data: Each survey cycle, the SED collects data on respondents’ dissertation field or fields of study. Currently, SED respondents are asked to code their field of study using a set of options, otherwise known as a taxonomy, based on the U.S. Department of Education’s Classification of Instructional Programs (CIP). However, as its title reflects, this taxonomy was built to categorize fields of instruction rather than research topics. Our researchers are using AI to analyze dissertation titles and abstracts to develop a novel taxonomy of research topics that will make it easier for SED respondents to answer this set of questions and improve the depth and utility of these SED data.
  • Exploring Emerging AI Tools to Automate the Review of Respondent Comments: Each year, our survey researchers spend about 60 hours manually reviewing open-ended text. Using AI, our team hopes to automate this process and reduce the time spent reviewing respondent comments to as little as 8 hours, an estimated reduction of more than 85% in manual time. Most surveys have open-ended text, so this solution could be applied across RTI’s survey portfolio to increase efficiency.
  • Using Survey Paradata to Optimize Survey Operations: Our team is exploring the possibility of using survey paradata, data about the process of collecting survey data, to assist with survey operations. AI can analyze this information (e.g., how long it took a respondent to complete the survey, how many outreach attempts were made) and provide recommendations to researchers about the best time to contact respondents. This, in turn, can help improve survey quality and response rates. 
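As a concrete illustration of the paradata idea, the sketch below aggregates hypothetical outreach records to recommend the contact hour with the best completion rate. The record fields and data are invented for illustration; a real paradata system would hold far richer process data and likely use a predictive model rather than a simple rate comparison.

```python
from collections import defaultdict

# Hypothetical paradata: hour of an outreach attempt and whether the
# respondent went on to complete the survey. Illustrative data only.
paradata = [
    {"contact_hour": 9,  "completed": False},
    {"contact_hour": 18, "completed": True},
    {"contact_hour": 18, "completed": True},
    {"contact_hour": 9,  "completed": False},
    {"contact_hour": 18, "completed": False},
]

def best_contact_hour(records):
    """Return the outreach hour with the highest completion rate."""
    attempts, completions = defaultdict(int), defaultdict(int)
    for r in records:
        attempts[r["contact_hour"]] += 1
        completions[r["contact_hour"]] += r["completed"]
    return max(attempts, key=lambda h: completions[h] / attempts[h])
```

With the sample records above, evening outreach (hour 18, two completions in three attempts) would be recommended over morning outreach (hour 9, none in two).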

Soon, AI may be used to solve challenges that appeared insurmountable even a few years ago. Learn more about how we are applying AI to address this evolving research landscape and enhance our surveys in new and innovative ways. 

Learn how SMART can increase efficiency on your education surveys and data collection.

Disclaimer: This piece was written by Peter Einaudi (Director, Education Research) to share perspectives on a topic of interest. Opinions expressed within are those of the author or authors.