Thresholding nonprobability units in combined data for efficient domain estimation
Savitsky, T. D., Williams, M. R., Beresovsky, V., & Gershunskaya, J. (2025). Thresholding nonprobability units in combined data for efficient domain estimation. Statistics in Transition New Series, 26(2). https://doi.org/10.59139/stattrans-2025-013
Quasi-randomization approaches estimate latent participation probabilities for units from a nonprobability / convenience sample. Estimation of participation probabilities for convenience units allows their combination with units from the randomized survey sample to form a survey-weighted domain estimate. One leverages convenience units for domain estimation under the expectation that estimation precision and bias will improve relative to solely using the survey sample; however, convenience sample units that are very different in their covariate support from the survey sample units may inflate estimation bias or variance. This paper develops a method to threshold or exclude convenience units to minimize the variance of the resulting survey-weighted domain estimator. We compare our thresholding method with other thresholding constructions in a simulation study for two classes of datasets based on the degree of overlap between survey and convenience samples on covariate support. We reveal that excluding convenience units that each express a low probability of appearing in both reference and convenience samples reduces estimation error.
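The thresholding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes participation probabilities have already been estimated for every unit, uses a Hájek (weight-normalized) mean as the domain estimate, and swaps in a simple with-replacement linearization proxy for the variance. The function names (`hajek_mean`, `variance_proxy`, `choose_threshold`) and the candidate grid of thresholds are hypothetical, introduced only for illustration.

```python
import numpy as np

def hajek_mean(y, w):
    """Weight-normalized (Hajek) mean of y under weights w."""
    return float(np.sum(w * y) / np.sum(w))

def variance_proxy(y, w):
    """Rough variance proxy for the Hajek mean.

    Assumption: a simple with-replacement linearization,
    sum_i (w_i / sum(w))^2 * (y_i - mean)^2, not the paper's variance.
    """
    wn = w / np.sum(w)
    mean = np.sum(wn * y)
    return float(np.sum(wn ** 2 * (y - mean) ** 2))

def choose_threshold(y, prob, is_convenience, grid):
    """Pick the threshold on the grid that minimizes the variance proxy.

    Survey units are always kept; a convenience unit is kept only if its
    (assumed, pre-estimated) participation probability is >= tau.
    Returns (tau, proxy_value, keep_mask).
    """
    best = None
    for tau in grid:
        keep = ~is_convenience | (prob >= tau)
        w = 1.0 / prob[keep]  # inverse-probability weights
        v = variance_proxy(y[keep], w)
        if best is None or v < best[1]:
            best = (float(tau), v, keep)
    return best

# Toy data: three survey units, then two convenience units, one of which
# has a very low participation probability and hence a very large weight.
y = np.array([1.0, 2.0, 3.0, 2.0, 10.0])
prob = np.array([0.5, 0.5, 0.5, 0.5, 0.01])
is_convenience = np.array([False, False, False, True, True])

tau, v, keep = choose_threshold(y, prob, is_convenience, np.array([0.0, 0.1]))
estimate = hajek_mean(y[keep], 1.0 / prob[keep])
```

In this toy example the low-probability convenience unit dominates the weighted estimate when no threshold is applied, so the grid search prefers the threshold that excludes it. A real application would replace the proxy with a design-appropriate variance estimator and would need to account for any bias introduced by exclusion.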