Is the glass half empty or half full?: an experimental study of Bayesian versus frequentist statistics' influence on program endorsements by legislative staff
Date
2020
Abstract
The Bayesian and frequentist paradigms are the two best-known paradigms in statistical inference. Bayesian statisticians approach inference through increasing degrees of certainty, with statistical results presented as probability statements. The frequentist paradigm, by contrast, is best known for its use of p-values and confidence intervals to express results in terms of significance and precision. Despite the long-standing tradition of both paradigms, the frequentist paradigm has for many years dominated statistical training for social scientists. The reasons for this dominance are widely debated. In recent years, however, social scientists have increasingly used Bayesian methods to analyze data, including in analyses intended to inform high-stakes decision making. Some proponents of the Bayesian paradigm argue that probability statements from Bayesian analyses are easier to understand than confidence intervals and p-values. Because Bayesian methods are used in high-stakes fields such as public policy research, where the results of an impact evaluation inform decisions such as whether to fund social programs (e.g., providing services to impoverished mothers, job training to disabled individuals, etc.), it is important to understand the role Bayesian statistics might have in influencing statistical judgment.
The claim that probability statements are easier to understand than confidence intervals and p-values can be tested empirically. Early research on the role that Bayesian versus frequentist statistics play in respondents' decisions about whether to endorse a new educational technology is already underway (Chandler, Martinez, Finucane, Terziev, & Resch, 2019). This dissertation expands upon this important work through a statistical vignette experiment with United States congressional aides, testing whether legislative aides are more likely to endorse an education program when results from an effectiveness evaluation of that program are presented under a Bayesian versus a frequentist paradigm.
Thirty congressional aides were randomly assigned to one of four conditions. Each condition presented an equivalent level of evidence in either the frequentist or the Bayesian paradigm. Information about the cost of the program and the feasibility of implementation was incorporated into the vignettes, resulting in two cost and implementation scenarios (i.e., a low-cost, difficult-to-implement scenario and a high-cost, easy-to-implement scenario). Participants were asked to respond to six survey items. Two items assessed whether respondents found the information presented in the vignettes informative and easy to understand, whereas the remaining four items asked participants to rate their level of agreement that an endorsement of the program is justified. A second sample of thirty-six undergraduates majoring in political science participated in an identical vignette experiment.
Results from the experiment were analyzed using both frequentist and Bayesian factorial repeated measures ANOVA models. The models showed statistically significant findings on all six dependent variables for the congressional sample, and three models showed statistically significant results for the undergraduate sample. Bayes factors were used to interpret the results of the Bayesian ANOVA, which were consistent with the frequentist results. All effects indicated more favorable reactions to the evidence when it was presented under the Bayesian paradigm. Implications of this work are discussed in terms of their application to statistical methods for social science research, along with considerations for when to present results in a Bayesian versus a frequentist framework. Finally, limitations of this work are addressed, including the need for future experiments to test other salient characteristics of statistical information in their vignettes and the need for replication of the experimental results. Future directions are suggested regarding the need for researchers to establish community consensus on best practices for presenting Bayesian statistical information to congressional staffers.
Keywords: Bayesian statistics, frequentist statistics, human judgement
Keywords
Bayesian statistics, Frequentist statistics, Human judgement