Theodore Frick
Tyler Dodge
Xiaojing Liu
Bude Su
Indiana University
May, 2004
See a Web Tool for Bayesian reasoning
How can one efficiently determine whether a Website is working well? Relatively few members of the target audience are needed to improve a product during formative evaluation and usability testing, as part of product development and revision cycles. During summative evaluation, however, how many subjects are needed to determine product effectiveness?
In studies of the number of subjects needed for usability tests, a Poisson probability model has been found to fit extant data reasonably well. However, that model was chosen based on the number of subjects needed to identify important usability problems with a product, not the number needed to determine its effectiveness. To determine whether a Website is working well, we investigated the predictive validity of a discrete Bayesian decision model: the Sequential Probability Ratio Test (SPRT), originally developed by Wald (1947).
Fifty-one people representing a campus community participated in a usability test of the university library's online catalog search tool. The results were analyzed post hoc with SPRT re-enactments that simulated sequential decision making after each subject was tested. Across a range of parameters, the Bayesian SPRT reached the same conclusion as the entire sample, but with many fewer subjects, while using the small α and β error rates that are typical for such tests. The study provides evidence that the SPRT decision model is useful in situations where the goal is to determine effectiveness (whether or not a product works well). The SPRT maximizes efficiency by testing only as many users as are necessary to reach a confident conclusion.
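To make the sequential decision rule concrete, the following is a minimal sketch of Wald's SPRT applied to a stream of pass/fail results, one per tested subject. It is an illustration only, not the procedure or the parameters used in the study: the function name `sprt` and the hypothesized success rates `p_works` and `p_fails` are assumptions chosen for the example, and the sketch uses Wald's likelihood-ratio form rather than any particular Bayesian formulation.

```python
import math

def sprt(outcomes, p_works=0.85, p_fails=0.60, alpha=0.05, beta=0.05):
    """Sequentially test pass/fail outcomes with Wald's SPRT.

    p_works and p_fails are hypothetical success rates for a product
    that works well vs. one that does not (assumed values, not taken
    from the paper). alpha is the tolerated chance of concluding
    "works well" when it does not; beta is the reverse error.
    Returns (decision, number_of_subjects_used).
    """
    upper = math.log((1 - beta) / alpha)  # cross this: conclude "works well"
    lower = math.log(beta / (1 - alpha))  # cross this: conclude "not working well"
    llr = 0.0  # running log-likelihood ratio
    n = 0
    for n, success in enumerate(outcomes, start=1):
        if success:
            llr += math.log(p_works / p_fails)
        else:
            llr += math.log((1 - p_works) / (1 - p_fails))
        if llr >= upper:
            return ("works well", n)
        if llr <= lower:
            return ("not working well", n)
    # Ran out of subjects before either threshold was crossed.
    return ("undecided", n)

# Example: a run of successful subjects stops well before all are tested.
decision, subjects_used = sprt([True] * 20)
print(decision, subjects_used)
```

Under these assumed settings, testing stops as soon as the accumulated evidence crosses either threshold, which is how a sequential test can reach a conclusion with fewer subjects than a fixed-size sample.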
The full paper is available as a PDF document.