A principled Bayesian contingency table test.
Fisher's exact test is often considered the gold standard for testing the statistical significance of contingency tables when sample sizes are small. However, the test conditions on the marginals being fixed (as in the classic "Lady Tasting Tea" experiment, where the number of tea preparations in each category is fixed in advance), and as such it is not appropriate for situations where one is comparing two binomial distributions. In addition, simulations have demonstrated that the test often performs poorly in practice [1].
Jasjeet Sekhon proposed a Bayesian alternative to Fisher's exact test that avoids these issues [1]. This software is an implementation of his beta difference distribution solution. It is primarily a Python module, but I have thrown in an R implementation as well.
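For intuition, the quantity behind a beta difference approach is the posterior probability that one binomial success rate exceeds the other by some margin. The following is a minimal sketch of that idea, not the module's implementation: it assumes independent uniform Beta(1, 1) priors, so that the posteriors are Beta(a + 1, b + 1) and Beta(c + 1, d + 1), and it approximates the probability on a simple midpoint grid. The names `beta_log_pdf` and `prob_difference_exceeds` are hypothetical and chosen for illustration only.

```python
import math


def beta_log_pdf(x, alpha, beta):
    """Log density of Beta(alpha, beta) at x, for 0 < x < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x)


def prob_difference_exceeds(a, b, c, d, prob_diff=0.0, grid=2000):
    """Approximate P(P1 - P2 > prob_diff) by midpoint-rule integration.

    Assumption: uniform Beta(1, 1) priors, giving posteriors
    P1 ~ Beta(a + 1, b + 1) and P2 ~ Beta(c + 1, d + 1).
    """
    step = 1.0 / grid
    xs = [(i + 0.5) * step for i in range(grid)]  # cell midpoints in (0, 1)
    pdf1 = [math.exp(beta_log_pdf(x, a + 1, b + 1)) for x in xs]
    pdf2 = [math.exp(beta_log_pdf(x, c + 1, d + 1)) for x in xs]

    # Cumulative mass of P2 at the cell edges 0, step, 2 * step, ..., 1.
    cdf2_edges = [0.0]
    for density in pdf2:
        cdf2_edges.append(cdf2_edges[-1] + density * step)
    total1 = sum(pdf1) * step      # both totals are close to 1;
    total2 = cdf2_edges[-1]        # dividing by them renormalises the grid

    def cdf2(t):
        """P(P2 <= t), linearly interpolated between cell edges."""
        if t <= 0.0:
            return 0.0
        if t >= 1.0:
            return 1.0
        pos = t / step
        k = min(int(pos), grid - 1)
        frac = pos - k
        return (cdf2_edges[k] + frac * (cdf2_edges[k + 1] - cdf2_edges[k])) / total2

    # P(P1 - P2 > delta) = integral of f1(x) * F2(x - delta) dx
    mass = sum(p * step * cdf2(x - prob_diff) for p, x in zip(pdf1, xs))
    return mass / total1


# Example: 8 successes / 2 failures versus 3 successes / 7 failures.
print(prob_difference_exceeds(8, 2, 3, 7))
print(prob_difference_exceeds(8, 2, 3, 7, prob_diff=0.05))
```

A fixed grid like this is only reasonable for small counts, where the posteriors are not sharply peaked; a different prior would change the `+ 1` terms.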
If you have Python installed, you should be able to download this source and run `python setup.py install` to install the module within your Python environment.
Building this module and installing it into your Python distribution should give you access to the `sekhon.py` CLI. For usage, type `sekhon.py -h`.
Sample usage:
import sekhon
# Simple test of table
# a b
# c d
# where a, b are the successes and failures in category X1, etc.
sekhon.test(a, b, c, d)
# Specifically test the probability that `P1 - P2 > prob_diff`
sekhon.test(a, b, c, d, prob_diff=0.05)
Additionally, if you are feeling playful, you can toy around with an early sampling solution using the `simulation` and `simulation_convergence_test` functions.
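As a rough illustration of what a sampling approach with a convergence check can look like, here is a minimal sketch. It does not reproduce the module's sampling functions, whose signatures are not documented here. It assumes uniform Beta(1, 1) priors, and the names `sample_difference` and `sample_until_stable`, the doubling schedule, and the tolerance are all hypothetical choices made for this example.

```python
import random


def sample_difference(a, b, c, d, prob_diff=0.0, draws=10_000, rng=None):
    """Monte Carlo estimate of P(P1 - P2 > prob_diff).

    Assumption: uniform Beta(1, 1) priors, so each rate is drawn from
    Beta(successes + 1, failures + 1).
    """
    rng = rng or random.Random()
    hits = 0
    for _ in range(draws):
        p1 = rng.betavariate(a + 1, b + 1)
        p2 = rng.betavariate(c + 1, d + 1)
        if p1 - p2 > prob_diff:
            hits += 1
    return hits / draws


def sample_until_stable(a, b, c, d, prob_diff=0.0, tol=0.005,
                        start=2_000, max_draws=1_000_000, seed=0):
    """Double the number of draws until two successive estimates agree.

    Returns (estimate, draws_used_in_last_run). Stops early once the
    change between runs is below `tol`, or when `max_draws` is reached.
    """
    rng = random.Random(seed)
    draws = start
    previous = sample_difference(a, b, c, d, prob_diff, draws, rng)
    while draws * 2 <= max_draws:
        draws *= 2
        current = sample_difference(a, b, c, d, prob_diff, draws, rng)
        if abs(current - previous) < tol:
            return current, draws
        previous = current
    return previous, draws


estimate, draws_used = sample_until_stable(8, 2, 3, 7, prob_diff=0.05)
print(estimate, draws_used)
```

Agreement between two successive runs is only a heuristic stopping rule; it does not bound the Monte Carlo error.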
For R, copy and paste from `sekhon.R` for now. If there is enough demand, I can throw together a little R package though.
[1] Jasjeet S. Sekhon (2005). "Making Inferences from 2×2 Tables: The Inadequacy of the Fisher Exact Test for Observational Data and a Principled Bayesian Alternative."