Tag: modelling

  • The beta-binomial distribution

    The beta-binomial model is a Bayesian model used to analyze rates. For a great derivation and explanation of this model, I highly recommend watching the second lecture from Richard McElreath’s course Statistical Rethinking. In this model, the data, \(X\), is assumed to be binomially distributed with a fixed number of trail \(N\) but an unknown rate \(\rho \in [0,1]\). The rate \(\rho\) is given a \(\text{Beta}(a,b)\) prior. That is the prior distribution of \(\rho\) has a density

    \(p(\rho) = \frac{1}{B(a,b)} \rho^{a-1}(1-\rho)^{b-1},\)

    where \(B(a,b) =\int_0^1 \rho^{a-1}(1-\rho)^{b-1}d\rho\) is a normalizing constant. The model can thus be written as

    \(\rho \sim \text{Beta}(a,b),\)
    \(X | \rho \sim \text{Binom}(N,\rho).\)

    This is a conjugate model, meaning that the posterior distribution of \(\rho\) is again a beta distribution. This can be seen by using Bayes rule

    \(p(\rho | X) \propto p(X| \rho)p(\rho) \propto \rho^X(1-\rho)^{N-X}\rho^{a-1}(1-\rho)^{b-1}=\rho^{X+a-1}(1-\rho)^{(N-X)+b-1}.\)

    The last expression is proportional to a beta density., specifically \(\rho | X \sim \text{Beta}(X+a, N-X+b)\).

    The marginal distribution of \(X\)

    In the above model we are given the distribution of \(\rho\) and the conditional distribution of \(X|\rho\). To calculate the distribution of \(X\), we thus need to marginalize over \(\rho\). Specifically,

    \(\displaystyle{p(X) = \int_0^1 p(X,\rho)d\rho = \int_0^1 p(X| \rho)p(\rho)d\rho.}\)

    The term inside the above integral is

    \(\displaystyle{p(X| \rho)p(\rho) = \binom{N}{X}\rho^X(1-\rho)^{N-X}\frac{1}{B(a,b)}\rho^{a-1}(1-\rho)^{b-1} = \frac{\binom{N}{X}}{B(a,b)}\rho^{X+a-1}(1-\rho)^{N-X+b-1} }.\)

    Thus,

    \(\displaystyle{p(X) = \frac{\binom{N}{X}}{B(a,b)} \int_0^1 \rho^{X+a-1}(1-\rho)^{N-X+b-1}d\rho = \binom{N}{X}\frac{B(X+a, N-X+a)}{B(a,b)}}.\)

    This distribution is called the beta-binomial distribution. Below is an image from Wikipedia showing a graph of \(p(X)\) for \(N=10\) and a number of different values of \(a\) and \(b\). You can see that, especially for small value of \(a\) and \(b\) the distribution is a lot more spread out than the binomial distribution. This is because there is randomness coming from both \(\rho\) and the binomial conditional distribution.

    A plot of the beta-binomial distribution for different values of the parameters a and b. For small values of a and b, the distribution is very spread out.