Why do we need a prior when computing a Bayes Factor (R code provided)?

by rnorouzian   Last Updated June 21, 2017 01:19 AM

I'm new to Bayesian statistics and have a fundamental question about "Bayes Factors". Specifically, I understand that if we want to compute the posterior probability of a hypothesis (e.g., $p(H_1 |Data)$), we need a prior probability for that hypothesis (i.e., $p(H_1)$), as per Bayes' rule.

BUT I'm wondering: when computing only a "Bayes Factor", which is the factor by which we update our prior belief about the hypothesis, why do we need a prior to specify $H_1$? In other words, when we do NOT want to compute any posterior probability, why do we talk about a prior when computing a Bayes Factor?

In fact, I would also like to know what exactly the integration does for the following two versions of $H_1$ (one with a prior and the other without a prior) in the R code below:

## With a Cauchy prior: 

H1 = integrate(function(delta) dcauchy(delta, 0, sqrt(2)/2) * dt(2.6, 98, delta*sqrt(20)), -Inf, Inf)[[1]]
# > 0.06127036

## Without any kind of prior:

H11 = integrate(function(delta) dt(2.6, 98, delta*sqrt(20)), -Inf, Inf)[[1]]    
# > 0.2230371


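## H0 (no noncentrality, i.e. delta = 0):
H0 = dt(2.6, 98)

## BF10 with the Cauchy prior:
BF10 = H1 / H0

## BF10 without any kind of prior (named separately so it does not overwrite BF10):
BF10_noprior = H11 / H0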


Answers 1


In general, you use Bayes factors to weigh the evidence for one hypothesis against another. Each hypothesis contributes the probability of the data under that hypothesis, and to get it you need to calculate an integral over the parameters (or a sum, if the prior puts probability on a finite number of points). The tricky part is that the math needs to work in general, and in general you cannot ensure that the integral you need to calculate is finite. If you choose a proper prior, you can always ensure that the integral is finite.

Improper priors are tricky, and they show up repeatedly in Bayesian statistics. For example, if you do not choose proper priors in a Normal random effects model (an ANOVA model), you can get an improper posterior and you might not know it from your Gibbs sampling output. The same sort of thing happens with Bayes factors.

In general, though, I'd be careful with Bayes factors: they have some conceptual challenges similar to those of p-values (the technical term is coherence), and other techniques are recommended. For more on this, I'd recommend the paper by Michael Lavine and Mark Schervish titled "Bayes Factors: What They Are and What They Are Not". Hopefully this helps you understand Bayes factors a bit more.
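To make the integral concrete for the setup in the question, here is a sketch of the quantities involved. It assumes a point null $H_0$ and a single parameter $\delta$ (the name used in the question's code) with prior density $p(\delta | H_1)$ under $H_1$; that notation is my assumption, not something stated in the question:

$$BF_{10} = \frac{p(Data | H_1)}{p(Data | H_0)}, \qquad p(Data | H_1) = \int p(Data | \delta, H_1)\, p(\delta | H_1)\, d\delta$$

$$\frac{p(H_1 | Data)}{p(H_0 | Data)} = BF_{10} \times \frac{p(H_1)}{p(H_0)}$$

Written this way, two different priors appear. The density $p(\delta | H_1)$ is part of the definition of $H_1$ and is needed to compute the Bayes factor at all. The probability $p(H_1)$ is only needed afterwards, to turn the Bayes factor into a posterior probability.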

It is hard to comment on your code because the full context is not in the post. Note, though, that when you say you have no prior, you actually do have one: it does not show up in the integrand because it is constant with respect to the parameter(s) you integrate over. On a non-compact parameter space, such as the whole real line in your t-distribution example, a constant prior is improper (it cannot integrate to 1).
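A minimal sketch of that point, reusing the numbers from the question's code (2.6, 98 and sqrt(20)). The helper names `lik`, `m1`, `m0` and `flat_ratio` are made up for illustration, and I am assuming `delta` is the effect size under $H_1$:

```r
# Likelihood of the observed t statistic as a function of delta
# (numbers copied from the question; their meaning is assumed).
lik <- function(delta) dt(2.6, 98, delta * sqrt(20))

# A proper prior: the Cauchy density integrates to 1 over the real line.
prior_mass <- integrate(function(delta) dcauchy(delta, 0, sqrt(2) / 2),
                        -Inf, Inf)$value

# Marginal likelihood under H1 (Cauchy prior) and likelihood under H0.
m1 <- integrate(function(delta) dcauchy(delta, 0, sqrt(2) / 2) * lik(delta),
                -Inf, Inf)$value
m0 <- dt(2.6, 98)
bf10 <- m1 / m0

# "No prior" is the same integral with a constant weight k = 1 on delta.
# A constant cannot be normalised on the real line, so any other k is
# equally defensible, and the resulting ratio scales directly with k.
flat_ratio <- function(k) {
  integrate(function(delta) k * lik(delta), -Inf, Inf)$value / m0
}

print(c(prior_mass = prior_mass, bf10 = bf10,
        flat_k1 = flat_ratio(1), flat_k2 = flat_ratio(2)))
```

With the Cauchy prior the ratio is pinned down because the prior mass is fixed at 1. With the constant weight, doubling `k` doubles the "Bayes factor", so the number obtained with `k = 1` has no particular meaning as evidence.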

user3164100
June 21, 2017 01:04 AM
