Probability versus Likelihood

I sort of understand that you can’t state a probability for whether or not the population mean is in the interval because it either is or it isn’t (or at least I read that, but I don’t think I really understand it). If there is a 90% chance that the interval contains the population mean, I’m not really clear as to why this is not also the probability because doesn’t that mean there is a 10% chance that it does not contain the population mean, giving you a 0.9 probability that it does contain it? 

One of my students

The answer, somewhat unfortunately (and unintuitively, and confusingly), lies in semantics. The population mean is some value; we just don’t know what it is. Therefore, there’s no random chance about it. It straight up is equal to some value. And here’s where more confusion comes into play.

Probability and likelihood are two different concepts. They’re basically opposite directions on the same two-way street. When we know the population parameters and we’re examining outcomes from the population, that is probability. When we know a sample and we’re examining population parameters, that is likelihood. So, you can say something like ‘given that the mean of a distribution is 15, the probability that a random sample of size 12 has a sample mean greater than 18 is 0.342 (made up number)’. And you can say something like ‘given that we observed a sample mean of 42 from 20 observations, the likelihood that the true population mean is greater than 40 is 0.712 (made up number)’.

The difference is super subtle, and I’ll admit it probably doesn’t really matter at the end of the day. The authors of the textbook also pull a tricky one by using the word ‘sure’ and contrasting it with ‘probability’ without making the difference explicit (and it is absolutely not intuitive). But this is why, if we’re going by the book, we can’t say something like ‘there’s a 90% chance the true mean is in our interval’: it’s not even in the realm of probability to discuss the behaviour of population parameters given observed samples. We have to say things like ‘it’s 90% likely that the true mean is in our interval’.
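If it helps to see the two directions of the street side by side, here is a minimal Python sketch. It is only an illustration under assumptions that are not in the example above: I am assuming a normal population with a known standard deviation of 10 (the hypothetical `SIGMA`), and the function names are made up for this post. Because of that, the numbers it prints will not match the made-up 0.342 and 0.712. The first function goes from a known population mean to the probability of a sample outcome. The second holds the observed sample mean fixed and asks how well each candidate population mean explains it.

```python
from statistics import NormalDist

# Assumed population standard deviation (hypothetical, chosen for illustration).
SIGMA = 10.0


def prob_sample_mean_exceeds(threshold, mu, n, sigma=SIGMA):
    """Probability direction: the parameter mu is known, the sample is random."""
    sampling_dist = NormalDist(mu, sigma / n ** 0.5)
    return 1.0 - sampling_dist.cdf(threshold)


def likelihood_of_mean(mu, observed_mean, n, sigma=SIGMA):
    """Likelihood direction: the sample mean is fixed, mu is the thing we vary."""
    sampling_dist = NormalDist(mu, sigma / n ** 0.5)
    return sampling_dist.pdf(observed_mean)


# Given a population mean of 15, how probable is a sample mean above 18 (n = 12)?
p = prob_sample_mean_exceeds(18, mu=15, n=12)
print(f"P(sample mean > 18 | mu = 15) = {p:.3f}")

# Given an observed sample mean of 42 (n = 20), how well does each candidate mu fit?
for candidate_mu in (38, 40, 42, 44):
    value = likelihood_of_mean(candidate_mu, observed_mean=42, n=20)
    print(f"likelihood of mu = {candidate_mu}: {value:.4f}")
```

Under these assumptions the probability comes out at about 0.15. The likelihood values are density heights rather than percentages, so they are best read by comparing them with each other: the candidate mean of 42 fits the observed sample better than 40 or 44 does.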

Again, I admit: does this distinction make a practically significant difference? No, probably not (no pun intended). But if you understand that using population parameters to describe the behaviour of samples and using observed samples to talk about population parameters are two different things, kind of like opposite directions on the same two-way street, then you’re in good shape. It’s just a matter of knowing which side of the street you’re driving on, and what the specific verbiage/jargon is to use on that side of the street.
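For anyone who wants to see where the 90% comes from when the population mean ‘either is or isn’t’ in the interval, here is a small simulation sketch. Everything in it is an assumption made for illustration: a normal population with mean 15 and known standard deviation 4, samples of size 12, and a simple z-interval. The population mean never moves. What changes from sample to sample is the interval, and about 90% of the intervals built this way end up covering the fixed mean.

```python
import random
from statistics import NormalDist, mean


def coverage(true_mean=15.0, sigma=4.0, n=12, level=0.90, trials=10_000, seed=0):
    """Fraction of simulated z-intervals that contain the fixed population mean."""
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5

    hits = 0
    for _ in range(trials):
        # Draw a fresh sample; the population mean stays the same every time.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = mean(sample)
        # Any single interval either contains the true mean or it does not.
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.90. Each individual interval is a plain yes or no; the 90% describes how often the procedure works across many samples.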
