This is another blog post about post-hoc power. It was created by ChatGPT after a discussion with ChatGPT about the topic; you can find the longer discussion at the end of the post.
🔍 Introduction
You finish your study, run the stats, and the p-value is… not significant. What next?
Maybe you ask, “Did I just not have enough power to detect an effect?”
So you calculate post-hoc power — also called observed power — to figure out whether your study was doomed from the start.
But here’s the problem:
Post-hoc power doesn’t tell you what you think it does.
This post walks through why that’s the case — and what to do instead.
⚡ What Is Post-Hoc (Observed) Power?
Post-hoc power is statistical power calculated after your study is complete, using the effect size you just observed.
It answers the question:
“If the true effect size were exactly what I observed, how likely was I to find a significant result?”
It seems intuitive — but it’s built on shaky ground.
🚨 Why Post-Hoc Power Is Misleading
The main issue is circular logic.
Post-hoc power is based on your observed effect size. But in any given study, your observed effect size includes sampling error — sometimes wildly so, especially with small samples.
So if you got a small, non-significant effect, post-hoc power will always be low — but that doesn’t mean your study couldn’t detect a meaningful effect. It just means it didn’t, and now you’re using that fact to “prove” it couldn’t.
👉 In essence, post-hoc power just repackages your p-value. It doesn’t add new information.
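The "repackaging" claim can be made concrete: for a two-sided z-test, observed power is a deterministic, one-to-one function of the p-value alone. A minimal sketch (Python with SciPy; the z-test framing is a simplifying assumption for illustration):

```python
from scipy import stats

def observed_power_from_p(p, alpha=0.05):
    """Observed power of a two-sided z-test, computed from the p-value alone.

    Treats the observed z-statistic (recovered from p) as if it were the
    true noncentrality, which is exactly what post-hoc power does.
    """
    z_obs = stats.norm.ppf(1 - p / 2)          # |z| implied by the p-value
    z_crit = stats.norm.ppf(1 - alpha / 2)     # two-sided critical value
    # P(|Z| > z_crit) when the true mean shift equals z_obs
    return (1 - stats.norm.cdf(z_crit - z_obs)
            + stats.norm.cdf(-z_crit - z_obs))

for p in (0.001, 0.05, 0.20, 0.50):
    print(f"p = {p:.3f}  ->  observed power = {observed_power_from_p(p):.2f}")
```

Note that a p-value exactly at the .05 threshold always maps to an observed power of about 50%, no matter the study: the p-value and observed power carry the same information.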
🤔 But What If I Want to Know About Power?
Here’s where things get interesting.
Power analysis is still important — but it needs to be handled differently. The key distinction is between hypothetical power and observed power:
| Type of power | Based on | When used | Purpose |
|---|---|---|---|
| Hypothetical | Expected effect size (e.g., theoretical or meta-analytic) | Before the study | To design the study |
| Observed | Effect size from the current data | After the study | Often (wrongly) used to explain non-significance |
But you can do something more useful with observed data…
✅ A Better Way: Confidence Intervals for Power
Rather than calculating a single post-hoc power number, calculate a confidence interval for the effect size, and then use that to compute a range of plausible power values.
Example:
Let’s say you observed an effect size of 0.3, with a 95% CI of [0.05, 0.55].
You can compute:

- Power if the true effect is 0.05 (low power)
- Power if the true effect is 0.55 (high power)
Now you can say:
“If the true effect lies within our 95% CI, then the power of our study ranged from 12% to 88%.”
That’s honest. It tells you what your data can say — and what they can’t.
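Here is a sketch of how such a range might be computed for a two-sample t-test (Python with SciPy; the per-group sample size of 50 is a hypothetical value, not from the post, so the resulting numbers differ from the illustrative 12%–88% range above):

```python
import numpy as np
from scipy import stats

def ttest_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample t-test for a true Cohen's d."""
    df = 2 * n_per_group - 2
    nc = d * np.sqrt(n_per_group / 2)          # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)    # two-sided critical value
    # P(reject H0) under the noncentral t distribution
    return (1 - stats.nct.cdf(t_crit, df, nc)
            + stats.nct.cdf(-t_crit, df, nc))

# Power at the CI bounds and the point estimate (n = 50 is hypothetical)
for d in (0.05, 0.30, 0.55):
    print(f"true d = {d:.2f}  ->  power = {ttest_power(d, 50):.2f}")
```

Evaluating power at both ends of the effect-size CI turns a single, circular number into an honest interval.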
🧪 When Are Power Confidence Intervals Informative?
In small studies, the confidence interval for the effect size (and thus the power) will be wide — too wide to draw firm conclusions.
But if you base your effect size estimate on:

- a large study, or
- a meta-analysis,
your confidence interval can be narrow enough that the corresponding power range is actually informative.
✔️ Bottom line: Confidence intervals make power analysis meaningful — but only when your effect size estimate is precise.
💡 Final Thought: Use Power Thoughtfully
If you didn’t find a significant result, it’s tempting to reach for post-hoc power to explain it away.
But instead of asking, “Was my study underpowered?” try asking:

- “What effect sizes are consistent with my data?”
- “How much power would I have had for those?”
- “What sample size would I need to detect effects in that range reliably?”
These are the questions that lead to better science — and more replicable results.
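The sample-size question can be answered prospectively. A normal-approximation sketch for a two-sample t-test (Python with SciPy; the 80% power target and α = .05 are conventional choices, not from the post):

```python
import math
from scipy import stats

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t-test.

    Uses the standard normal-approximation formula
    n = 2 * ((z_{alpha/2} + z_{power}) / d)^2, rounded up.
    """
    z_a = stats.norm.ppf(1 - alpha / 2)   # critical value for two-sided alpha
    z_b = stats.norm.ppf(power)           # quantile for the target power
    return math.ceil(2 * ((z_a + z_b) / d) ** 2)

# Required n per group across a plausible range of true effects
for d in (0.55, 0.30, 0.05):
    print(f"d = {d:.2f}  ->  n per group ~ {n_per_group(d)}")
```

Running this across the CI for the effect size makes the cost of detecting the smallest plausible effect explicit, which is far more useful than any single post-hoc power number.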
🛠️ TL;DR
❌ Post-hoc power (observed power) is often misleading.
🔁 It restates your p-value using your observed effect size.
✅ Better: Use the 95% CI of your effect size to calculate a range of power estimates.
📏 If your effect size estimate is precise (e.g., from a large or meta-analytic study), this range becomes actionable.