You just completed a research experiment, did the data analysis, and found a large effect size that is significant at the .05 level. “This is great!” you say to yourself as you start to prepare a submission of your findings for Science. Is it possible, however, that the true effect size (that is, the effect size in the population) is different from the effect size you calculated for your sample? “Sure, but it couldn’t be too different,” you may say. Think again. The plot below shows that a sample effect size may deviate substantially from the population effect size. In fact, you may find a very large effect in your data when there is actually no effect in the population.
The plot was made by simulating 80,000 datasets containing two groups that had either 10 or 30 observations each and had a population standardized mean difference (Cohen’s d) of either 0, 0.20, 0.50, or 0.80. For each dataset simulated, an effect size was calculated (sample effect size), and these effect sizes were plotted against the population effect sizes. Each point in the plot represents the effect size calculated for a single dataset.
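To make the simulation design concrete, here is a minimal Python sketch of the same idea. It is not the author's R code. It assumes normally distributed groups with unit variance, a pooled-standard-deviation Cohen's d, and 1,000 replications per condition (chosen here only to keep the run short); the function names `cohens_d` and `simulate_sample_ds` are made up for this illustration.

```python
import random
import statistics


def cohens_d(x, y):
    """Sample Cohen's d for two groups, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / pooled_var ** 0.5


def simulate_sample_ds(pop_d, n_per_group, reps, rng):
    """Draw `reps` two-group datasets and return each dataset's sample d.

    Assumption: both groups are normal with SD 1, so shifting the first
    group's mean by `pop_d` gives a population Cohen's d of `pop_d`.
    """
    ds = []
    for _ in range(reps):
        treatment = [rng.gauss(pop_d, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        ds.append(cohens_d(treatment, control))
    return ds


rng = random.Random(42)  # fixed seed so the run is reproducible
results = {}
for n in (10, 30):
    for pop_d in (0.0, 0.2, 0.5, 0.8):
        results[(n, pop_d)] = simulate_sample_ds(pop_d, n, reps=1000, rng=rng)

# Summarize how far the sample effect sizes spread around each population value.
for (n, pop_d), ds in results.items():
    print(
        f"n per group={n:2d}  population d={pop_d:.1f}  "
        f"mean sample d={statistics.fmean(ds):.2f}  "
        f"SD={statistics.stdev(ds):.2f}  "
        f"range=[{min(ds):.2f}, {max(ds):.2f}]"
    )
```

Each entry in `results` corresponds to one column of points in a plot like the one described: the sample effect sizes for a given group size and population effect size.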
This plot demonstrates two important points. First, it shows that a sample effect size can deviate substantially from the true effect size in the population. This is why you should always provide confidence intervals when presenting any findings: confidence intervals tell you which values of the population effect size are plausible given your data. Second, it shows that a sample effect size tends to deviate less from the population effect size as the sample size increases. This is why it’s almost always a good idea to use large sample sizes: the larger the sample size, the more precise your estimate of the population effect size is.
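As a rough illustration of both points, the Python sketch below computes an approximate 95% confidence interval for Cohen's d. It relies on a common large-sample approximation to the standard error of d and a normal critical value of 1.96; this is an assumption made for illustration, not a method taken from the original analysis, and exact intervals are usually based on the noncentral t distribution. The function name `approx_d_confidence_interval` is hypothetical.

```python
import math


def approx_d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate confidence interval for Cohen's d.

    Uses the large-sample standard error
        sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    with a normal critical value `z` (1.96 for roughly 95% coverage).
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se


# Same sample effect size, two different group sizes.
for n in (10, 30):
    lower, upper = approx_d_confidence_interval(0.8, n, n)
    print(f"n per group={n:2d}  d=0.80  approx. 95% CI=[{lower:.2f}, {upper:.2f}]")
```

Under this approximation, a sample d of 0.80 with 10 observations per group gives an interval that includes zero, while the same d with 30 observations per group gives a narrower interval that excludes it.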
For those interested, here’s the R code that was used to both simulate the data and create the plot: effect-size-r-code.txt. Be warned, though, that running the code may take some time!