

Data dredging (also called data fishing, data snooping, equation fitting, or p-hacking) is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing and understating the risk of false positives. This is done by performing many statistical tests on the data and reporting only those that come back with significant results.

Conventional tests of statistical significance are based on the probability that a particular result would arise if chance alone were at work, and they necessarily accept some risk of mistaken conclusions; this level of risk is called the significance. When large numbers of tests are performed, some produce false results of this kind: by chance alone, about 5% of randomly chosen true null hypotheses will appear significant at the 5% significance level, about 1% at the 1% level, and so on.
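A short simulation makes this concrete. The sketch below (a minimal illustration, not part of the original text; the variable names and the use of NumPy and SciPy are assumptions) draws purely random data, runs many independent t-tests, and counts how many appear significant at the 5% level even though every null hypothesis is true.

```python
# Minimal sketch (assumed setup): many t-tests on pure noise,
# counting how often p < 0.05 even though no real effect exists.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n_tests = 1000        # number of independent hypotheses "dredged"
sample_size = 30      # observations per group
alpha = 0.05          # conventional significance level

false_positives = 0
for _ in range(n_tests):
    # Both groups are drawn from the same distribution,
    # so every null hypothesis here is true.
    a = rng.normal(size=sample_size)
    b = rng.normal(size=sample_size)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < alpha:
        false_positives += 1

# Roughly alpha * n_tests (about 50) tests come back "significant"
# by chance alone.
print(f"{false_positives} of {n_tests} tests significant at p < {alpha}")
```

Reporting only the roughly 50 "significant" results while discarding the other 950 or so is exactly the selective-reporting pattern described above.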
