💡
The most intuitive way to describe the data mining problem to a nonstatistician is through what is called the birthday paradox, though it is not really a paradox, simply a perceptional oddity. If you meet someone randomly, there is a one in 365.25 chance of your sharing their birthday, and a considerably smaller one of having the exact birthday of the same year. So, sharing the same birthday would be a coincidental event that you would discuss at the dinner table. Now let us look at a situation where there are 23 people in a room. What is the chance of there being 2 people with the same birthday? About 50%. For we are not specifying which people need to share a birthday; any pair works.