The Central Limit Theorem (CLT) is one of the greatest statistical insights. It states that, no matter the underlying distribution of the data, the sampling distribution of the sample means approximates a normal distribution as the sample size grows. Moreover, the mean of the sampling distribution equals the mean of the original distribution, and its variance is n times smaller, where n is the sample size. The CLT applies whenever we have a sum or an average of many independent variables (e.g. the sum of the numbers rolled when rolling many dice).
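To see this numerically, here is a minimal simulation sketch in Python (the exponential population, the sample size of 30, and the random seed are arbitrary choices for illustration, assuming NumPy is available): we repeatedly draw samples from a clearly non-normal population and check that the mean of the sample means matches the population mean, while their variance is roughly the population variance divided by n.

```python
import numpy as np

rng = np.random.default_rng(42)

# A clearly non-normal (right-skewed) population
population = rng.exponential(scale=2.0, size=100_000)

n = 30                # size of each sample
num_samples = 10_000  # how many sample means to collect

# Draw many samples of size n and record each sample's mean
sample_means = rng.choice(population, size=(num_samples, n)).mean(axis=1)

print("population mean:         ", population.mean())
print("mean of sample means:    ", sample_means.mean())    # ~ population mean
print("population variance / n: ", population.var() / n)
print("variance of sample means:", sample_means.var())     # ~ population variance / n
```

A histogram of `sample_means` would also look approximately bell-shaped, even though the population itself is strongly skewed.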
Why is it useful?
The CLT allows us to assume normality for many different variables. That is very useful for confidence intervals, hypothesis testing, and regression analysis. In fact, the Normal distribution is observed so often around us precisely because, following the CLT, many variables converge to it.
Where can we see it?
Since many quantities we measure are a sum or an average of different effects, the CLT applies and we observe normality all the time. For example, in regression analysis the dependent variable is modeled as an explained part plus an error term, and that error term, being the sum of many small unobserved effects, is commonly assumed to be normally distributed.
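To illustrate that last point, here is a hedged sketch (the number and distribution of the unobserved effects, and the use of SciPy's normality test, are assumptions for illustration, not part of the original text): an "error term" built as the sum of many small, independent, non-normal shocks ends up looking approximately normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

num_observations = 10_000
num_effects = 50  # small, unobserved effects per observation (arbitrary choice)

# Each observation's error is the sum of many independent, non-normal shocks
# (uniform on [-0.5, 0.5] here; the choice of distribution is only illustrative).
shocks = rng.uniform(-0.5, 0.5, size=(num_observations, num_effects))
errors = shocks.sum(axis=1)

# Summary statistics: for a normal distribution, skewness and excess kurtosis are ~0
print(f"skewness:        {stats.skew(errors):.3f}")
print(f"excess kurtosis: {stats.kurtosis(errors):.3f}")

# D'Agostino-Pearson test of normality on the summed errors
stat, p_value = stats.normaltest(errors)
print(f"normality test p-value: {p_value:.3f}")
```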