As you have just seen, running regression in Excel is easy because all calculations are preformed

automatically. The interpretation of the results is a bit trickier because you need to know what is

behind each number. Below you will find a breakdown of 4 major parts of the regression

analysis output

**Regression analysis output: Summary Output**

This part tells you how well the calculated linear regression equation fits your source data.

Here's what each piece of information means:

**
Multiple R.** It is the Correlation Coefficient that measures the strength of a linear relationship

between two variables. The correlation coefficient can be any value between -1 and 1, and its

absolute value indicates the relationship strength. The larger the absolute value, the stronger the

relationship:

- 1 means a strong positive relationship

-1 means a strong negative relationship

-0 means no relationship at all

**R Square. **It is the Coefficient of Determination, which is used as an indicator of the goodness of

fit. It shows how many points fall on the regression line. The R2

value is calculated from the total

sum of squares, more precisely, it is the sum of the squared deviations of the original data from

the mean.

In our example, R2

is 0.91 (rounded to 2 digits), which is fairy good. It means that 91% of our

values fit the regression analysis model. In other words, 91% of the dependent variables (y-values) are explained by the independent variables (x-values). Generally, R Squared of 95% or

more is considered a good fit.

**Adjusted R Square.** It is the R square adjusted for the number of independent variable in the

model. You will want to use this value instead of R square for multiple regression analysis.

**
Standard Error**. It is another goodness-of-fit measure that shows the precision of your

regression analysis - the smaller the number, the more certain you can be about your regression

equation. While R2

represents the percentage of the dependent variables variance that is

explained by the model, Standard Error is an absolute measure that shows the average distance

that the data points fall from the regression line.

**Observations. **It is simply the number of observations in your model

**Regression analysis output: ANOVA
**

The second part of the output is Analysis of Variance (ANOVA):

Basically, it splits the sum of squares into individual components that give information about the

levels of variability within your regression model:

- df is the number of the degrees of freedom associated with the sources of variance.

- SS is the sum of squares. The smaller the Residual SS compared with the Total SS, the

better your model fits the data.

-MS is the mean square.

- F is the F statistic, or F-test for the null hypothesis. It is used to test the overall

significance of the model.

- Significance F is the P-value of F.

The ANOVA part is rarely used for a simple linear regression analysis in Excel, but you should

definitely have a close look at the last component. The Significance F value gives an idea of

how reliable (statistically significant) your results are. If Significance F is less than 0.05 (5%),

your model is OK. If it is greater than 0.05, you'd probably better choose another independent

variable.

**Regression analysis output: coefficients**

This section provides specific information about the components of your analysis:

The most useful component in this section is **Coefficients**. It enables you to build a linear regression equation in Excel:

y = bx + a

For our data set, where y is the number of umbrellas sold and x is an average monthly rainfall,

our linear regression formula goes as follows:

Y = Rainfall Coefficient * x + Intercept

Equipped with a and b values rounded to three decimal places, it turns into:

Y=0.45*x-19.074

For example, with the average monthly rainfall equal to 82 mm, the umbrella sales would be

approximately 17.8:

0.45*82-19.074=17.8

In a similar manner, you can find out how many umbrellas are going to be sold with any other

monthly rainfall (x variable) you specify.

**Regression analysis output: residuals**

If you compare the estimated and actual number of sold umbrellas corresponding to the monthly

rainfall of 82 mm, you will see that these numbers are slightly different:

- Estimated: 17.8 (calculated above)

- Actual: 15 (row 2 of the source data)

Why's the difference? Because independent variables are never perfect predictors of the

dependent variables. And the residuals can help you understand how far away the actual values

are from the predicted values:

For the first data point (rainfall of 82 mm), the residual is approximately -2.8. So, we add this

number to the predicted value, and get the actual value: 17.8 - 2.8 = 15.

1- Types of data and level of measurement
2- Graphs and Tables that Represent Categorical Variables
3- Excel formulas
4- Graphs and tables that represent numerical variables
5- Graphs and Tables for Relationships Between Variables.
6- Mean, Median, Mode
7- Variance and Standard Deviation
8- Covariance and Correlation
9- Distributions
10- The Central Limit Theorem
11- Estimators and Estimates
12- Confidence Intervals and the Margin of Error
13- Studentâ€™s T Distribution
14- Formulas for Confidence Intervals
15- Scientific method
16- Hypotheses
17- Decisions You Can Take
18- Statistical Errors (Type I Error and Type II Error)
19- P-Value
20- Formulae for Hypothesis Testing
21- Basics
22- Linear regression equation
23- How to do linear regression in Excel with Analysis ToolPak
24- Interpret regression analysis output
25- How to make a linear regression graph in Excel
26- How to do regression in Excel using formulas