Linear Regression Models (II). Week 8

Slide 2

Lecture outline
Assumptions of Linear Regression
R Squared and Adjusted R Squared
F-test for model significance
t-test for parameter significance
Slide 3

Assumptions of Linear Regression

Normality: Multiple regression assumes that the error terms are normally distributed.
Linearity: There must be a linear relationship between the response variable and the independent variables (Scatterplots).
No Multicollinearity: The independent variables are not highly correlated with each other (Correlation matrix).
Homoscedasticity: The variance of the error terms is similar across the values of the independent variables (Plot of residuals vs predictor variables).

Slide 4

Normality

Normality: Multiple regression assumes that the error terms are normally distributed.

QQ (quantile-quantile) plots are used to visually check the normality of the residuals.

R Syntax:
qqnorm(model$residuals)
qqline(model$residuals)

As all the points fall approximately along the straight line, we can assume normality.
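The QQ check can be tried end to end with a small sketch. It assumes the built-in `mtcars` data and the formula `mpg ~ wt + hp` as stand-ins, since the lecture's own model formula is not shown. The Shapiro-Wilk test is a supplementary numeric check that is not part of the slides.

```r
# Assumption: mtcars and this formula are illustrative stand-ins
# for the lecture's model, whose formula is not given.
model <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(model)

# QQ plot: sample quantiles of the residuals vs theoretical normal quantiles
qqnorm(res)
qqline(res)  # reference line through the first and third quartiles

# Supplementary check (not in the slides): Shapiro-Wilk test,
# H0 = the residuals come from a normal distribution
sw <- shapiro.test(res)
print(sw$p.value)
```

A small p-value would be evidence against normality. `plot(model, which = 2)` draws a comparable QQ plot of standardized residuals.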

Slide 5

Linearity

Linearity: There must be a linear relationship between the response variable and the independent variables.
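A scatterplot matrix is one way to eyeball linearity. This sketch assumes the built-in `mtcars` data, with `mpg` as the response and `wt` and `hp` as predictors; these are illustrative choices, not the lecture's dataset.

```r
# Assumption: mtcars columns stand in for the lecture's response and predictors.
vars <- mtcars[, c("mpg", "wt", "hp")]

# Scatterplot matrix with a smooth curve in each panel;
# a roughly straight curve in the mpg row suggests a linear relationship.
pairs(vars, panel = panel.smooth)
```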
Slide 6

No Multicollinearity

No Multicollinearity: The independent variables are not highly correlated with each other.
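The correlation matrix mentioned among the assumptions can be computed with `cor()`. The sketch below assumes three `mtcars` columns as hypothetical predictors.

```r
# Assumption: these mtcars columns are illustrative predictors only.
predictors <- mtcars[, c("wt", "hp", "disp")]

# Pairwise Pearson correlations between the predictors
cor_matrix <- cor(predictors)
print(round(cor_matrix, 2))

# Largest absolute correlation between two different predictors
max_offdiag <- max(abs(cor_matrix[upper.tri(cor_matrix)]))
print(max_offdiag)
```

Off-diagonal values close to 1 or -1 point to predictors that carry nearly the same information. Where to draw the line is a judgment call.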
Slide 7

Homoscedasticity

Homoscedasticity: The variance of the error terms is similar across the values of the independent variables (Plot of residuals vs predictor variables).

par(mfrow = c(1, 2))  # two plots side by side
plot(Carseats$Income, model$residuals)
plot(Carseats$Advertising, model$residuals)
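A common companion to the per-predictor plots is a single plot of residuals against fitted values. This sketch assumes `mtcars` and the formula `mpg ~ wt + hp` as stand-ins for the Carseats model, whose formula is not shown.

```r
# Assumption: illustrative model on built-in data, not the lecture's Carseats model.
model <- lm(mpg ~ wt + hp, data = mtcars)
fit <- fitted(model)
res <- residuals(model)

# Residuals vs fitted values: under homoscedasticity the vertical spread
# stays roughly constant from left to right.
plot(fit, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)  # reference line at zero
```

A funnel shape (spread growing or shrinking with the fitted values) would suggest that the assumption is violated.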

Slide 8

R-Squared

R-squared (R²), also known as the coefficient of determination, is a statistical measure that represents the proportion of the variance of the dependent variable that is explained by the independent variable or variables in a regression model.
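The definition can be checked numerically using the standard identity R² = 1 - SSE/SST, where SSE is the residual sum of squares and SST is the total sum of squares. The sketch assumes `mtcars` and an illustrative formula. The names `sse` and `sst` are introduced here and do not come from the slides.

```r
# Assumption: illustrative model on built-in data.
model <- lm(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg

sse <- sum(residuals(model)^2)  # unexplained variation
sst <- sum((y - mean(y))^2)     # total variation around the mean
r2 <- 1 - sse / sst             # proportion of variation explained

# Compare with the value reported by summary()
print(c(manual = r2, lm = summary(model)$r.squared))
```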
Slide 9

Slide 10

Slide 11

Adjusted R-Squared
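The slide's formula did not survive extraction. The usual definition, assumed here, is adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of independent variables. A minimal sketch on `mtcars` (an illustrative stand-in) compares it with the value R reports.

```r
# Assumption: illustrative model on built-in data; k = 2 predictors (wt, hp).
model <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
k <- 2

r2 <- summary(model)$r.squared
# Penalize R-squared for the number of predictors in the model
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(c(manual = adj_r2, lm = summary(model)$adj.r.squared))
```

Unlike R², the adjusted value can fall when a predictor that adds little explanatory power is included.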

 

Slide 12

Testing for Significance: F-test

The F test is referred to as the test for overall significance.

The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables.

Slide 13

Testing for Significance: F-test

Hypotheses
H0: β1 = β2 = . . . = βk = 0
Ha: At least one of the parameters (betas) is not equal to zero.

Test Statistic
F = MSR/MSE

Rejection Rule
Reject H0 if p-value < α or if F > Fα,
where Fα is based on an F distribution
with k d.f. in the numerator and
n - k - 1 d.f. in the denominator.
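The test statistic and rejection rule can be reproduced by hand. This sketch assumes `mtcars`, the formula `mpg ~ wt + hp` (so k = 2) and α = 0.05; all three are illustrative choices.

```r
# Assumption: illustrative model on built-in data; alpha = 0.05.
model <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
k <- 2
alpha <- 0.05
y <- mtcars$mpg

ssr <- sum((fitted(model) - mean(y))^2)  # sum of squares due to regression
sse <- sum(residuals(model)^2)           # sum of squares due to error
msr <- ssr / k                           # mean square regression
mse <- sse / (n - k - 1)                 # mean square error
f_stat <- msr / mse

# p-value and critical value from the F distribution with (k, n - k - 1) d.f.
p_value <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)
f_crit <- qf(1 - alpha, df1 = k, df2 = n - k - 1)
reject <- f_stat > f_crit

print(c(F = f_stat, F_crit = f_crit, p_value = p_value))
```

The two forms of the rejection rule always agree: F > Fα exactly when the p-value is below α.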

Slide 14

Example

 

Slide 15

Testing for Significance: t-test

Hypotheses
H0: βi = 0
Ha: βi ≠ 0

Test Statistic
t = bi / s(bi), where bi is the estimate of βi and s(bi) is its standard error

Rejection Rule
Reject H0 if p-value < α or
if |t| > tα/2, where tα/2
is based on a t distribution
with n - k - 1 degrees of freedom.
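A minimal sketch of the t-test for each coefficient, assuming the usual statistic (the estimate divided by its standard error), the built-in `mtcars` data with formula `mpg ~ wt + hp` (so k = 2), and α = 0.05. None of these choices come from the slides.

```r
# Assumption: illustrative model on built-in data; alpha = 0.05.
model <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
k <- 2
alpha <- 0.05

# Coefficient table; columns: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- summary(model)$coefficients

t_stat <- coefs[, "Estimate"] / coefs[, "Std. Error"]
t_crit <- qt(1 - alpha / 2, df = n - k - 1)  # two-sided critical value
p_value <- 2 * pt(abs(t_stat), df = n - k - 1, lower.tail = FALSE)
reject <- abs(t_stat) > t_crit

print(data.frame(t = t_stat, p_value = p_value, reject = reject))
```

The hand-computed values should match the `t value` and `Pr(>|t|)` columns printed by `summary(model)`.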

Slide 16

Example

 

Slide 17

Exercise 1

38 random movies were selected to develop a model for predicting their revenues.
We have the following variables in the dataset:
USRevenue – movie’s revenue in the US (mln$)
Rating – restrictions based on age (PG, PG-13, R)
Budget – budget (expenditure) of the movie (mln$)
Opening – revenue on the opening weekend (mln$)
Theaters – number of theaters the movie was in for the opening weekend
Opinion – IMDb rating (1 to 10, 10 being the best)
Slide 18

Exercise 1

 

Slide 19

THE END
Thank you for your attention!