Graduate (S) Business Administration 502

STATISTICS FOR MANAGERS

Spring 2017
 
B.  Regression

1. Multiple regression model

2. Methodology

  • Least squares method - choose the coefficients that minimize the sum of squared differences between the actual and estimated values of Y
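
To make the least squares idea concrete, here is a minimal sketch that solves the normal equations in plain Python. The function names and the data are made up for illustration; statistical software normally does this step for you.

```python
def fit_least_squares(X, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    # Design matrix with a leading column of 1s for the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    p = len(A[0])
    # Normal equations (A'A) b = A'y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(A, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def sum_squared_errors(X, y, b):
    """Sum of squared differences between actual and estimated Y."""
    fitted = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Made-up data generated exactly from Y = 1 + 2*X1 + 3*X2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b = fit_least_squares(X, y)
print([round(v, 6) for v in b])
print(sum_squared_errors(X, y, b))
```

Because the data were generated without noise, the fitted coefficients should reproduce 1, 2, and 3 (up to rounding) and the sum of squared errors should be essentially zero.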

3. Interpretation

Ex. - 

4. Dummy variables

  • Take on a value of 1 (if the criterion is met) or 0 (otherwise)
  • Allow categorical data to be represented
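
A short sketch of how a categorical column can be turned into 0/1 dummy variables. The `make_dummies` helper and the region data are hypothetical. One category is left out as the baseline, so a variable with c categories yields c - 1 dummies when the model has an intercept.

```python
def make_dummies(values, baseline):
    """Encode a categorical column as 0/1 dummy columns, omitting the baseline."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


region = ["East", "West", "South", "West", "East"]  # hypothetical data
dummies = make_dummies(region, baseline="East")
print(dummies)
# {'South': [0, 0, 1, 0, 0], 'West': [0, 1, 0, 1, 0]}
```

Each dummy coefficient is then read relative to the omitted baseline category ("East" here).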

Ex. - 

a. Interpretation

b. Multiple options

5.  Prediction

6. Measures of variation

a. Sum of squares

(1) Total sum of squares (SST)

(2) Regression sum of squares (SSR)

(3) Error sum of squares (SSE)
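
The three sums of squares can be checked numerically. The sketch below uses a one-variable regression on made-up data to keep the arithmetic short. SST measures variation of actual Y around its mean, SSR measures variation of estimated Y around the mean, and SSE measures variation of actual Y around estimated Y, so SST = SSR + SSE.

```python
x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept for a single independent variable.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total
ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # regression (explained)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error (unexplained)

print(round(sst, 6), round(ssr, 6), round(sse, 6))
```

For these numbers SST is 6, SSR is 3.6, and SSE is 2.4, which confirms the decomposition.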

7. Evaluating the overall model

a. Coefficient of multiple determination

  • Proportion of the variation in variable Y explained by the multiple regression model involving the independent variables X1, X2, ..., Xk
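
In terms of the sums of squares, this proportion is usually written as follows (standard notation, with SST, SSR, and SSE as defined under Measures of variation):

```latex
\[
r^2 \;=\; \frac{SSR}{SST} \;=\; 1 - \frac{SSE}{SST}, \qquad 0 \le r^2 \le 1
\]
```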

b. Adjusted r-square
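
A minimal sketch of the usual adjusted r-square formula, which penalizes r-square for the number of independent variables. The function name is illustrative; n is the sample size and k is the number of independent variables.

```python
def adjusted_r_squared(r2, n, k):
    """Adjust r-square for sample size n and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Example: r-square of 0.6 from 5 observations and 1 independent variable.
print(round(adjusted_r_squared(0.6, n=5, k=1), 4))
```

Adding a variable always raises (or leaves unchanged) r-square, but it raises adjusted r-square only if the improvement in fit outweighs the lost degree of freedom.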

c. F-test
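
The overall F-test is commonly set up as follows (standard textbook form, with n observations and k independent variables):

```latex
\[
H_0:\ \beta_1 = \beta_2 = \cdots = \beta_k = 0
\qquad
H_1:\ \text{at least one } \beta_j \neq 0
\]
\[
F \;=\; \frac{MSR}{MSE} \;=\; \frac{SSR / k}{SSE / (n - k - 1)}
\]
```

Reject the null hypothesis at significance level α if F exceeds the upper-tail critical value of the F distribution with k and n - k - 1 degrees of freedom (equivalently, if the p-value is below α).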

8. Inferences about coefficients

a. Hypothesis tests
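
The test for an individual coefficient usually takes this form, where b_j is the estimated coefficient and S_{b_j} is its standard error (both reported in regression output):

```latex
\[
H_0:\ \beta_j = 0
\qquad
H_1:\ \beta_j \neq 0
\qquad
t \;=\; \frac{b_j - 0}{S_{b_j}}, \quad \text{df} = n - k - 1
\]
```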

b. Confidence intervals
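
A confidence interval for a population slope is usually built from the same ingredients, where t_{α/2} is the critical value of the t distribution with n - k - 1 degrees of freedom:

```latex
\[
b_j \;\pm\; t_{\alpha/2,\; n-k-1}\, S_{b_j}
\]
```

If the interval excludes 0, the corresponding two-tailed test rejects the null hypothesis that the coefficient is 0 at level α.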

9. Assumptions and problems

  • Use residual analysis to evaluate violation of assumptions

a. Normality of errors

b. Homoscedasticity

  • Error terms have a constant variance for all values of X

  • Heteroscedasticity - violation of assumption of constant variance of error terms

(1)  Problems

  • Least-squares estimator remains unbiased but is inefficient => standard errors are larger than with an efficient estimator => less likely to reject the null hypothesis that an independent variable's coefficient = 0

(2)  Evaluation

(3)  Dealing with heteroscedasticity

  • Use weighted least-squares
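
A minimal sketch of weighted least squares for one independent variable. Each observation gets a weight, and observations with noisier errors get smaller weights. The weights 1/x² used here assume the error variance is proportional to x²; that is an assumption made purely for illustration, and the data are made up.

```python
def weighted_least_squares(x, y, w):
    """Minimize sum of w_i * (y_i - b0 - b1*x_i)^2; return (b0, b1)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]
weights = [1 / xi ** 2 for xi in x]  # assumes error variance grows with x squared

print(weighted_least_squares(x, y, weights))
print(weighted_least_squares(x, y, [1] * len(x)))  # equal weights = ordinary least squares
```

With equal weights the function reduces to ordinary least squares, which is a quick way to sanity-check it.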

c. Independence of errors

  • Error terms are unrelated to one another (the error for one observation does not depend on the error for another)

  • Autocorrelation - violation of assumption of independence of error terms

(1)  Problems

  • With positive autocorrelation, the true variance of an estimated coefficient is higher than the estimated variance => more likely to reject the null hypothesis than is warranted

  • Model appears better than it actually is (r-square is affected too)

(2)  Evaluation

  • Use Durbin-Watson statistic
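
The Durbin-Watson statistic compares successive residuals: D = Σ(e_t - e_{t-1})² / Σ e_t². A small sketch with made-up residuals listed in time order:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


print(durbin_watson([2, 1, -1, -2]))   # 0.6 (neighboring residuals are similar)
print(durbin_watson([1, -1, 1, -1]))   # 3.0 (neighboring residuals alternate in sign)
```

D ranges from 0 to 4. Values near 2 suggest no autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The formal decision uses lower and upper critical values from a Durbin-Watson table.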

(3)  Dealing with autocorrelation

  • Use generalized least-squares (e.g., the two-step Cochrane-Orcutt procedure)

d. Multicollinearity

  • Explanatory variables are highly correlated with each other

Ex. - 

(1)  Problems

  • Difficult to interpret coefficients of independent variables

(2)  Evaluation

  • Use variance inflationary factor (VIF)
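
The VIF for variable j is 1 / (1 - R_j²), where R_j² comes from regressing X_j on all the other explanatory variables. With only two explanatory variables, R_j² is just their squared correlation, which keeps this sketch short. The helper names and data are made up.

```python
def correlation(a, b):
    """Pearson correlation coefficient between two lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two explanatory variables."""
    r_squared = correlation(x1, x2) ** 2
    return 1 / (1 - r_squared)


x1 = [1, 2, 3, 4, 5]  # made-up data
x2 = [2, 1, 4, 3, 5]
print(round(vif_two_predictors(x1, x2), 3))
```

A VIF of 1 means the variable is uncorrelated with the others. Rules of thumb such as "VIF above 5" or "VIF above 10" are often used as warning signs of multicollinearity.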

(3)  Dealing with multicollinearity

  • Could drop some of the variables that are correlated

  • Problematic if theory says variables should be in model

10.  Logistic regression

  • Used when the dependent variable is categorical (typically binary, coded 0 or 1)

a.  Methodology

  • Use maximum likelihood estimation
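
A rough sketch of maximum likelihood estimation for a one-variable logistic model, ln(odds) = b0 + b1*X. It climbs the log-likelihood by plain gradient ascent, which is chosen here only because it is easy to read; statistical packages typically use faster iterative methods. All names, the step size, and the data are made up.

```python
import math


def predict_probability(b0, b1, x):
    """Estimated probability that Y = 1 at a given x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))


def log_likelihood(b0, b1, x, y):
    """Log-likelihood of the 0/1 outcomes under the logistic model."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = predict_probability(b0, b1, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total


def fit_logistic(x, y, steps=20000, rate=0.1):
    """Gradient ascent on the average log-likelihood; returns (b0, b1)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        p = [predict_probability(b0, b1, xi) for xi in x]
        g0 = sum(yi - pi for yi, pi in zip(y, p))                      # d/d b0
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))        # d/d b1
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1


x = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up data
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(x, y)
print(b0, b1)
print(predict_probability(b0, b1, 8))
```

The estimated odds at a given x are exp(b0 + b1*x), and the estimated probability is odds / (1 + odds), which is what `predict_probability` computes.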

b.  Interpretation

11. Ethical considerations

Problems if:

a. Variables intentionally omitted

b. Observations deleted without explanation

c. Assumptions not evaluated