MKTG90011 · Marketing Research
Regression Analysis
Regression does three jobs: it tests whether each independent variable has a significant relationship with a metric dependent variable, it shows the relative strength of each effect, and it makes predictions. The unit teaches ordinary least-squares (OLS) linear regression, which fits the line that minimises the sum of squared residuals. Bivariate regression has one predictor (Y = b₀ + b₁X + e); multiple regression has several (Y = b₀ + b₁X₁ + b₂X₂ + … + e). The intercept b₀ is the constant, the predicted value of Y when every X is zero; each slope b is the change in Y for a one-unit change in that X, holding the others fixed; e is the residual, the gap between the actual and predicted values of Y.
Reading SPSS regression output always follows the same three tables in order: Model Summary (R², how much variance is explained), ANOVA (is the overall model significant?), then Coefficients (which predictors matter, with their Beta and Sig.). Categorical predictors enter as dummy variables, and multicollinearity (predictors too highly correlated with each other) is checked with the VIF. In the project this is the mandated H5: multiple regression.
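As a rough illustration of what "the line that minimises the sum of squared residuals" means, here is a minimal Python sketch of the bivariate case. The data are invented toy values, not from the unit, and the function name fit_bivariate_ols is hypothetical; SPSS does the equivalent calculation for you.

```python
# Toy data, invented for illustration only (not from the unit's dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor X
y = [2.0, 4.0, 5.0, 4.0, 5.0]   # metric dependent variable Y


def fit_bivariate_ols(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope: change in Y per one-unit change in X
    b0 = mean_y - b1 * mean_x    # intercept: predicted Y when X = 0
    return b0, b1


def sum_squared_residuals(b0, b1, x, y):
    """Sum of squared gaps between actual Y and the line's prediction."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = fit_bivariate_ols(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum_squared_residuals(b0, b1, x, y)

# Any other line does worse: nudging the slope raises the squared-residual sum.
sse_nudged = sum_squared_residuals(b0, b1 + 0.1, x, y)
```

For this toy data the fitted line is approximately Y = 2.2 + 0.6X, the residuals sum to zero, and nudging the slope away from 0.6 increases the sum of squared residuals, which is the sense in which the OLS line is "best".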
What this chapter covers
- 01 10.1 What regression does: explain, rank, predict
- 02 The OLS line and the regression equation
- 03 Bivariate vs multiple regression
- 04 Reading the output: Model Summary → ANOVA → Coefficients
- 05 R², standardised Beta and Sig.
- 06 Dummy variables for categorical predictors
- 07 Assumptions and multicollinearity (the VIF)
Worked example: read a multiple-regression output
- +1 (a) Overall model. Read the ANOVA table: Sig. = .000 (report it as p < .001) is below .05, so the model as a whole significantly predicts repurchase intention.
- +1 (b) Which predictors. Read Coefficients: satisfaction is significant (Sig. = .000) with the larger Beta (.52); price fairness is not significant (Sig. = .18 > .05), so it adds little.
- +1 Rank the effects. Compare standardised Beta: satisfaction (.52) is the dominant driver; price fairness (.11, non-significant) can be dropped.
- +1 (c) Interpret R². R² = 0.41 means the model explains 41% of the variance in repurchase intention; the remaining 59% reflects other factors and noise.
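The reading routine in this worked example can be written down as a few mechanical checks. The sketch below uses only the figures quoted in the example; the variable names and the dictionary layout are assumptions made for illustration, not an SPSS export format.

```python
ALPHA = 0.05  # conventional significance cutoff used throughout the unit

# Figures quoted in the worked example. SPSS prints Sig. = .000 when p rounds
# to zero at three decimals, so 0.000 here means "very small", not exactly 0.
model_sig = 0.000          # ANOVA table
r_squared = 0.41           # Model Summary table
coefficients = {           # Coefficients table (standardised Beta and Sig.)
    "satisfaction": {"beta": 0.52, "sig": 0.000},
    "price fairness": {"beta": 0.11, "sig": 0.18},
}

# (a) Is the overall model significant?
model_significant = model_sig < ALPHA

# (b) Which individual predictors are significant?
significant = [name for name, c in coefficients.items() if c["sig"] < ALPHA]

# Rank predictors by the size of their standardised Beta (sign ignored).
ranked = sorted(
    coefficients,
    key=lambda name: abs(coefficients[name]["beta"]),
    reverse=True,
)

# (c) R-squared read as a percentage of variance explained.
variance_explained_pct = round(r_squared * 100)
```

Here model_significant is True, only satisfaction passes the Sig. check, satisfaction ranks first on Beta, and variance_explained_pct is 41, matching the written interpretation.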
Key terms
- OLS regression: Ordinary least-squares regression fits the line that minimises the sum of squared residuals (the vertical gaps between actual and predicted values) to predict a metric dependent variable from one or more predictors.
- Regression coefficient (b / Beta): Each slope b is the change in Y for a one-unit change in that predictor, holding the others fixed. The standardised Beta is expressed in standard-deviation units, a common scale, so it lets you compare the relative strength of predictors.
- R-squared (R²): The proportion of variance in the dependent variable explained by the model, from 0 to 1. R² = 0.41 means the predictors jointly explain 41% of the variation; adjusted R² penalises adding useless predictors.
- Dummy variable: A 0/1 variable that lets a categorical predictor enter a regression. A k-category variable needs k−1 dummies; each dummy's coefficient compares its category to the omitted reference category.
- Multicollinearity / VIF: When predictors are too correlated with each other, their individual coefficients become unstable and hard to interpret. The variance inflation factor (VIF) flags it; a high VIF signals a problem (common rules of thumb put the cutoff at 5 or at 10).
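To make "adjusted R² penalises adding useless predictors" concrete, here is a small sketch using the standard formula adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of cases and k the number of predictors. R² = 0.41 and k = 2 come from the worked example; the sample size of 100 is an assumption made purely for illustration, since the example does not state it.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n cases and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


r_squared = 0.41   # from the worked example
n_cases = 100      # ASSUMPTION: hypothetical sample size, not given in the example

# Two predictors, as in the worked example.
adj_two = adjusted_r_squared(r_squared, n_cases, 2)

# Same R-squared after adding a third predictor that explains nothing extra:
# the adjusted value drops, which is the "penalty".
adj_three = adjusted_r_squared(r_squared, n_cases, 3)
```

Under these assumptions the adjusted value is a little below 0.41 with two predictors and lower again with three, so a predictor that does not raise R² makes the adjusted figure worse.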
Regression Analysis FAQ
When do I use regression rather than correlation?
Use correlation to describe the strength and direction of association between two metric variables; use regression when you want to predict a metric outcome, quantify each predictor's effect, or include several predictors at once. Regression also lets you control for other variables, which a simple correlation cannot.
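A small sketch may help separate the two ideas. It uses invented toy data and a hypothetical helper, centred_sums. The correlation r is unit-free and the same whichever variable comes first, while the regression slope is expressed in units of the outcome per unit of the predictor and changes when the roles are swapped.

```python
import math

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def centred_sums(a, b):
    """Return (Saa, Sbb, Sab): sums of squares and cross-products about the means."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return saa, sbb, sab


sxx, syy, sxy = centred_sums(x, y)

# Pearson correlation: symmetric, always between -1 and +1.
r = sxy / math.sqrt(sxx * syy)

# Regression slopes: depend on which variable is treated as the outcome.
slope_y_on_x = sxy / sxx   # predicts Y from X
slope_x_on_y = sxy / syy   # predicts X from Y
```

In the bivariate case the two are linked: the slope equals r multiplied by the ratio of the standard deviations (s_Y / s_X), so the standardised Beta of a single predictor is simply r.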
In what order do I read the SPSS regression output?
Always Model Summary → ANOVA → Coefficients. The Model Summary gives R² (how much variance is explained); the ANOVA tells you whether the model is significant overall; the Coefficients table tells you which individual predictors matter (Sig. < .05) and their relative strength (standardised Beta).
How do I put a categorical variable into a regression?
Convert it into dummy variables — 0/1 indicators. A categorical variable with k categories needs k−1 dummies, with one category left out as the reference; each dummy's coefficient then compares its category against that reference. You cannot enter a raw nominal code as if it were a number.
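The k−1 rule can be sketched in a few lines of Python. The function name dummy_code and the three-category "segment" variable are hypothetical, invented for illustration; in SPSS you would create the same 0/1 columns by recoding.

```python
def dummy_code(values, reference):
    """Return {category: [0/1, ...]} with one dummy per non-reference category."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference must be one of the observed categories")
    return {
        cat: [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference   # the reference category gets no column
    }


# Hypothetical three-category predictor (category names invented for illustration).
segment = ["student", "professional", "retiree", "student", "retiree"]

# k = 3 categories, so k - 1 = 2 dummies; "student" is the reference.
dummies = dummy_code(segment, reference="student")

# Cases in the reference category score 0 on every dummy.
student_rows = [i for i, v in enumerate(segment) if v == "student"]
```

With "student" as the reference, the regression would report one coefficient for "professional" and one for "retiree", each read as that group's difference from students.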
What is multicollinearity and why does it matter?
It occurs when predictors are highly correlated with each other, which inflates the standard errors of their coefficients and makes individual effects unreliable and hard to interpret, even when the overall model fits well. Check the VIF; if it is high, drop one of the offending predictors or combine them into a single composite (principal components analysis, PCA, can help with the latter).
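For intuition about what the VIF measures, here is a minimal sketch for the simplest case of exactly two predictors, where VIF = 1 / (1 − r²) and r is the correlation between them. With more predictors, r² is replaced by the R² from regressing one predictor on all the others. The data and function names are invented for illustration.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Toy predictors, invented for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]   # correlated with x1 (r = 0.8)
x3 = [1.0, 2.0, 3.0, 2.0, 1.0]   # uncorrelated with x1 (r = 0)

vif_correlated = vif_two_predictors(x1, x2)
vif_uncorrelated = vif_two_predictors(x1, x3)
```

Uncorrelated predictors give a VIF of 1 (no inflation); as the correlation approaches 1, the denominator shrinks toward zero and the VIF grows without limit, which is why a large VIF signals trouble.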
Exam move
Practise reading SPSS regression output as a fixed routine (Model Summary → ANOVA → Coefficients), because exam items hand you the tables and ask for the right interpretation, not the computation. Be able to interpret R² as the percentage of variance explained, judge a predictor by its Sig. (< .05), and rank predictors by standardised Beta. Know the which-test trigger (predict a metric DV from 1 IV → bivariate, from 2+ → multiple), how dummy variables encode categories, and that the VIF flags multicollinearity. This is your project's H5, so the same skills earn marks twice.