Regression Analysis

Mooi, Erik; Sarstedt, Marko; Mooi-Reci, Irma

doi:10.1007/978-981-10-5218-7_7

Part of the book series: Springer Texts in Business and Economics (STBE)

Abstract

We first provide comprehensive, but simple, access to essential regression knowledge by discussing how regression analysis works, the requirements and assumptions on which it relies, and how you can specify a regression analysis model that allows you to make critical decisions for your business, clients, or project. Each step involved in regression analysis is linked to its execution in Stata (using menus and code). We show how to use a range of Stata’s easy-to-learn statistical procedures that underlie regression analysis, which will allow you to analyze, chart, and validate regression analysis results and to assess your analysis’s robustness. Interpretation of Stata output can be difficult, but we make this easier by means of an annotated case study. We conclude with suggestions for further readings on the use, application, and interpretation of regression analysis.

Notes

1.
Strictly speaking, the difference between the predicted and the observed y-values is \( \widehat{e} \).
2.
This only applies to the standardized βs.
3.
This is only a requirement if you are interested in the regression coefficients, which is the dominant use of regression. If you are only interested in prediction, collinearity is not important.
4.
The VIF is calculated using a completely separate regression analysis. In this regression analysis, the variable for which the VIF is calculated is regarded as the dependent variable and all other independent variables are regarded as independents. The R² that this model provides is deducted from 1 and the reciprocal value of this difference (i.e., 1/(1 − R²)) is the VIF. The VIF therefore indicates how much of one independent variable's variance the other independent variables explain. If the other variables explain much of this variance (the VIF is larger than 10), collinearity is likely a problem.
5.
This term can be calculated manually, but also by using the MMULT function in Microsoft Excel to calculate \( {x}^T x \). Once this matrix has been calculated, you can use the MINVERSE function to arrive at \( {\left({x}^T x\right)}^{-1} \).
6.
In Stata, this can be done by using the robust option (typed after the command's comma, as in , robust).
7.
The test also includes the predicted values squared and to the power of three.
8.
Specifically, for the regression model \( y=\alpha +{\beta}_1{x}_1+{\beta}_2{x}_2+{\beta}_3{x}_3+e \), the Breusch-Pagan test determines whether the independent variables explain the squared residuals in the auxiliary regression \( \widehat{e^2}=\alpha +{\beta}_{BP1}{x}_1+{\beta}_{BP2}{x}_2+{\beta}_{BP3}{x}_3+{e}_{BP} \).
9.
This hypothesis can also be read as stating that a model with only an intercept is sufficient.
10.
The AIC is specifically calculated as AIC = n·ln(SS_E/n) + 2·k, where n is the number of observations and k the number of independent variables, while the BIC is calculated as BIC = n·ln(SS_E/n) + k·ln(n).
11.
Cohen’s (1994) classic article “The Earth Is Round (p < .05)” offers an interesting perspective on significance and effect sizes.
12.
Using the Stata command egen commitment=rowmean(com1 com2 com3)
13.
Note that a p-value is never exactly zero, but has values different from zero in later decimal places.
14.
Note that it is possible to show all categories for regression tables by typing set showbaselevels on. This can be made permanent by typing set showbaselevels on, permanent.
15.
Note that while the constant has the highest value (1.19), this is not a coefficient and should not be interpreted as an effect size.
16.
Please note that only Stata 13 or above features built-in routines to calculate η².
17.
The seed specifies the initial value of the random-number generating process such that it can be replicated later.
18.
We would like to thank Dr. D.I. Gilliland and AgriPro for making the data and case study available.
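
The calculations described in notes 4 and 10 can be sketched in a few lines of code. The following is a minimal Python illustration rather than the chapter's Stata workflow, and it rests on several assumptions: the data are made up, and the names solve, ols_sse_sst, vif, aic, bic, x1, x2, and y are hypothetical. It computes the VIF as 1/(1 − R²) from an auxiliary regression and the AIC and BIC from the formulas in note 10.

```python
import math


def solve(a, b):
    """Solve a·z = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the matrix a is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_sse_sst(y, xs):
    """Regress y on the variables in xs plus an intercept.

    Returns the error sum of squares (SS_E) and the total sum of squares.
    """
    n = len(y)
    rows = [[1.0] + [x[i] for x in xs] for i in range(n)]  # design matrix
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y[i] for i, r in enumerate(rows)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return sse, sst


def vif(xs, j):
    """VIF of variable j: regress it on all other independent variables."""
    others = xs[:j] + xs[j + 1:]
    sse, sst = ols_sse_sst(xs[j], others)
    r2 = 1 - sse / sst
    return 1 / (1 - r2)


def aic(sse, n, k):
    """AIC = n·ln(SS_E/n) + 2·k, with k the number of independent variables."""
    return n * math.log(sse / n) + 2 * k


def bic(sse, n, k):
    """BIC = n·ln(SS_E/n) + k·ln(n)."""
    return n * math.log(sse / n) + k * math.log(n)


# Made-up illustrative data (not from the chapter's case study).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2.1, 2.9, 5.2, 5.8, 8.1, 8.7]

xs = [x1, x2]
sse, _ = ols_sse_sst(y, xs)
print("VIF of x1:", vif(xs, 0))
print("AIC:", aic(sse, len(y), len(xs)))
print("BIC:", bic(sse, len(y), len(xs)))
```

With only two independent variables, the auxiliary R² equals the squared correlation between them, so both variables share the same VIF. The same models compared on AIC and BIC would favor the one with the lower value.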

References

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks: Sage.
Baum, C. F. (2006). An introduction to modern econometrics using Stata. College Station: Stata Press.
Breusch, T. S., & Pagan, A. R. (1980). The Lagrange multiplier test and its applications to model specification in econometrics. Review of Economic Studies, 47(1), 239–253.
Cameron, A. C., & Trivedi, P. K. (1990). The information matrix test and its implied alternative hypotheses (Working Papers from California Davis – Institute of Governmental Affairs, pp. 1–33).
Cameron, A. C., & Trivedi, P. K. (2010). Microeconometrics using Stata (Revised ed.). College Station: Stata Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.
Cohen, J. (1994). The earth is round (p < .05). The American Psychologist, 49(12), 997–1003.
Cook, R. D., & Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika, 70(1), 1–10.
Durbin, J., & Watson, G. S. (1951). Testing for serial correlation in least squares regression, II. Biometrika, 38(1–2), 159–179.
Fabozzi, F. J., Focardi, S. M., Rachev, S. T., & Arshanapalli, B. G. (2014). The basics of financial econometrics: Tools, concepts, and asset management applications. Hoboken: Wiley.
Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral Research, 26(3), 499–510.
Greene, W. H. (2011). Econometric analysis (7th ed.). Upper Saddle River: Prentice Hall.
Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2013). Multivariate data analysis. Upper Saddle River: Pearson.
Hill, C., Griffiths, W., & Lim, G. C. (2008). Principles of econometrics (3rd ed.). Hoboken: Wiley.
Kelley, K., & Maxwell, S. E. (2003). Sample size for multiple regression: Obtaining regression coefficients that are accurate, not simply significant. Psychological Methods, 8(3), 305–321.
Mason, C. H., & Perreault, W. D., Jr. (1991). Collinearity, power, and interpretation of multiple regression analysis. Journal of Marketing Research, 28, 268–280.
Mooi, E. A., & Frambach, R. T. (2009). A stakeholder perspective on buyer–supplier conflict. Journal of Marketing Channels, 16(4), 291–307.
O’Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality and Quantity, 41(5), 673–690.
Ramsey, J. B. (1969). Test for specification errors in classical linear least-squares regression analysis. Journal of the Royal Statistical Society, Series B, 31(2), 350–371.
Sin, C., & White, H. (1996). Information criteria for selecting possibly misspecified parametric models. Journal of Econometrics, 71(1–2), 207–225.
StataCorp. (2015). Stata 14 base reference manual. College Station: Stata Press.
Treiman, D. J. (2014). Quantitative data analysis: Doing social research to test ideas. Hoboken: Wiley.
VanVoorhis, C. R. W., & Morgan, B. L. (2007). Understanding power and rules of thumb for determining sample sizes. Tutorials in Quantitative Methods for Psychology, 3(2), 43–50.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica: Journal of the Econometric Society, 48(4), 817–838.

Author information

Authors and Affiliations

Department of Management and Marketing, University of Melbourne, Parkville, Victoria, Australia
Erik Mooi
Chair of Marketing, Otto-von-Guericke-University, Magdeburg, Sachsen-Anhalt, Germany
Marko Sarstedt
School of Social and Political Sciences, University of Melbourne, Parkville, Victoria, Australia
Irma Mooi-Reci

About this chapter

Cite this chapter

Mooi, E., Sarstedt, M., Mooi-Reci, I. (2018). Regression Analysis. In: Market Research. Springer Texts in Business and Economics. Springer, Singapore. https://doi.org/10.1007/978-981-10-5218-7_7

DOI: https://doi.org/10.1007/978-981-10-5218-7_7
Published: 02 November 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5217-0
Online ISBN: 978-981-10-5218-7
eBook Packages: Business and Management (R0)