Sains Malaysiana 25(1): 145-160 (1996)

Sains Matematik/Mathematical Sciences

 

Detection of Influential Observations in

Principal Component Regression

 

 

Mokhtar b. Abdullah

Jabatan Statistik

Fakulti Sains Matematik

Universiti Kebangsaan Malaysia

43600 UKM Bangi Selangor D.E. Malaysia

 

 

ABSTRACT

 

Multicollinearity among the explanatory variables of a regression model can render the regression coefficients statistically insignificant and difficult to interpret. Principal component regression (PCR) is an effective way of dealing with multicollinearity in regression analysis. Multicollinearity may or may not be induced by the presence of influential observations. This paper discusses some diagnostic methods for identifying influential observations in PCR. A data set on the water quality of New York rivers is used to illustrate the methods.

 

ABSTRAK

 

Multikolinearan yang wujud di kalangan pembolehubah penerang dalam model regresi boleh menyebabkan pekali regresi tidak bererti dan sukar untuk ditafsirkan. Regresi komponen utama (PCR) merupakan cara yang berkesan bagi menyelesaikan masalah multikolinearan dalam analisis regresi. Kewujudan multikolinearan mungkin disebabkan oleh data terpencil yang berpengaruh. Kertas ini membincangkan beberapa kaedah pengecaman bagi mengenalpasti data berpengaruh dalam PCR. Data tentang kualiti air di beberapa batang sungai di New York digunakan untuk memperihalkan kaedah pengecaman yang disarankan.

 

 

RUJUKAN/REFERENCES

 

Belsley, D. A., Kuh, E., & Welsch, R. E. 1980. Regression Diagnostics. New York: Wiley.

Campbell, N. A. 1980. Robust Procedure in Multivariate Analysis I: Robust Covariance Estimation. Appl. Stat. 29: 231-237.

Cook, R. D. 1977. Detection of Influential Observations in Linear Regression. Technometrics 19: 15-18.

Devlin, S. J., Gnanadesikan, R., & Kettenring, J. R. 1981. Robust estimation of dispersion matrices and principal components. Journal of the American Statistical Association 76: 354-362.

Hadi, A. S. 1992. A new measure of overall potential influence in linear regression. Computational Statistics & Data Analysis 14: 1-27.

Hampel, F. R. 1974. The Influence Curve and Its Role in Robust Estimation. Journal of the American Statistical Association 69: 383-393.

Hoaglin, D. C. & Welsch, R. 1978. The Hat Matrix in Regression and ANOVA. The American Statistician 32: 17-22.

Maronna, R. A. 1976. Robust M-estimators of multivariate location and scatter. Annals of Statistics 4: 51-67.

 

 

 
