ABSTRACT
Dimension reduction is critical for many database and data mining applications, such as efficient storage and retrieval of high-dimensional data. A well-known dimension reduction scheme in the literature is Linear Discriminant Analysis (LDA). Previously proposed LDA-based algorithms share a reliance on Singular Value Decomposition (SVD). Because it is difficult to design an incremental solution for the eigenvalue problem on the product of scatter matrices in LDA, there has been little work on incremental LDA algorithms. In this paper, we propose an LDA-based incremental dimension reduction algorithm, called IDR/QR, which applies QR decomposition rather than SVD. Unlike other LDA-based algorithms, IDR/QR does not require the whole data matrix to reside in main memory, which is desirable for large data sets. More importantly, when new data items are inserted, IDR/QR limits the computational cost by applying efficient QR-updating techniques. Finally, we evaluate the effectiveness of IDR/QR in terms of classification accuracy in the reduced-dimensional space. Our experiments on several real-world data sets show that the accuracy achieved by IDR/QR is very close to the best accuracy achieved by other LDA-based algorithms, while its computational cost is much lower, especially when new data items are inserted dynamically.
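The abstract does not spell out the QR-updating step, so the following is only a minimal sketch of QR updating in general, not the IDR/QR algorithm itself. It assumes NumPy and SciPy are available and uses `scipy.linalg.qr_insert` to append one column to an existing factorization instead of refactorizing from scratch. The matrix `C`, its size, and the interpretation of its columns as class centroids are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.linalg import qr, qr_insert

rng = np.random.default_rng(0)

# Hypothetical data: 4 columns in a 50-dimensional space
# (for example, one centroid per class; this reading is an assumption).
C = rng.standard_normal((50, 4))

# Full QR factorization, computed once up front.
Q, R = qr(C)  # Q is 50x50, R is 50x4

# A new column arrives (for example, the centroid of a newly inserted class).
c_new = rng.standard_normal(50)

# Update the existing factors rather than recomputing the factorization.
# k=4 inserts the new column after the last existing one.
Q_upd, R_upd = qr_insert(Q, R, c_new, 4, which="col")

# Check the updated factors against the enlarged matrix.
C_upd = np.column_stack([C, c_new])
reconstruction_error = np.linalg.norm(Q_upd @ R_upd - C_upd)
orthogonality_error = np.linalg.norm(Q_upd.T @ Q_upd - np.eye(50))
print(reconstruction_error, orthogonality_error)
```

Both printed values should be close to machine precision, which indicates that the updated factors still form a valid QR decomposition of the enlarged matrix.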
Index Terms
- IDR/QR: an incremental dimension reduction algorithm via QR decomposition