technical-note

Parallel latent semantic analysis using a graphics processing unit

Authors:
Joseph M. Cavanagh, University of Minnesota - Morris, Morris, MN, USA
Thomas E. Potok, Oak Ridge National Laboratory, Oak Ridge, TN, USA
Xiaohui Cui, Oak Ridge National Laboratory, Oak Ridge, TN, USA

GECCO '09: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, July 2009, Pages 2505–2510. https://doi.org/10.1145/1570256.1570352

Published: 08 July 2009

ABSTRACT

Latent Semantic Analysis (LSA) can be used to reduce the dimensions of large Term-Document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. In this paper, we present a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture (CUDA) and Compute Unified Basic Linear Algebra Subprograms (CUBLAS). The performance of this implementation is compared to a traditional LSA implementation on the CPU using an optimized Basic Linear Algebra Subprograms library. For large matrices that have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version.
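
To make the dimension-reduction step concrete, the following is a minimal CPU-only sketch of rank-k LSA on a tiny term-document matrix. It is not the authors' GPU implementation: it substitutes plain power iteration with deflation for whatever SVD routine the paper uses, and every name in it (`matvec`, `rmatvec`, `top_singular_triplet`, `truncated_svd`, `term_doc`, `doc_coords`) is hypothetical. The two matrix-vector products inside the loop are the kind of dense linear algebra that a BLAS library would perform, which is presumably where a GPU BLAS such as CUBLAS would help.

```python
import math


def matvec(a, x):
    """Return a @ x for a dense matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, y):
    """Return a.T @ y without building the transpose."""
    out = [0.0] * len(a[0])
    for row, yi in zip(a, y):
        for j, aij in enumerate(row):
            out[j] += aij * yi
    return out


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def top_singular_triplet(a, iters=200):
    """Power iteration on a.T @ a; returns (sigma, u, v) for the largest singular value."""
    n_cols = len(a[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # deterministic start vector (assumption)
    for _ in range(iters):
        w = rmatvec(a, matvec(a, v))  # two BLAS-style matrix-vector products
        w_norm = norm(w)
        if w_norm == 0.0:
            break
        v = [x / w_norm for x in w]
    av = matvec(a, v)
    sigma = norm(av)
    u = [x / sigma for x in av] if sigma > 0.0 else av
    return sigma, u, v


def truncated_svd(a, k, iters=200):
    """Rank-k SVD: find the top triplet, subtract it (deflation), repeat k times."""
    residual = [[float(x) for x in row] for row in a]
    triplets = []
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual, iters)
        triplets.append((sigma, u, v))
        residual = [[rij - sigma * ui * vj for rij, vj in zip(row, v)]
                    for row, ui in zip(residual, u)]
    return triplets


# Toy term-document matrix (rows = terms, columns = documents), invented for
# illustration: documents 0-1 share one vocabulary, documents 2-3 another.
term_doc = [
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
triplets = truncated_svd(term_doc, k=2)

# Coordinates of each document in the 2-dimensional latent space.
doc_coords = [[sigma * v[j] for sigma, _, v in triplets]
              for j in range(len(term_doc[0]))]
```

In this toy case the four-dimensional document columns collapse to two latent dimensions, and documents that use the same terms (0 and 1, or 2 and 3) land on the same point. A production implementation would use a more robust SVD method and a random start vector; the fixed start vector here is only chosen to keep the sketch deterministic.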

Published in
GECCO '09: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers
July 2009
1760 pages
ISBN: 9781605585055
DOI: 10.1145/1570256
General Chair: Franz Rothlauf, University of Mainz, Germany
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
gpu
latent semantic indexing
text mining
Qualifiers
- technical-note
Acceptance Rates
Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%