Keyword Extraction from Hindi Documents Using Statistical Approach

  • Conference paper
Intelligent Computing, Communication and Devices

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 309)

Abstract

The keywords of a document give an idea of its important points without requiring the reader to go through the whole text. In this paper, we propose an unsupervised, domain-independent, and corpus-independent approach for automatic keyword extraction. The approach is general and can be applied to any language; we have tested it on Hindi. It combines the information contained in the frequency and the spatial distribution of a word to extract keywords from a document. Our work is especially significant because it has been implemented and tested on Hindi, a resource-poor and underrepresented language.
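The abstract does not give the scoring formula, so the following Python sketch is only an illustration of how frequency and spatial distribution might be combined, not the method of the paper. It assumes a pre-tokenized document and uses one common clustering statistic: the standard deviation of the gaps between successive occurrences of a word, normalized by the mean gap. The function names `keyword_scores` and `top_keywords`, the `min_count` threshold, and the weighting by `log(count)` are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def keyword_scores(tokens, min_count=3):
    """Score each word by how clustered its occurrences are, weighted by frequency.

    Assumption: score = (std of gaps / mean gap) * log(count).
    This combination is illustrative and is not taken from the paper.
    """
    # Record the positions at which every token occurs.
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, occurrences in positions.items():
        count = len(occurrences)
        # At least two occurrences are needed to have one gap.
        if count < max(min_count, 2):
            continue
        # Gaps between successive occurrences (always >= 1).
        gaps = [b - a for a, b in zip(occurrences, occurrences[1:])]
        mean_gap = sum(gaps) / len(gaps)
        variance = sum((g - mean_gap) ** 2 for g in gaps) / len(gaps)
        # Evenly spread words give a value near 0; clustered words give larger values.
        clustering = math.sqrt(variance) / mean_gap
        scores[word] = clustering * math.log(count)
    return scores


def top_keywords(tokens, k=5, min_count=3):
    """Return the k highest-scoring words."""
    scores = keyword_scores(tokens, min_count)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under these assumptions, a function word that recurs at regular intervals scores close to zero, while a content word whose occurrences bunch together in one part of the text scores higher. No corpus or language-specific resource is needed beyond a tokenizer, which matches the corpus-independent setting described above.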

Corresponding author

Correspondence to Aditi Sharan.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Sharan, A., Siddiqi, S., Singh, J. (2015). Keyword Extraction from Hindi Documents Using Statistical Approach. In: Jain, L., Patnaik, S., Ichalkaranje, N. (eds) Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 309. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2009-1_57

  • DOI: https://doi.org/10.1007/978-81-322-2009-1_57

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2008-4

  • Online ISBN: 978-81-322-2009-1

  • eBook Packages: Engineering, Engineering (R0)
