A Rule-Based Approach for Extraction of Link-Context from Anchor-Text Structure

Kumar, Suresh; Kumar, Naresh; Singh, Manjeet; De, Asok

doi:10.1007/978-3-642-32063-7_28

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 182)

Abstract

Researchers have widely explored the use of link-context to determine the theme of a target web page. Link-context has been applied in areas such as search engines, focused crawlers, and automatic classification. Extracting precise link-context is therefore an important step toward retrieving more relevant information from a web page. In this paper, we propose a rule-based approach for extracting link-context from anchor-text (AT) structure using a bottom-up simple LR (SLR) parser. We consider only named-entity (NE) anchor text. To validate the proposed approach, we used a sample of 4 ATs. The results show that the proposed LCEA extracted the actual link-context of each AT considered, i.e., 100% of the sample.
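
The abstract does not give the paper's grammar rules, so the following is only a minimal sketch of how a table-driven, bottom-up SLR(1) parser could split a named-entity anchor text into entity phrases. Everything in it is an assumption made for illustration: the toy grammar (AT -> NE | NE of NE; NE -> NE w | w), the hand-built ACTION and GOTO tables, the tokenizer, and the function names `tokenize`, `build_value`, and `parse_anchor_text`. It should not be read as the authors' rule set or as their definition of link-context.

```python
# Hypothetical toy grammar (NOT the paper's rules):
#   0: S'  -> AT
#   1: AT  -> NE
#   2: AT  -> NE of NE
#   3: NE  -> NE w
#   4: NE  -> w
# Terminals: w (any word except "of"), of, $ (end of input).

# Rule number -> (left-hand side, number of right-hand-side symbols)
PRODUCTIONS = {1: ("AT", 1), 2: ("AT", 3), 3: ("NE", 2), 4: ("NE", 1)}

# SLR(1) ACTION table built by hand from the LR(0) item sets and
# FOLLOW(AT) = {$}, FOLLOW(NE) = {w, of, $}.
ACTION = {
    0: {"w": ("shift", 3)},
    1: {"$": ("accept", None)},
    2: {"of": ("shift", 4), "w": ("shift", 5), "$": ("reduce", 1)},
    3: {"w": ("reduce", 4), "of": ("reduce", 4), "$": ("reduce", 4)},
    4: {"w": ("shift", 3)},
    5: {"w": ("reduce", 3), "of": ("reduce", 3), "$": ("reduce", 3)},
    6: {"w": ("shift", 5), "$": ("reduce", 2)},
}
GOTO = {0: {"AT": 1, "NE": 2}, 4: {"NE": 6}}


def tokenize(anchor_text):
    """Split on whitespace and tag each word as 'of' or 'w'; append '$'."""
    tokens = [("of" if word.lower() == "of" else "w", word)
              for word in anchor_text.split()]
    tokens.append(("$", None))
    return tokens


def build_value(rule, children):
    """Semantic action run on each reduction."""
    if rule == 4:                      # NE -> w
        return [children[0]]
    if rule == 3:                      # NE -> NE w
        return children[0] + [children[1]]
    if rule == 1:                      # AT -> NE
        return [" ".join(children[0])]
    # rule == 2: AT -> NE of NE (the connective "of" is dropped)
    return [" ".join(children[0]), " ".join(children[2])]


def parse_anchor_text(anchor_text):
    """Return the entity phrases of an anchor text, or raise ValueError."""
    tokens = tokenize(anchor_text)
    states, values, pos = [0], [], 0
    while True:
        kind, word = tokens[pos]
        action = ACTION[states[-1]].get(kind)
        if action is None:
            raise ValueError(f"anchor text rejected at token {pos}: {word!r}")
        op, arg = action
        if op == "shift":
            states.append(arg)
            values.append(word)
            pos += 1
        elif op == "reduce":
            lhs, size = PRODUCTIONS[arg]
            children = values[-size:]
            del values[-size:]
            del states[-size:]
            values.append(build_value(arg, children))
            states.append(GOTO[states[-1]][lhs])
        else:  # accept
            return values[-1]


if __name__ == "__main__":
    print(parse_anchor_text("Indian Institute of Technology"))
```

Under these assumptions the parser shifts words onto a stack and reduces them whenever the table says a right-hand side is complete, so a malformed anchor text such as one that ends in "of" is rejected instead of being silently truncated. A real rule set would need a richer lexicon and grammar than this toy one.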

Author information

Authors and Affiliations

Institute of Advanced Communication Technologies & Research, Delhi, 31, India
Suresh Kumar & Asok De
AIIT, Amity University, Noida, India
Naresh Kumar
YMCA University of Science & Technology, Faridabad, India
Manjeet Singh

Corresponding author

Correspondence to Suresh Kumar.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, MIR Labs Campus, Auburn, Washington, 98071, USA
Ajith Abraham
Indian Institute of Information Technology and Management, Technopark Campus, Trivandrum, 695581, India
Sabu M Thampi

About this paper

Cite this paper

Kumar, S., Kumar, N., Singh, M., De, A. (2013). A Rule-Based Approach for Extraction of Link-Context from Anchor-Text Structure. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_28

DOI: https://doi.org/10.1007/978-3-642-32063-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering, Engineering (R0)
