Abstract
During web search, we often encounter untrusted, duplicate, and near-duplicate results, which dilute the focus of the search query. We refer to the factors that may influence the trust of web search results as 'provenance', which is essentially information about the history of data. In this paper, we propose a provenance model that uses both content-based and trust-based factors to identify trusted search results. The novelty of our idea lies in constructing a provenance matrix that encompasses six factors (who, where, when, what, why, how) related to the search results. Inferences performed over the provenance matrix yield a trust score, which is then used to remove near-duplicates and retrieve trusted search results.
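The abstract does not say how the provenance matrix is populated or how the trust score is inferred from it. As an illustration only, the following minimal sketch assumes that each search result already carries a score in [0, 1] for each of the six factors, that the trust score is a weighted average of those scores, and that near-duplicates are detected by Jaccard similarity over word shingles. The names `Result`, `trust_score`, and `trusted_results`, the weights, and both thresholds are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# The six provenance factors named in the abstract.
FACTORS = ("who", "where", "when", "what", "why", "how")


@dataclass
class Result:
    url: str
    text: str
    provenance: dict  # one row of the provenance matrix: factor -> score in [0, 1] (assumed)


def trust_score(result, weights=None):
    """Weighted average of the six factor scores (an assumed inference rule)."""
    weights = weights or {f: 1.0 for f in FACTORS}
    total = sum(weights[f] for f in FACTORS)
    return sum(weights[f] * result.provenance.get(f, 0.0) for f in FACTORS) / total


def shingles(text, k=3):
    """Set of k-word shingles used as a simple content fingerprint."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def trusted_results(results, sim_threshold=0.8, trust_threshold=0.5):
    """Drop low-trust results, then keep the most trusted copy of each near-duplicate group."""
    kept = []
    for r in sorted(results, key=trust_score, reverse=True):
        if trust_score(r) < trust_threshold:
            continue  # below the assumed trust threshold
        fingerprint = shingles(r.text)
        if any(jaccard(fingerprint, shingles(k.text)) >= sim_threshold for k in kept):
            continue  # near-duplicate of a more trusted result already kept
        kept.append(r)
    return kept
```

Because results are visited in descending order of trust, the copy that survives from each near-duplicate group is the one with the highest score; a real system would also need a procedure for deriving the factor scores, which this sketch takes as given.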
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Robert, A., Sendhilkumar, S. (2013). Provenance Based Web Search. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_48
DOI: https://doi.org/10.1007/978-3-642-32063-7_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering, Engineering (R0)