Integration of the PageRank Algorithm, Sequence Processing, and CPT+ for Webpage Access Prediction
Nguyen Thon Da1, Tan Hanh2, Pham Hoang Duy3

1Nguyen Thon Da*, Department of Information Systems, University of Economics and Law, VNU-HCM, HCM City, Vietnam, Asia.
2Tan Hanh, Department of Information Technology, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam, Asia.
3Pham Hoang Duy, Department of Information Technology, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam, Asia.
Manuscript received on March 16, 2020. | Revised Manuscript received on March 24, 2020. | Manuscript published on March 30, 2020. | PP: 2327-2335 | Volume-8 Issue-6, March 2020. | Retrieval Number: F8209038620/2020©BEIESP | DOI: 10.35940/ijrte.F8209.038620

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In this article, we propose a novel model for webpage access prediction. The main idea is to reduce execution time by reducing the sequence space. The solution combines two steps: calculating the PageRank values of the sequences in a sequence database, and analysing the sequences that remain in the resulting shortened database. To evaluate the solution, we used K-fold cross-validation with K = 10, randomizing the dataset 10 times, and the system calculated the average PageRank values of the sequences. Then, at an acceptable level of accuracy (with the dataset size reduced by up to 30% through PageRank calculation), we performed next-page access prediction by analysing 1000 sequences. Experimental results on the real FIFA dataset show that the proposed approach substantially outperforms previous approaches in terms of prediction execution time.
Keywords: Webpage Access Prediction, Sequence Prediction, CPT+, PageRank Algorithm.
Scope of the Article: VLSI Algorithms.
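
The sequence-space reduction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how a PageRank value is assigned to a sequence, so everything below is an assumption made for illustration: the page graph is built from consecutive page visits, a sequence is scored by the mean PageRank of its pages, and the lowest-scoring 30% of sequences are dropped. The function names `pagerank` and `shorten_database` and the parameters `damping`, `iterations`, and `keep_ratio` are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def pagerank(sequences, damping=0.85, iterations=50):
    """Power-iteration PageRank over a page graph.

    Assumption: an edge src -> dst exists when dst directly follows src
    in at least one access sequence.
    """
    pages = sorted({p for seq in sequences for p in seq})
    if not pages:
        return {}
    out_links = defaultdict(set)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:
                out_links[src].add(dst)

    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src in pages:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def shorten_database(sequences, keep_ratio=0.7, damping=0.85):
    """Keep the highest-scoring fraction of sequences, in original order.

    Assumption: a sequence's score is the mean PageRank of its pages.
    """
    rank = pagerank(sequences, damping)

    def score(seq):
        return sum(rank[p] for p in seq) / len(seq) if seq else 0.0

    keep = max(1, round(len(sequences) * keep_ratio))
    by_score = sorted(range(len(sequences)),
                      key=lambda i: score(sequences[i]), reverse=True)
    kept_indices = sorted(by_score[:keep])
    return [sequences[i] for i in kept_indices]


if __name__ == "__main__":
    # Toy access log; page names are placeholders, not FIFA dataset entries.
    log = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["c", "a"], ["d"],
           ["a", "c"], ["b", "a"], ["d", "a"], ["c", "b"], ["a", "b", "c", "a"]]
    shortened = shorten_database(log, keep_ratio=0.7)
    print(len(log), "->", len(shortened), "sequences")
```

Under these assumptions, the shortened database would then be passed to a sequence predictor such as CPT+ for training and next-page prediction; that step is not sketched here.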