1. INTRODUCTION
With the popularization of mobile phones, users can conveniently obtain and share data about themselves and their surroundings anytime and anywhere and can access a variety of intelligent services, which has promoted the formation of location-based social networks (LBSNs). Mobile phones are now widely used in daily life and work. Obtaining user features is therefore important for analysing user characteristics and providing users with intelligent services. However, in LBSNs, the acquisition of user features faces substantial challenges, mainly in:
Based on the analysis above, a data-driven user feature construction and analysis method is proposed in this study. The contributions of this paper include:
The rest of this manuscript is organized as follows. Section 2 reviews related work. Section 3 describes the proposed method. Section 4 presents the experimental evaluation, and Section 5 concludes the paper.
2. RELATED WORKS
User feature construction is an important research direction in LBSNs. Construction methods fall into two categories: explicit and implicit. Explicit methods construct user features through direct communication with users [5], which is intuitive and highly accurate. However, when the environment around users is complex, it is difficult for them to describe their features clearly. Implicit methods apply multiple analysis techniques to construct user features automatically. They fall into four categories: content-based, geographical-based, temporal-based, and social-based feature extraction methods [6]. Content-based methods construct user features from content related to users, such as positions or comments [7]. Geographical-based methods aim to reveal user features related to distance; for example, a naive Bayesian network has been used to model the relationship between the venues that a user has visited and the distance [8]. Temporal-based methods aim to reveal users’ time-related features using data mining or machine learning [9]. Social-based methods assume that a user’s features may be similar to those of their friends; therefore, the set of features shared by a user’s friends is used to infer the user’s own features [10]. More than one of these factors may also be combined to describe user features [11]. Although many works are devoted to constructing user features, few pay attention to how applicable a given feature model is to a particular user. In this study, on the basis of constructing multiple user feature models, an algorithm is proposed to identify the set of users suited to a specific feature model, and a multi-channel CNN is designed to describe the suitability of different users to multiple feature models.
3. METHOD
3.1. Definitions
Definition 1: UCS. UCS (User Context Set) is a set of attributes that describe the context related to user activities.
Definition 2: VS. VS (View Set) is a set of perspectives that analysts are interested in.
Definition 3: UFS. UFS (User context view Feature Set) is a set of features, generated from a user’s check-in data, that describe the user’s characteristics.
3.2. Overview of the method
The overall framework of the method is shown in Figure 1. The method is based on user check-in data. First, the analyst determines the view set and the user context set, and the influence of each context on each view is analysed quantitatively. Second, vectorized user features are constructed from the check-in data. Third, applicability analysis is conducted using a difference value. Finally, a multi-channel CNN is designed to characterize the applicability of users to different features and is used to predict user features.
3.3. Information entropy gain-based influence analysis
The cardinality of the influence combination set, |ES|, is the product of |UCS| and |VS|. As |UCS| and |VS| increase, |ES| grows rapidly, which is disadvantageous for storage and analysis. Moreover, in real life, not every user context has a significant impact on every view. Quantitative analysis of the impact of user context on each view is therefore of great significance. Entropy describes the degree of disorder within a data set S, as given in equation (1) in its standard form, where n is the number of categories and p_i is the probability of belonging to category i:
\[ \mathrm{Entropy}(S) = -\sum_{i=1}^{n} p_i \log_2 p_i \tag{1} \]
Entropy gain can be used to measure the effectiveness of data partitioning.
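The entropy-based selection described in this section can be illustrated with a minimal Python sketch. It assumes base-2 logarithms and the usual gain-ratio form (information gain divided by the entropy of the split itself). The function names, the dictionary-based record layout, and the field names `hour` and `root_category` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of category labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(samples, attribute, target):
    """Gain ratio of splitting `samples` (a list of dicts) on `attribute`,
    measured against the category stored under `target`."""
    total = len(samples)
    if total == 0:
        return 0.0

    # Group the target labels by the value of the splitting attribute.
    partitions = defaultdict(list)
    for s in samples:
        partitions[s[attribute]].append(s[target])

    # Information gain: entropy before the split minus the weighted
    # entropy of the partitions after the split.
    base = entropy([s[target] for s in samples])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = base - remainder

    # Split information: entropy of the attribute values themselves.
    split_info = entropy([s[attribute] for s in samples])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy check-ins: the hour fully determines the category.
toy_checkins = [
    {"hour": 8, "root_category": "Food"},
    {"hour": 8, "root_category": "Food"},
    {"hour": 20, "root_category": "Nightlife"},
    {"hour": 20, "root_category": "Nightlife"},
]
print(gain_ratio(toy_checkins, "hour", "root_category"))
```

A context attribute with a high gain ratio for a view would be kept as an influential (context, view) pair; the threshold for "significant" is not stated in the text and is left to the analyst here.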
The information gain and the gain ratio are calculated as shown in equations (2) and (3), given here in their standard form, where A is the attribute that divides the sample set S, Values is the set of values of attribute A, v is a value in Values, and S_v is the subset of samples for which A takes the value v after the division.
\[ \mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \tag{2} \]
\[ \mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{-\sum_{v \in \mathrm{Values}} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|}} \tag{3} \]
Based on the analysis above, this study uses the information gain ratio to find, from the data set, each UC_i that has a significant impact on V_j.
3.4. UFS construction
According to Definition 3, each UCVF_ij in UFS is a matrix of dimension |UC_i| × |V_j|, constructed as follows. First, UCVF_ij is initialized. Second, the check-in data are traversed iteratively, and for each check-in the context value uc and the corresponding view value v are found. Third, the matrix entry in the row corresponding to uc and the column corresponding to v is updated.
3.5. Applicability analysis
The applicability analysis proceeds as follows. First, a user feature model is established for a fixed time unit. Then, a difference value is generated for each feature of user u. Finally, the feature with the minimum difference value is taken as the suitable one, because the smallest difference value indicates that the user is best described by that feature.
3.6. Unified model: multi-channel CNN
Multi-channel neural networks can effectively describe, identify, and analyse locally salient features of data, and then stack the different channels in a deep structure to support the fusion of multiple salient features. This property is well suited to describing a user’s adaptability to multi-perspective features; therefore, this study designs a multi-channel convolutional neural network to analyse users’ personalized features. The basic network structure is illustrated in Figure 2. The network contains |ES| channels. The input to each channel is the user set US_UCVFij, that is, the users who are suited to the UCVF_ij model, together with the UCVF_ij matrices of these users.
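The construction steps of Section 3.4 and the selection rule of Section 3.5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the matrix update is assumed to be a simple count increment, the computation of the difference value is not specified in the text and is therefore taken as given input, and all function names, field names, and category labels are hypothetical.

```python
def build_ucvf(checkins, context_of, view_of, n_contexts, n_views):
    """Build one |UC_i| x |V_j| matrix from a user's check-ins.

    `context_of` and `view_of` map a check-in to a row index and a
    column index. Assumption: each check-in increments one cell by 1.
    """
    # Step 1: initialize the matrix with zeros.
    matrix = [[0] * n_views for _ in range(n_contexts)]
    # Step 2: scan the check-ins and find the context value and view value.
    for checkin in checkins:
        uc = context_of(checkin)
        v = view_of(checkin)
        # Step 3: update the cell in row uc, column v.
        matrix[uc][v] += 1
    return matrix


def most_suitable_feature(difference_values):
    """Return the feature name with the smallest difference value."""
    return min(difference_values, key=difference_values.get)


# Hypothetical view values and check-ins for one user.
CATEGORIES = ["Food", "Shop", "Outdoors"]
user_checkins = [
    {"hour": 8, "category": "Food"},
    {"hour": 8, "category": "Food"},
    {"hour": 19, "category": "Shop"},
]

# Time context with one row per hour of the day.
ucvf_time_category = build_ucvf(
    user_checkins,
    context_of=lambda c: c["hour"],
    view_of=lambda c: CATEGORIES.index(c["category"]),
    n_contexts=24,
    n_views=len(CATEGORIES),
)

# Difference values are supplied directly here because their formula
# is not given in the text; the numbers are made up for illustration.
suited = most_suitable_feature({"time-category": 12.5, "distance-category": 3.0})
```

Passing the row and column mappings as functions keeps the same routine usable for every (context, view) pair, so one call per pair yields the full UFS for a user.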
After learning the users and their matrices through the different channels, the network can predict a user’s activities in a given context.
4. EXPERIMENT
4.1. Data sets of the experiment
Two data sets, labelled dataset1 and dataset2, are used in the experiment to verify the effectiveness of the method. Dataset1 is taken from reference [12]. Dataset2 was constructed by randomly selecting 5,000 users from the data set used in reference [13]. Detailed information on the data sets is given in Table 1.
Table 1. Information of the data sets.
4.2. Research questions and evaluation plan
RQ 1: Does the assumption hold that a user is suited to a feature model when the difference value in UCVF_ij is small?
RQ 2: Does applicability analysis help to improve the accuracy of predicting user behaviours?
RQ 3: Does the unified model improve the prediction accuracy? How does it compare with existing methods?
Top-K accuracy is used to evaluate the effectiveness of the method, as defined in equation (4), where u refers to a user, l to a location, t to a time, a to an activity, and TS stands for the test set.
4.3. Experiment results and discussion
In this experiment, UCS = {UC_t, UC_d}, where t represents time and d represents distance. |UC_t| = 24 because time is measured in hours. |UC_d| = 4 because distance is divided into four levels: within 1 kilometre, between 1 and 10 kilometres, between 10 and 30 kilometres, and more than 30 kilometres. Based on the elements in the data, VS = {V_r, V_c}, where r denotes the root category and c denotes the category. |V_r| = 9 and |V_c| = 65 because the POIs in the experiment have 9 root categories and 65 categories. Therefore, UFS = {UCVF_{time-root category}, UCVF_{time-category}, UCVF_{distance-root category}, UCVF_{distance-category}}, whose elements are 24 × 9, 24 × 65, 4 × 9, and 4 × 65 matrices, respectively. After UFS is constructed, the set of users suited to each feature is analysed, as shown in Table 2.
Table 2. Users suited to different UCVF_ij.
A multi-channel CNN is designed to depict how well each UCVF_ij fits a user, and the network is used to predict user activities. The experimental results are presented and discussed as follows.
RQ 1: Does the assumption hold that a user is suited to a feature model when the difference value in UCVF_ij is small?
The difference value was divided into intervals of 10, and the Top-K accuracy was analysed for each interval. The results are shown in Figures 3 and 4. In both data sets, the accuracy increases as the difference value decreases, which supports the assumption.
RQ 2: Does applicability analysis help to improve the accuracy of predicting user behaviours?
To answer this question, UCVF_{time-root category}, UCVF_{time-category}, UCVF_{distance-root category}, and UCVF_{distance-category} were first used separately to predict users’ activities, and the accuracy of each was computed. Then each user was described by the UCVF_ij suited to that user, and the accuracy was computed using equation (4). The results are depicted in Figure 5. In both data sets, prediction with each user’s suited UCVF_ij achieves higher accuracy than using any single UCVF_ij for all users.
RQ 3: Does the unified model improve the prediction accuracy? How does it compare with existing methods?
• Baseline
User activity prediction can be divided into next prediction and any-time prediction in terms of timeliness, and into coarse-grained and fine-grained prediction in terms of granularity [14]. This research belongs to any-time, coarse-grained prediction. We adopt the HOSVD, PFR, PCLR, and STAP methods as baselines for comparison [12, 15-17].
• Results
The proposed method was compared with the baseline methods in terms of Top-K accuracy, with K set to 1 and 5. The results are shown in Figures 6 and 7.
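For readers who want to reproduce the evaluation metric, the following Python sketch shows one common way to compute Top-K accuracy. It assumes that a test record (u, l, t, a) counts as a hit when the true activity a is among the K highest-scored candidate activities; the exact form of equation (4) may differ. The function name, the `score` callable, and the toy data are hypothetical.

```python
def top_k_accuracy(test_set, score, candidates, k):
    """Fraction of test records whose true activity is among the k
    highest-scored candidate activities.

    test_set   : iterable of (user, location, time, activity) tuples
    score      : callable (user, location, time, activity) -> float
    candidates : list of all activities that can be predicted
    """
    records = list(test_set)
    if not records:
        return 0.0
    hits = 0
    for u, l, t, a in records:
        # Rank every candidate activity for this (user, location, time).
        ranked = sorted(candidates, key=lambda x: score(u, l, t, x), reverse=True)
        if a in ranked[:k]:
            hits += 1
    return hits / len(records)


# Hypothetical scorer that ignores context and always prefers "Food".
FIXED_SCORES = {"Food": 0.6, "Shop": 0.3, "Outdoors": 0.1}


def fixed_score(u, l, t, activity):
    return FIXED_SCORES[activity]


toy_test_set = [
    ("u1", "l1", 8, "Food"),
    ("u1", "l2", 9, "Shop"),
    ("u2", "l3", 10, "Outdoors"),
    ("u2", "l1", 11, "Food"),
]
activities = list(FIXED_SCORES)
top1 = top_k_accuracy(toy_test_set, fixed_score, activities, k=1)
top2 = top_k_accuracy(toy_test_set, fixed_score, activities, k=2)
```

In this toy example two of the four records have "Food" as the true activity, so the context-free scorer gets half of them right at K = 1 and gains the "Shop" record at K = 2.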
As Figures 6 and 7 show, both the applicability analysis method and the multi-channel CNN method outperform the baseline methods on dataset1 and dataset2.
5. CONCLUSIONS
Multiple user features were constructed, and whether these features suit a user was analysed quantitatively. Furthermore, a multi-channel CNN was designed to depict users’ applicability to the feature models. The experimental results show the effectiveness of the method. Although the method is effective in the experiments, it should be further evaluated on more realistic data and scenarios. In addition, the feature model constructed here is data-sensitive; how to generate more stable user features at an abstract level is an interesting research direction.
ACKNOWLEDGMENTS
This work is supported by the National Natural Science Foundation of China under Grant No. U1504602.
REFERENCES
[1] Blomberg, J., Burrell, M. and Guest, G., “An ethnographic approach to design,” in The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, Lawrence Erlbaum Associates, Mahwah (2002).
[2] Anastassova, M., Mégard, C. and Burkhardt, J. M., “Prototype evaluation and user-needs analysis in the early design of emerging technologies,” in Inter. Conf. on Human-computer Interaction: Interaction Design & Usability, (2007). https://doi.org/10.1007/978-3-540-73105-4
[3] Xu, T., Ma, Y. and Wang, Q., “Cross-urban point-of-interest recommendation for non-natives,” International Journal of Web Services Research (IJWSR), 15(3), 82–102 (2018). https://doi.org/10.4018/IJWSR
[4] Cai, G., Lee, K. and Lee, I., “Itinerary recommender system with semantic trajectory pattern mining from geotagged photos,” Expert Systems with Applications, 94, 32–40 (2018). https://doi.org/10.1016/j.eswa.2017.10.049
[5] Böhmer, M., Bauer, G. and Krüger, A., “Exploring the design space of context-aware recommender systems that suggest mobile applications,” in 2nd Work. on Context-Aware Recommender Systems, (2010).
[6] Hess, A., Hummel, K. A., Gansterer, W. N., et al., “Data-driven human mobility modeling: A survey and engineering guidance for mobile networking,” ACM Computing Surveys (CSUR), 48(3), 38 (2016). https://doi.org/10.1145/2840722
[7] Sun, X., Huang, Z., Peng, X., et al., “Building a model-based personalised recommendation approach for tourist attractions from geotagged social media data,” International Journal of Digital Earth, 12(6), 661–678 (2019). https://doi.org/10.1080/17538947.2018.1471104
[8] Wai, K. P. and New, N., “Measuring the distance of moving objects from big trajectory data,” in 2017 IEEE/ACIS 16th Inter. Conf. on Computer and Information Science (ICIS), 137–142 (2017).
[9] Hsueh, Y. L. and Huang, H. M., “Personalized itinerary recommendation with time constraints using GPS datasets,” Knowledge and Information Systems, 1–22 (2018).
[10] Ding, Y., Liu, J., Jiang, C., et al., “A study of friends recommendation algorithm considering users’ preference of making friends in the LBSN,” Systems Engineering - Theory & Practice, (11), 22 (2017).
[11] Zhu, Z., Cao, J. and Weng, C., “Location-time-sociality aware personalized tourist attraction recommendation in LBSN,” in 2018 IEEE 22nd Inter. Conf. on Computer Supported Cooperative Work in Design (CSCWD), 636–641 (2018).
[12] Yang, D., Zhang, D., Zheng, V. W., et al., “Modeling user activity preference by leveraging user spatial temporal characteristics in LBSNs,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, 45(1), 129–142 (2015). https://doi.org/10.1109/TSMC.2014.2327053
[13] Yang, D., Zhang, D. and Qu, B., “Participatory cultural mapping based on collective behavior data in location-based social networks,” ACM Transactions on Intelligent Systems and Technology (TIST), 7(3), 1–23 (2016). https://doi.org/10.1145/2814575
[14] Xu, S., Fu, X., Cao, J., et al., “Survey on user location prediction based on geo-social networking data,” World Wide Web, 23, 1621–1664 (2020). https://doi.org/10.1007/s11280-019-00777-8
[15] Yang, D., Zhang, D., Zheng, V. W., Yu, Z. and Wang, Z., “A sentiment-enhanced personalized location recommendation system,” in Proc. HT, 119–128 (2013).
[16] Lathauwer, L. D., Moor, B. D. and Vandewalle, J., “Multilinear singular value tensor decompositions,” SIAM Journal on Matrix Analysis and Applications, 24(4), 1253–1278 (2000). https://doi.org/10.1137/S0895479896305696
[17] Rahimi, S. M. and Wang, X., “Location recommendation based on periodicity of human activities and location categories,” in Pacific-Asia Conf. on Knowledge Discovery and Data Mining, Lecture Notes in Computer Science 7819, Springer, Berlin (2013).