
Solving the data sparsity problem in destination prediction

  • Regular Paper
The VLDB Journal

Abstract

Destination prediction is an essential task for many emerging location-based applications, such as recommending sightseeing places and targeting advertisements according to destinations. A common approach to destination prediction is to derive the probability of a location being the destination from historical trajectories. However, almost all existing techniques rely on various kinds of extra information, such as road networks, proprietary travel planners, statistics requested from government, and personal driving habits. Such extra information is, in most circumstances, unavailable or very costly to obtain. We therefore approach the task of destination prediction using only a historical trajectory dataset. This approach, however, encounters the “data sparsity problem”: the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories for which a destination can be predicted. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address the data sparsity problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into “synthesised” trajectories. This process effectively expands the historical trajectory dataset to contain many more trajectories. Experiments on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm consistently runs over two orders of magnitude faster than the baseline prediction algorithm.
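The decompose-and-connect idea can be illustrated with a minimal Python sketch. This is not the SubSyn algorithm as published: it assumes trajectories are lists of hashable location labels, that a simple baseline answers a query only when the query appears as a contiguous part of some historical trajectory, and that a synthesised path is scored by multiplying first-order transition frequencies. The function names (`decompose`, `transition_probabilities`, `baseline_matches`, `synthesised_path_probability`) and the toy data are hypothetical.

```python
from collections import defaultdict

def decompose(trajectories):
    """Count every two-location sub-trajectory (src -> dst) in the history."""
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for src, dst in zip(traj, traj[1:]):
            counts[src][dst] += 1
    return counts

def transition_probabilities(counts):
    """Normalise the counts leaving each location into probabilities."""
    return {
        src: {dst: n / sum(nxt.values()) for dst, n in nxt.items()}
        for src, nxt in counts.items()
    }

def baseline_matches(query, trajectories):
    """Assumed baseline: the query must occur contiguously in one trajectory."""
    k = len(query)
    return any(
        traj[i:i + k] == query
        for traj in trajectories
        for i in range(len(traj) - k + 1)
    )

def synthesised_path_probability(query, probs):
    """Score a query by chaining two-location sub-trajectories.

    Returns 0.0 if any step of the query was never observed.
    """
    p = 1.0
    for src, dst in zip(query, query[1:]):
        p *= probs.get(src, {}).get(dst, 0.0)
    return p

# Toy history (hypothetical data): no single trajectory contains A -> B -> D.
history = [["A", "B", "C"], ["B", "D"]]
probs = transition_probabilities(decompose(history))
query = ["A", "B", "D"]

print(baseline_matches(query, history))            # no historical match
print(synthesised_path_probability(query, probs))  # non-zero via synthesis
```

In this toy case the query matches no historical trajectory, yet it still receives a non-zero score because both of its steps were observed in different trajectories. That is the sense in which synthesis expands the set of answerable queries.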

Notes

  1. The demonstration system can be accessed following this link: http://spatialanalytics.cis.unimelb.edu.au/subsyndemo/.

  2. Note the difference between \(p_{i\rightarrow j}\) and \(p_{ij}\). The latter (without the arrow) is the transition probability defined in the Markov model; its definition is given in Eq. (4).

  3. Consecutive cells are two cells next to each other in a trajectory; adjacent cells are two cells next to each other in a grid.
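The distinction in note 3 can be made concrete with a short sketch. It assumes cells are identified by `(row, col)` grid coordinates and that "next to each other in a grid" means sharing an edge; whether diagonal neighbours also count is not stated here, so that choice is an assumption. The helper names are hypothetical.

```python
def are_adjacent(cell_a, cell_b):
    """True if two grid cells share an edge (4-neighbourhood, an assumption)."""
    (r1, c1), (r2, c2) = cell_a, cell_b
    return abs(r1 - r2) + abs(c1 - c2) == 1

def consecutive_pairs(trajectory):
    """Pairs of cells that follow one another in a trajectory."""
    return list(zip(trajectory, trajectory[1:]))

# Hypothetical trajectory: the second step skips a cell in the grid.
trajectory = [(0, 0), (0, 1), (2, 1)]
for a, b in consecutive_pairs(trajectory):
    print(a, b, "adjacent" if are_adjacent(a, b) else "consecutive but not adjacent")
```

Here the first pair is both consecutive and adjacent, while the second pair is consecutive in the trajectory without being adjacent in the grid.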

Acknowledgments

This work is supported by the Australian Research Council (ARC) Discovery Project DP130104587 and the Australian Research Council (ARC) Future Fellowships Project FT120100832.

Author information

Corresponding author

Correspondence to Rui Zhang.

About this article

Cite this article

Xue, A.Y., Qi, J., Xie, X. et al. Solving the data sparsity problem in destination prediction. The VLDB Journal 24, 219–243 (2015). https://doi.org/10.1007/s00778-014-0369-7

  • DOI: https://doi.org/10.1007/s00778-014-0369-7
