Editorial
Efficient mining of platoon patterns in trajectory databases

https://doi.org/10.1016/j.datak.2015.02.001

Abstract

The widespread use of localization technologies produces increasing quantities of trajectory data. An important task in the analysis of trajectory data is the discovery of moving object clusters, i.e., moving objects that travel together for a period of time. Algorithms for the discovery of moving object clusters operate by applying constraints on the consecutiveness of timestamps. However, existing approaches either use a very strict timestamp constraint, which may result in the loss of interesting patterns, or a very relaxed timestamp constraint, which risks discovering noisy patterns. To address this challenge, we introduce a new type of moving object pattern called the platoon pattern.

We propose a novel algorithm to efficiently retrieve platoon patterns in large trajectory databases, using several pruning techniques. Our experiments on both real data and synthetic data evaluate the effectiveness and efficiency of our approach and demonstrate that our algorithm is able to achieve several orders of magnitude improvement in running time, compared to an existing method for retrieving moving object clusters.

Introduction

With the increasing availability of position-aware devices such as GPS receivers and mobile phones, it is now possible to collect and analyze large volumes of location data that describe the trajectories of moving objects. Well-known examples include taxi position data [1], animal movement data [2] and eye tracking data [3].

We address an important data mining challenge for trajectory data: discovering groups of spatial objects that move together for a certain period. We propose a new type of pattern, the platoon pattern, that describes object clusters that stay together for time segments, each with some minimum consecutive duration. Fig. 1(a) shows an example of a platoon pattern. Wedding party vehicles o2, o3, o4 and o5 move together as a platoon at consecutive timestamps t1 and t2, as well as at consecutive timestamps t4 and t5.

The discovery of platoon patterns has a range of real-world applications. The identification of common routes among convoys may lead to more effective traffic control and the early discovery of truck platoons may assist traffic planning to avoid congestion. In eye tracking applications [3], the identification of common areas being viewed by a group of viewers can be used in advertising design and movie filming. In ecology, platoon patterns may provide a deeper understanding of animal migrations and in security may assist police to identify suspicious crowd movements.

Several recent approaches for discovering moving object clusters have been reported in the literature, but they are not directly applicable for mining platoon patterns. We use “moving object cluster” as a generic term in our paper.

Previous work has proposed patterns, such as flock [4], [5], [6] and convoy patterns [7], [8], that capture moving objects traveling together for at least k consecutive timestamps. These patterns commonly require that all timestamps are strictly (or globally) consecutive. As pointed out in [9], enforcing timestamp consecutiveness may lead to the loss of interesting patterns. For instance, in Fig. 1(a) with k = 3, there are no convoy or flock patterns, since the four objects split into two clusters at t3 due to a red traffic light, before coming together again at t4. In our opinion, these four objects are an interesting moving object cluster.

Swarm patterns [9] take the opposite approach and remove any consecutiveness constraint on timestamps. While this provides more latitude with regard to the movement of clusters, it may also mine patterns that are overly “loose”. Consider the example in Fig. 1(b) and assume we require at least k = 3 timestamps. Two vehicles (moving objects o2 and o3) might randomly encounter each other at some isolated and non-consecutive times (t5, t37 and t103), e.g., when refueling at the same petrol station or stopping at the same car park. This does not imply that the drivers have a strong association with each other. Although one might avoid outputting this type of pattern by imposing a larger threshold value for the minimum number of timestamps (e.g., k = 4 timestamps), this would risk missing patterns with two objects that do move together over shorter consecutive durations (such as t2, t3 and t4). Another alternative would be to first mine all swarm patterns and then filter the interesting ones. Such an approach is time consuming, however, since the postprocessing constraints are not pushed inside the swarm mining task. Indeed, our experiments will show that the set of swarm patterns can be extremely large yet contain only a small proportion of platoon patterns.
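The difference between the two existing timestamp constraints can be made concrete with a small sketch. It assumes timestamps are encoded as integers (t1 as 1, t2 as 2, and so on) and that the objects are already known to be clustered together at the listed timestamps; the function names are illustrative and do not come from the cited work.

```python
def longest_consecutive_run(timestamps):
    """Length of the longest run of consecutive integers among the timestamps."""
    best = current = 0
    previous = None
    for t in sorted(set(timestamps)):
        # Extend the current run if t directly follows the previous timestamp.
        current = current + 1 if previous is not None and t == previous + 1 else 1
        best = max(best, current)
        previous = t
    return best


def satisfies_global_consecutive(timestamps, k):
    """Convoy/flock-style check: at least k strictly consecutive timestamps."""
    return longest_consecutive_run(timestamps) >= k


def satisfies_swarm(timestamps, k):
    """Swarm-style check: at least k timestamps, no consecutiveness required."""
    return len(set(timestamps)) >= k


# Assumed integer encodings of the timestamps mentioned for Fig. 1.
fig_1a = [1, 2, 4, 5]    # the four vehicles are together at t1, t2, t4, t5
fig_1b = [5, 37, 103]    # o2 and o3 meet only at isolated timestamps

print(satisfies_global_consecutive(fig_1a, k=3))  # False: longest run is 2
print(satisfies_swarm(fig_1b, k=3))               # True: three timestamps suffice
```

With k = 3, the strict constraint rejects the group in Fig. 1(a) while the relaxed constraint accepts the chance encounters in Fig. 1(b), which is the gap the text describes.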

Motivated by these issues, we propose a new definition for a moving object cluster called the platoon pattern, which allows the user to control the behavior of the consecutive time constraint to suit particular applications. Compared to the globally consecutive timestamp constraint of the convoy pattern [8], a platoon only requires that the timestamps are locally consecutive. Platoon patterns allow gap(s) in timestamps, but each consecutive time segment must have a minimum length (be locally consecutive). Given (1) a trajectory database with a timestamp-annotated history for moving objects, (2) a threshold for the minimum number of objects mino that must appear in the platoon, (3) a threshold for the minimum number of timestamps mint for which those objects travel together and (4) a threshold for the minimum number of consecutive timestamps minc, a platoon pattern is an objectset and an associated timestamp sequence, denoted as {O : T}, such that |O| ≥ mino, |T| ≥ mint and the timestamps in T are at least minc locally consecutive. Intuitively, minc denotes the minimum duration of a time segment in which objects stay together consecutively. In addition, platoon patterns do not rely on a particular clustering technique for deciding the spatial closeness of objects; clustering is instead modeled as a preprocessing step (cf. Section 3 for our problem definition). The objects are only required to have been clustered.
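As a minimal sketch of how the three thresholds interact, the check below tests a candidate {O : T} against mino, mint and minc. It assumes integer timestamps and reads "at least minc locally consecutive" as "every maximal run of consecutive timestamps in T has length at least minc"; that reading, and the function and parameter names, are assumptions rather than the paper's formal definition. Spatial clustering is taken as already done.

```python
def consecutive_segments(timestamps):
    """Split timestamps into maximal runs of consecutive integers."""
    segments = []
    for t in sorted(set(timestamps)):
        if segments and t == segments[-1][-1] + 1:
            segments[-1].append(t)   # t extends the current run
        else:
            segments.append([t])     # t starts a new run
    return segments


def is_platoon(objects, timestamps, min_o, min_t, min_c):
    """Threshold check only; assumes `objects` are clustered at every timestamp."""
    segments = consecutive_segments(timestamps)
    return (
        len(set(objects)) >= min_o
        and len(set(timestamps)) >= min_t
        and all(len(segment) >= min_c for segment in segments)
    )


# Fig. 1(a)-style input: together at t1, t2, t4, t5 (gap at t3).
print(is_platoon({"o2", "o3", "o4", "o5"}, [1, 2, 4, 5], min_o=3, min_t=3, min_c=2))  # True

# Fig. 1(b)-style input: isolated encounters only.
print(is_platoon({"o2", "o3"}, [5, 37, 103], min_o=2, min_t=3, min_c=2))  # False
```

Setting min_c = 1 makes the local constraint vacuous, so the check then behaves like the relaxed, swarm-style requirement of at least min_t timestamps.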

Compared to the swarm query, with the combination of mint and minc, a platoon query is able to capture the patterns with consecutive timestamps without returning loose patterns. For example, if we set mino = 3, mint = 3 and minc = 2, then Fig. 1(a) contains the platoon pattern {o2, o3, o4, o5 : t1, t2, t4, t5}. The objects are not considered to form a platoon pattern at timestamp t3, since the spatial distance between o3 and o4 is greater than the maximum distance enforced by the clustering algorithm used. To avoid redundancy in the set of platoon patterns, we employ the notion of a closed platoon pattern. {O : T} is a closed platoon if there is no platoon {O′ : T′} for which either i) O ⊂ O′ and T = T′ or ii) O = O′ and T ⊂ T′. For example, {o2, o3, o4 : t1, t2, t4} is not a closed platoon, since there is the platoon {o2, o3, o4, o5 : t1, t2, t4, t5}.
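The closedness test can be sketched as a filter over a set of platoons that has already been mined. The sketch reads conditions i) and ii) as proper-subset tests on the objectset and on the timestamp set respectively; the representation of a platoon as a pair of frozensets and the sample data are assumptions made for illustration.

```python
def is_closed(candidate, platoons):
    """True if no other platoon extends the candidate under condition i) or ii)."""
    objects, times = candidate
    for other_objects, other_times in platoons:
        # i) strictly more objects over the same timestamps
        if objects < other_objects and times == other_times:
            return False
        # ii) the same objects over strictly more timestamps
        if objects == other_objects and times < other_times:
            return False
    return True


def closed_platoons(platoons):
    """Keep only the closed platoons from a list of (objects, times) pairs."""
    return [p for p in platoons if is_closed(p, platoons)]


# Hypothetical mined platoons (integer timestamps, frozensets for set comparison).
full = (frozenset({"o2", "o3", "o4", "o5"}), frozenset({1, 2, 4, 5}))
fewer_objects = (frozenset({"o2", "o3", "o4"}), frozenset({1, 2, 4, 5}))
fewer_times = (frozenset({"o2", "o3", "o4", "o5"}), frozenset({1, 2}))

print(closed_platoons([full, fewer_objects, fewer_times]) == [full])  # True
```

This quadratic post-filter is only meant to illustrate the definition; it is not how closed platoons are extracted during mining.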

Platoon patterns can capture the co-location behavior of moving objects for eye tracking datasets. We first explain the nature of an eye tracking dataset. Fig. 2(a) shows a snapshot of a movie containing a dialog between two characters. An eye tracking dataset records trajectories of the viewers' eye movements during the movie. A heat map, shown in Fig. 2(b), represents eye tracking data without time information; its density indicates the areas on which viewers focus their eyes. Red (dark gray in B&W) areas are those where viewers looked most of the time, green (light gray in B&W) areas received little attention, and non-colored areas were not looked at.

For eye tracking data, the viewers' eye positions equate to objects, while the time dimension of the movie describes how the viewers' gaze varies (how the objects move). Fig. 2 shows that there are three dense regions, R1, R2 and R3, where viewers frequently focus their attention. During a conversation between the two characters in a movie, the viewers switch their focus between these two characters. Since there is nothing interesting in the background, we would expect that R1 and R3 should be considered as the “interesting” regions. Region R2 is unlikely to be of interest, as it is simply the result of eye movements between the two characters. The discovery of common eye movement patterns (moving object patterns) has applications in advertising, since they can guide product placement.

Compared to platoon patterns, convoy and swarm patterns are less suitable for eye tracking. Convoy patterns are determined by a globally consecutive timestamp constraint, and regions R1 and R3 would be missed, as it is unlikely that viewers look at the same region consecutively for the whole period (Fig. 2(a)). Swarm patterns have no time consecutiveness constraint, and region R2 will be output (Fig. 2(b)), since it has been visited frequently (but not continuously). Platoon patterns use a local consecutive timestamp constraint, and only patterns in R1 and R3 are output (Fig. 2(c)), since they attract continuous focus.

Efficient mining of platoon patterns in a large trajectory database is challenging. As the number of objects increases, the number of candidate patterns grows exponentially. We propose a closed platoon pattern mining algorithm called PlatoonMiner to address this issue. Four pruning techniques (frequent-consecutive pruning, object pruning, subset pruning and common prefix pruning) reduce the search space. The common prefix pruning rule is also able to directly extract closed platoons during the computation of platoon queries. Our experiments will demonstrate the effectiveness and the scalability of our proposed algorithm. In summary, we make the following contributions:

  • We introduce a more flexible type of moving object cluster pattern, the platoon pattern.

  • We propose a novel efficient algorithm PlatoonMiner for mining platoon patterns.

  • We experimentally show the scalability of PlatoonMiner using real-world and synthetic datasets. Our algorithm can be several orders of magnitude faster than a swarm pattern mining algorithm.

Section snippets

Related work

We survey existing work on discovering moving cluster patterns and describe representative methods.

Problem definition

Let TS = {t1, t2, ..., tn} be a linearly ordered set of timestamps of a trajectory history (called time space). Let OS = {o1, o2, ..., om} be a collection of objects that appear in TS (called object space). An object oi ∈ OS is observed at (possibly non-consecutive) timestamps T ⊆ TS. We refer to T as a timestamp sequence and its length is |T|. A trajectory database stores the trajectories of individual objects at distinct time points. A set of moving objects O (called objectset) that travels together as a

Retrieval of closed platoons

The definition of closed platoons suggests a simple way to retrieve all closed platoon patterns. First build an enumeration tree of either the object or the time space, and then traverse this tree. The tree contains every combination of objects (or timestamps), visited in depth-first search (DFS) or breadth-first search (BFS) order. The enumeration tree has 2^|OS| (or 2^|TS|) nodes and this exhaustive search has a time complexity of O(2^|OS| · |TS| · |OS|), since at each node we need to scan TS (OS) to calculate T
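The exhaustive object-space enumeration described in this section can be sketched as follows. This is a brute-force baseline under stated assumptions, not PlatoonMiner: the input format (a mapping from integer timestamp to a list of precomputed clusters), the reading of minc as a minimum length for every maximal consecutive run, and all function names are hypothetical.

```python
from itertools import combinations


def consecutive_segments(timestamps):
    """Split timestamps into maximal runs of consecutive integers."""
    segments = []
    for t in sorted(set(timestamps)):
        if segments and t == segments[-1][-1] + 1:
            segments[-1].append(t)
        else:
            segments.append([t])
    return segments


def timestamps_together(objectset, clusters_by_time):
    """Scan the time space: timestamps at which one cluster contains the objectset."""
    return [
        t
        for t, clusters in sorted(clusters_by_time.items())
        if any(objectset <= cluster for cluster in clusters)
    ]


def brute_force_closed_platoons(clusters_by_time, min_o, min_t, min_c):
    """Enumerate every objectset with at least min_o objects (exponential in |OS|)."""
    objects = sorted(set().union(*(c for cs in clusters_by_time.values() for c in cs)))
    platoons = {}
    for size in range(min_o, len(objects) + 1):
        for combo in combinations(objects, size):
            objectset = frozenset(combo)
            together = timestamps_together(objectset, clusters_by_time)
            # Keep only runs that meet min_c; this yields the largest valid T.
            kept = [t for seg in consecutive_segments(together)
                    if len(seg) >= min_c for t in seg]
            if len(kept) >= min_t:
                platoons[objectset] = tuple(kept)
    # Drop objectsets that a strict superset matches over the same timestamps.
    return {
        objs: times
        for objs, times in platoons.items()
        if not any(objs < other and times == other_times
                   for other, other_times in platoons.items())
    }


# Hypothetical clusters loosely modeled on Fig. 1(a): the group splits at t3.
group = {"o2", "o3", "o4", "o5"}
clusters_by_time = {
    1: [group], 2: [group],
    3: [{"o2", "o3"}, {"o4", "o5"}],
    4: [group], 5: [group],
}
result = brute_force_closed_platoons(clusters_by_time, min_o=3, min_t=3, min_c=2)
print(result == {frozenset(group): (1, 2, 4, 5)})  # True
```

Because each objectset is paired with its largest valid timestamp set, only the object-side closedness condition needs an explicit check here. The number of enumerated objectsets still grows exponentially with the number of objects.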

Experiments

We conducted extensive experiments to evaluate the performance of PlatoonMiner using both real-world and synthetic datasets. The efficiency of PlatoonMiner was mainly compared against ObjectGrowth [9] for non-overlapping datasets in Sections 5.1 (Evaluation on real datasets) and 5.2 (Evaluation on synthetic datasets). ObjectGrowth is adopted as the baseline in our experiments as it is the fastest known algorithm that can mine swarm patterns. In Section 5.3, we compare PlatoonMiner with ObjectGrowth* (cf.

Conclusions

In this paper we have formalized the concept of platoon patterns. Unlike previously proposed patterns, the platoon query is more flexible and retrieves temporal object clusters according to different levels of temporal consecutiveness. To efficiently discover platoon patterns in large-scale datasets, we introduced the PlatoonMiner algorithm, which employs four types of pruning rules to discover the set of closed platoons. Our experiment using eye movement data qualitatively demonstrated the

Yuxuan Li received his Ph.D. degree in Computer Science from the University of Melbourne, Australia in 2015. He obtained a Master's degree from RMIT University, Australia in 2010 and a Bachelor's degree from Guangdong University of B.S., China in 2008. His research interests include spatial data mining, uncertain sequential pattern mining, and imbalanced classification.

References (29)

  • Y. Zheng et al.

    Mining interesting locations and travel sequences from GPS trajectories

  • ...
  • T. Judd et al.

    Learning to predict where humans look

  • P. Laube et al.

    Analyzing relative motion within groups of trackable moving point objects

  • J. Gudmundsson et al.

    Computing longest duration flocks in trajectory data

  • M. Vieira et al.

    On-line discovery of flock patterns in spatio-temporal data

  • H. Jeung et al.

    Convoy queries in spatio-temporal databases

  • H. Jeung et al.

    Discovery of convoys in trajectory databases

  • Z. Li et al.

    Swarm: mining relaxed temporal moving object clusters

  • J. Gudmundsson et al.

    Efficient detection of motion patterns in spatio-temporal data sets

  • P. Kalnis et al.

    On discovering moving clusters in spatio-temporal data

  • Z. Li et al.

    Attraction and avoidance detection from movements

  • J. Lee et al.

    Trajectory clustering: a partition-and-group framework

  • Y. Li et al.

    Clustering moving objects

    James Bailey is a Professor and Australian Research Council (ARC) Future Fellow in the Department of Computing and Information Systems at the University of Melbourne. He has an extensive track record in databases and data mining and has been chief investigator on multiple ARC discovery grants. He has been the recipient of five best paper awards and is an active member of the knowledge discovery community. He is an Associate Editor for the journals IEEE Transactions on Knowledge and Data Engineering, Knowledge and Information Systems and Social Network Analysis and Mining. He regularly serves as a Senior PC member for top conferences in data mining and he will be the co-general chair for ACM CIKM 2015 Conference.

    Dr Lars Kulik is an Associate Professor in the Department of Computing and Information Systems at the University of Melbourne. His overall research goal is to integrate spatial information into pervasive computing systems that anticipate, adapt and respond to the needs of users, and provide services based on the user's location and context. His research focuses on spatial algorithms in pervasive computing environments, methods for safeguarding location privacy, efficient algorithms for moving objects and spatial data mining, information dissemination algorithms in sensor networks, and robust algorithms that can cope with imperfection, especially in the context of mobile computing.

    This research is supported under the Australian Research Council's Discovery Projects funding schema (project number DP110100757).
