Vector Field k-Means: Clustering Trajectories by Fitting Multiple Vector Fields

Lecture / Panel
For NYU Community

Speaker: Carlos Scheidegger, AT&T Labs


Scientists study trajectory data to understand trends in movement patterns, including the migration of animals, storm tracks, and human mobility for traffic analysis and urban planning. With the advance of tracking technology, the amount of trajectory data being gathered is steadily increasing. In this talk, we present a novel technique for extracting movement patterns from trajectory data that we call vector field k-means. The central idea of our approach is to use vector fields to induce a similarity notion between trajectories. Clustering algorithms typically seek a representative trajectory that best describes each cluster, much like k-means identifies a representative center for each cluster. In contrast, our approach is based on the premise that movement trends in trajectory data can be modeled as flows within multiple vector fields, and the vector field itself is what defines each of the clusters. We demonstrate how vector field k-means can be used to mine patterns from trajectory data and present experimental evidence of its effectiveness and efficiency using several datasets, including historical hurricane data, GPS tracks of people and vehicles, and anonymous call records from a large phone company.
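To make the idea of a vector field defining a cluster concrete, here is a minimal sketch of the alternating fit-and-assign loop the abstract suggests. It is an illustration under simplifying assumptions, not the method presented in the talk: each field is stored as one constant vector per cell of a coarse grid, no smoothness term is used, and the names `velocity_samples`, `cell_index`, and `vector_field_kmeans` as well as the parameters `res`, `iters`, and `seed` are hypothetical.

```python
import numpy as np

def velocity_samples(traj):
    """Return segment midpoints and finite-difference velocities of an (n, 2) polyline."""
    traj = np.asarray(traj, dtype=float)
    return 0.5 * (traj[1:] + traj[:-1]), traj[1:] - traj[:-1]

def cell_index(points, lo, hi, res):
    """Map 2D points to flat indices of a res x res grid covering the box [lo, hi]."""
    ij = np.floor((points - lo) / (hi - lo) * res).astype(int)
    ij = np.clip(ij, 0, res - 1)
    return ij[:, 0] * res + ij[:, 1]

def vector_field_kmeans(trajs, k, res=4, iters=20, seed=0):
    """Cluster trajectories by alternately fitting k grid fields and reassigning.

    Assumption: piecewise-constant fields with no regularization.
    """
    rng = np.random.default_rng(seed)
    samples = [velocity_samples(t) for t in trajs]
    all_pts = np.vstack([np.asarray(t, dtype=float) for t in trajs])
    # Pad the upper bound slightly so the box never has zero extent.
    lo, hi = all_pts.min(axis=0), all_pts.max(axis=0) + 1e-9
    cells = [cell_index(p, lo, hi, res) for p, _ in samples]
    labels = rng.integers(k, size=len(trajs))  # random initial assignment
    fields = np.zeros((k, res * res, 2))
    for _ in range(iters):
        # Fit step: each cell of a field holds the mean velocity of the
        # samples that the field's member trajectories contribute there.
        fields[:] = 0.0
        counts = np.zeros((k, res * res))
        for (_, v), c, lab in zip(samples, cells, labels):
            np.add.at(fields[lab], c, v)
            np.add.at(counts[lab], c, 1)
        fields /= np.maximum(counts, 1)[..., None]
        # Assign step: each trajectory moves to the field that best matches
        # its velocities (smallest summed squared error).
        errors = np.array([[np.sum((fields[j][c] - v) ** 2) for j in range(k)]
                           for (_, v), c in zip(samples, cells)])
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, fields

# Toy data: three tracks heading in +x and two heading in -x along the same line.
rightward = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
leftward = rightward[::-1]
demo_trajs = [rightward, rightward, rightward, leftward, leftward]
labels, fields = vector_field_kmeans(demo_trajs, k=2)
```

In this toy case the tracks occupy exactly the same positions and differ only in direction, so the split comes entirely from comparing track velocities against the two fitted fields. The opposing tracks end up in different clusters, and each cluster's field points the way its members move.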


Carlos Scheidegger is a Member of Technical Staff at AT&T Labs in Florham Park, NJ. He joined AT&T in September 2009 after finishing his PhD at the SCI Institute and the School of Computing at the University of Utah. In graduate school, he worked in scientific visualization and geometric processing, with applications in mesh generation and point-based surface representations. His thesis work focused on managing the provenance of scientific visualizations: the process by which users arrive at the visualizations they design. During that period, he was one of the main designers and developers of VisTrails, a new visualization system whose goal is to provide scientists with an infrastructure that supports the exploration process inherent in developing effective visualizations. His current research interests are in data visualization, geometric processing, and computer graphics.