Sunday, July 21, 2013

Predicting Future Locations Paper by Sadilek and Krumm

“Far Out: Predicting Long-Term Human Mobility” by Adam Sadilek and John Krumm is an interesting paper that says you can predict where someone will be over the long term. There is a good non-technical summary by Camille Sweeney and Josh Gosfield of Fast Company.

The idea is that people move in patterns, so you can predict where someone will be in the future from where they have been in the past. The authors recorded the movements of 703 subjects (307 people and 396 vehicles) for 7 to 1247 contiguous days each, with an average of 45.9 days and a standard deviation of 117.8 days. (I’m guessing the 1247 was one of the authors.) In total they had 33,268 days of location data.
They used Fourier analysis to find the periodicities in the subjects' movements and used singular value decomposition (SVD), the matrix factorization commonly used to compute principal component analysis (PCA), to reduce the dimensionality of the data and to form predictive weights.
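
To make those two steps concrete, here is a minimal sketch on made-up data. It is not the authors' code or data: the 28-day "at home" schedule, the variable names, and the use of NumPy are all my assumptions. It looks for the strongest period with an FFT and then runs PCA through an SVD on the day-by-hour matrix.

```python
import numpy as np

# Synthetic data (an assumption, not the paper's): 28 days x 24 hours,
# 1.0 when the subject is "at home" during that hour, 0.0 otherwise.
days, hours = 28, 24
at_home = np.ones((days, hours))
for d in range(days):
    if d % 7 < 5:                 # days 0-4 of each week are workdays
        at_home[d, 8:18] = 0.0    # away from 8 am to 6 pm

# Fourier step: find the strongest periodicity in the hourly signal.
signal = at_home.ravel()
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1.0)   # cycles per hour
peak = int(np.argmax(spectrum[1:])) + 1       # skip the zero-frequency bin
dominant_period_hours = 1.0 / freqs[peak]

# PCA step via SVD: rows are days, columns are hours of the day.
centered = at_home - at_home.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = S**2 / np.sum(S**2)    # share of variance per component
eigenday = Vt[0]                   # dominant daily pattern
scores = U[:, 0] * S[0]            # each day's weight on that pattern

print(f"dominant period: {dominant_period_hours:.1f} hours")
print(f"variance explained by first component: {explained[0]:.2f}")
```

With this toy schedule the strongest period is the 24-hour daily cycle, and because there are only two kinds of day (workday and weekend) a single component captures all of the variation. Real location data would need many more components.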

They broke the surface of the globe into triangular cells to make the locations and movements discrete, and broke the day into discrete blocks as well. The authors organized the data as the probabilities that a subject would be at each of 11 particular locations in each of 24 hourly blocks, by day, with a separate block for holidays.
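
Here is a rough sketch of what that kind of discretization could look like. To keep it short I swapped the paper's triangular cells for a plain square latitude/longitude grid, and the names `cell_id`, `day_vector`, and `CELL_DEG`, the cell size, and the exact layout of the vector are hypothetical choices of mine, not the authors' format.

```python
import math
from datetime import date, datetime

CELL_DEG = 0.01  # assumed cell size in degrees (hypothetical choice)

def cell_id(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to a square grid cell (a stand-in for triangular cells)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def day_vector(fixes, top_cells, day, is_holiday=False):
    """Build one day's feature vector.

    fixes: list of (datetime, lat, lon) GPS fixes for that day.
    top_cells: the subject's most visited cells, in a fixed order.
    Layout: 24 * len(top_cells) hour-by-cell flags, then 7 day-of-week
    flags, then 1 holiday flag.
    """
    index = {c: i for i, c in enumerate(top_cells)}
    k = len(top_cells)
    vec = [0] * (24 * k)
    for ts, lat, lon in fixes:
        c = cell_id(lat, lon)
        if c in index:                       # ignore rarely visited cells
            vec[ts.hour * k + index[c]] = 1
    weekday = [0] * 7
    weekday[day.weekday()] = 1               # Monday is 0, Sunday is 6
    return vec + weekday + [1 if is_holiday else 0]

# Usage with two made-up fixes on the date of this post.
home = cell_id(47.6423, -122.1391)
office = cell_id(47.6062, -122.3321)
fixes = [
    (datetime(2013, 7, 21, 9, 15), 47.6423, -122.1391),
    (datetime(2013, 7, 21, 14, 5), 47.6062, -122.3321),
]
vec = day_vector(fixes, [home, office], date(2013, 7, 21))
print(len(vec), sum(vec))
```

Stacking one such vector per day gives the kind of day-by-feature matrix that Fourier analysis and PCA can then work on.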

Using the past data, the authors formed predictive models to predict the locations of people up to 80 weeks in the future. The results were above 80% accurate and better than their baseline.

Normally, I do not like using PCA for dimensionality reduction because the top PCA (aka eigen) features may not be the features you need for your modeling goals. For my radar automatic target recognition work, my team and I picked features manually (using stuff like the length and width of targets) because we knew the size of the vehicles and the physical characteristics we were looking for. With huge data sets where you may not know which features are best, though, PCA could be a better way to go. I have also used the discrete cosine transform (DCT) for really large 3D volumes because my PC did not have enough RAM to handle the matrix transforms of PCA.


I was skeptical of this paper when I first looked at it because I did not think people’s movements were regular enough to be good for prediction, but given their accuracy, people are more predictable than I thought. When you consider the accuracy values, you also have to consider that most people sleep 6 to 9 hours a night, or 42 to 63 hours a week, and work 8 hours a day, or 40 hours a week, and those schedules are fairly regular. So the authors really only need to account for the remaining 72 or so hours of the 168-hour week.
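
That back-of-the-envelope number is easy to check. This little sketch (the function name is mine) just subtracts sleep and work from the 168 hours in a week:

```python
HOURS_PER_WEEK = 7 * 24   # 168

def unaccounted_hours(sleep_per_night, work_per_week=40):
    """Hours per week left over after sleep and work."""
    return HOURS_PER_WEEK - 7 * sleep_per_night - work_per_week

for sleep in (6, 8, 9):
    print(sleep, "hours of sleep a night leaves", unaccounted_hours(sleep))
```

Six hours of sleep a night leaves 86 hours, eight leaves 72, and nine leaves 65, so the "about 72" figure corresponds to someone who sleeps eight hours a night.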

A system like this has applications in marketing and demographics as well as in security and defense.
