“Far Out: Predicting Long-Term Human Mobility” by Adam Sadilek and John Krumm is an interesting
paper that says you can predict where someone will be far into the future. There is a good non-technical summary by Camille Sweeney and Josh Gosfield of Fast Company.
The idea is that people move in patterns, and you can predict
where someone will be in the future based on where they have been in the past. The authors
recorded the movements of 703 subjects (307 people and 396 vehicles) for between 7 and
1247 contiguous days, with an average of 45.9 days and a standard
deviation of 117.8 days. (I’m guessing the 1247 was one of the authors.) In total, they had
32,268 days of location data.
They used Fourier analysis to find the periodicities in the
movements of the subjects and used singular value decomposition (SVD), the matrix factorization behind principal
component analysis (PCA), to reduce the dimensionality of the data and to form
predictive weights.
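To make those two steps concrete, here is a minimal sketch in Python with NumPy. It is not the authors' code: the data is synthetic (an hourly "at home" indicator I made up), and the variable names and the choice of three components are my own assumptions. It uses an FFT to find the dominant period in the signal and then an SVD of the mean-centered day-by-hour matrix to get a few "typical day" components and per-day weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption, not the paper's data): 56 days of hourly
# indicators, 1.0 when the subject is at home and 0.0 otherwise.
n_days, hours_per_day = 56, 24
t = np.arange(n_days * hours_per_day)
hour_of_day = t % hours_per_day
weekend = ((t // hours_per_day) % 7) >= 5
at_home = ((hour_of_day < 8) | (hour_of_day >= 18) | weekend).astype(float)

# Flip about 5% of the hours to mimic irregular behavior.
flip = rng.random(at_home.size) < 0.05
at_home[flip] = 1.0 - at_home[flip]

# Fourier analysis: find the strongest periodicity in the hourly signal.
centered = at_home - at_home.mean()
spectrum = np.abs(np.fft.rfft(centered))
freqs = np.fft.rfftfreq(centered.size, d=1.0)  # cycles per hour
peak = np.argmax(spectrum[1:]) + 1             # skip the zero-frequency bin
dominant_period_hours = 1.0 / freqs[peak]

# PCA via SVD: one row per day, one column per hour.
days = at_home.reshape(n_days, hours_per_day)
mean_day = days.mean(axis=0)
U, s, Vt = np.linalg.svd(days - mean_day, full_matrices=False)

k = 3                                  # number of components kept (arbitrary)
weights = U[:, :k] * s[:k]             # per-day weights on each component
approx = mean_day + weights @ Vt[:k]   # low-dimensional reconstruction
explained = (s[:k] ** 2).sum() / (s ** 2).sum()

print(dominant_period_hours, explained)
```

For this made-up signal the spectral peak should land on the daily cycle, and the leading component mostly separates weekdays from weekends. Real location traces would be far messier than this.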
They broke the surface of the globe into triangular cells to
make the locations and movements discrete, and broke the day into discrete
blocks as well. The authors structured the data as the probabilities
that a subject would be at each of 11 particular locations in each of 24 hourly blocks,
organized by day, with a separate block for holidays.
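As a rough illustration of that kind of encoding, here is one plausible way to turn a single day into a flat vector. This is my own guess at a layout, not the paper's exact format: the function name `encode_day`, the integer cell indices, and the day-of-week block are all assumptions, and I am not attempting the triangular grid itself.

```python
from datetime import date

import numpy as np


def encode_day(day, hourly_cells, n_cells, holidays=frozenset()):
    """Encode one day as a flat vector (hypothetical layout).

    hourly_cells holds 24 entries, one per hour: the index of the cell the
    subject was in (0 .. n_cells - 1), or None if there is no reading.
    """
    # One row per hour, one column per cell; a 1.0 marks the occupied cell.
    presence = np.zeros((24, n_cells))
    for hour, cell in enumerate(hourly_cells):
        if cell is not None:
            presence[hour, cell] = 1.0

    # One-hot day of the week (Monday is index 0) plus a holiday flag.
    weekday = np.zeros(7)
    weekday[day.weekday()] = 1.0
    holiday = np.array([1.0 if day in holidays else 0.0])

    return np.concatenate([presence.ravel(), weekday, holiday])


# Example: cell 0 overnight, cell 1 during the day, on a holiday.
hourly = [0] * 8 + [1] * 9 + [0] * 7
july4 = date(2012, 7, 4)
vec = encode_day(july4, hourly, n_cells=11, holidays={july4})
print(vec.shape)
```

Stacking one such vector per day gives a matrix that a method like PCA can work on. Averaging the vectors over many days would give per-hour probabilities of being in each cell.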
Using the past data, the authors formed
predictive models to predict the locations of people up to 80 weeks in the future. The
results were above 80% accurate and better than their baseline.
Normally, I do not like using PCA for dimensionality
reduction because the top PCA (aka eigen) features may not be the features you need for your modeling goals. For the radar automatic target recognition work, my team and I picked
features manually (such as the length and width of targets) to identify
them because we knew the size of the vehicles and the physical characteristics we were looking for, but with huge data sets where you may not know which features are best,
PCA could be a better way to go. I have also used the discrete cosine transform (DCT) for
really large 3D volumes because my PC did not have enough RAM to handle the
matrix transforms of PCA.
I was skeptical of this paper when I first looked at it
because I did not think people’s movements were regular enough to be good for prediction, but given their
accuracy, people are more predictable than I thought. When you consider the accuracy values, you also have to consider that most people sleep 6 to 9 hours a night (42 to 63 hours a week) and work 8 hours a day (40 hours a week), and those schedules are fairly regular. So, the authors really only need to account for the remaining 72 or so hours in a week (168 hours, minus about 56 of sleep and 40 of work).
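The hours arithmetic is easy to check. This is just my back-of-the-envelope calculation, not anything from the paper. The function name is made up, and the 72-hour figure assumes 8 hours of sleep a night.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def unscheduled_hours(sleep_per_night, work_per_week=40):
    """Hours per week not taken up by sleep or work."""
    return HOURS_PER_WEEK - 7 * sleep_per_night - work_per_week


print(unscheduled_hours(8))                        # 72
print(unscheduled_hours(6), unscheduled_hours(9))  # 86 65
```

So depending on how much someone sleeps, somewhere between 65 and 86 hours a week fall outside the sleep and work routine.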
A system like this has applications in marketing and demographics as well as in security and defense.