One of the interesting things you can do with access to anonymized, large-scale, accurate cellphone location data is to measure traffic patterns. Consistent with the Big Data theme, we decided to compute this across the whole US, to measure Total Miles Driven.
Total Miles Driven is estimated by the Federal Government and published on "FRED" (the St. Louis Fed's economic data service) in a monthly report, which is released with about a 35-day delay (e.g., the September total miles driven is published in early November). The information is useful because it is an indicator of:
- Gas Demand
- Insurance Claims (all else being equal, the more people drive, the more accidents they will have, and the worse the insurance companies will perform, as they typically charge a fixed fee for insurance, independent of mileage)
There are two versions of Miles Driven published by FRED: seasonally adjusted, and unadjusted. We worked with the unadjusted, because it is really hard to replicate the seasonal adjustments (the government does not provide detailed steps of how they perform the adjustments).
The results are as close as it gets! The correlation between Advan's Miles Driven and the Federal Government's is 0.92. But Advan's data is available T+1 (i.e., with a 1-day delay), so more than a full month ahead of FRED's. Talk about having an unfair trading advantage!
But we did not stop there. The beautiful thing about the breadth of the data available is that we can also figure out which cellphone devices belong to truck drivers. Or, for that matter, which devices live in a given Census Block Group or Zipcode. Or which ones have a certain income level; or which ones buy or service their cars at a Ford dealer, or a Mercedes dealer; and so on. The miles driven can then be computed only for these devices, to generate custom insights that not even the government, or anyone else for that matter, has access to today, but will be unable to live without in the future!
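For readers who want to run the same kind of check on their own series, here is a minimal sketch of a Pearson correlation between two monthly series. The function name and the sample numbers are hypothetical placeholders for illustration, not Advan's or FRED's actual data, and this is not necessarily how the 0.92 figure was computed.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly values (made up for illustration only).
device_based_miles = [250.0, 262.0, 271.0, 280.0, 268.0, 255.0]
official_miles = [248.0, 265.0, 270.0, 284.0, 266.0, 251.0]

print(round(pearson(device_based_miles, official_miles), 3))
```

When comparing against the unadjusted official series, both inputs should cover the same months and use the same (unadjusted) basis.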
Here are Advan's Trucks miles driven sourced from (the devices belonging to the drivers of) about half the commercial (class 8) trucks in the US:
The possibilities are endless, so free up your imagination and let us know what's the next interesting insight you want us to compute!
It comes in many forms: Speech. Movement. Draw within the lines. Wait, scratch that last one.
We didn’t like that last one either. When it comes to measuring foot traffic, trade areas, migration patterns, demographics, cross traffic, and, in general, getting behavioral metrics from mobile location data, yesterday’s state of the art was to get canned results on some buildings or venues someone else had decided you might be interested in.
What’s the reason for the limitation? In two words: Big Data. When you are dealing with billions of data points you can’t just write a query to extract any information you want because it will be very slow and very costly (sometimes in the hundreds of thousands of dollars).
It’s even worse in our case because we have 6 trillion cellphone observations over the last 5 years! And growing at a rate of over half a trillion every quarter. Gulp.
But the ability to draw any area, be it a building, a retail location, a factory, an oil field, a neighborhood, you name it, and get instant foot traffic measurements for it was making us lose too much sleep. A sure sign we had to do something about it.
So after several months of head scratching, experimentation, optimization, design, testing, design again, and all the other sleep-reducing activities favored by software engineers, we built a tool to be able to do just that. If you thought we would have shown it to the whole world the very next day, you would be right, but... once a nerd always a nerd. It took us 2 full years to open it up to our clients. Well, better late than never...
So finally last month we announced REveal, a self-service website where the user gets full control of the data.
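Conceptually, measuring foot traffic for a hand-drawn area boils down to counting the location observations that fall inside a polygon. The sketch below uses the standard ray-casting point-in-polygon test; the function names and sample coordinates are hypothetical, and a naive loop like this is precisely what does not scale to trillions of observations, so it should not be read as how our tool works internally.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices in order; the last vertex
    is implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude where the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def count_observations(pings, polygon):
    """Count the (lon, lat) observations that fall inside the polygon."""
    return sum(1 for lon, lat in pings if point_in_polygon(lon, lat, polygon))

# Hypothetical drawn area (a simple square) and a few sample observations.
drawn_area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
sample_pings = [(2.0, 2.0), (5.0, 2.0), (1.0, 3.0)]
print(count_observations(sample_pings, drawn_area))
```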
Recite your favorite poem, and voila! True Home Trade area:
- Want to find out where those living in a building you are considering buying work?
- And how has that changed over time?
- What is the true trade area of this Mall?
- Where do the visitors live and work?
- Are more people moving into this allegedly hot neighborhood, or is it all hype?
And on and on. The possibilities are truly endless.
As recently as 10 years ago it would have been unfathomable to think that you could crunch 2 Petabytes of data (that’s 2 Million Gigabytes!) in seconds and get detailed answers to any question you can think of. Progress is sometimes pretty close to science fiction!
- Spock: Over and out.
It is everyone's favorite game to perfectly forecast earnings by looking in the rear view mirror. We are trying really hard not to fall into the same trap ourselves.
Here is why: it is very easy to make random predictions, some of which come true, and then look back and cherry-pick the ones you got right. Really, a monkey can do it. If you gave it a banana for each correct prediction, it would get pretty fat, fast!
What really distinguishes a good forecasting method from random chance is the percentage of times you get it right versus the times you get it wrong. If you are correct more often than not, even 51% correct vs 49% wrong, then you have something to say. Even if you are wrong more often but you have higher conviction (and therefore make more money) when you are correct, that also has value.
Advan's Machine Learning algorithms take the foot traffic data we compute and forecast top-line revenue. We get it right about 60% of the time. That's 3 out of 5. If that doesn't sound impressive, consider that many quantitative funds can build profitable algorithms from a mere 51% advantage. And in our own "paper trading" backtests, the performance of a long/short neutral portfolio constructed using Advan's foot traffic data has a Sharpe ratio over 2.
Considering we do not claim to be experts in either Portfolio Construction or Machine Learning, these performance results are pretty good, if we may say so ourselves. The average Hedge Fund has a Sharpe ratio under 1 (not to ding Hedge Funds; actual trading is much harder than paper trading).
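The two yardsticks mentioned above, the hit rate and the Sharpe ratio, can be sketched in a few lines. This is a minimal illustration under stated assumptions: a "hit" is defined here as the forecast and the actual surprise having the same sign, the Sharpe ratio is annualized from periodic returns with a zero risk-free rate by default, and all names and sample numbers are hypothetical rather than Advan's backtest data.

```python
import math

def hit_rate(predicted, actual):
    """Fraction of periods where forecast and outcome have the same sign."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty series of the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if (p > 0) == (a > 0))
    return hits / len(predicted)

def sharpe_ratio(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from per-period returns (sample std dev)."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_per_period for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)

# Hypothetical forecast directions vs. actual outcomes (+1 beat, -1 miss).
forecasts = [1, -1, 1, 1, -1]
outcomes = [1, 1, 1, -1, -1]
print(hit_rate(forecasts, outcomes))  # 3 of 5 directions match

# Hypothetical daily returns of a paper portfolio.
daily_returns = [0.01, -0.01, 0.02, 0.0]
print(round(sharpe_ratio(daily_returns), 2))
```

Note that a hit rate alone says nothing about the size of the wins and losses, which is why the Sharpe ratio is the more telling of the two.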
Having said that, we can't resist the urge to brag about individual hits. Just this once:
• Texas Roadhouse (TXRH):
The consensus estimated revenue in Q3 2019 was $649.2mm. Advan forecasted $651.32mm on foot traffic growth of 6.1%. The actual revenue reported on October 28th after the market close was $650.42mm. The stock closed at $50.17 on the 28th and traded up 20% the next day!
Advan's Year over Year reported traffic and TXRH top-line revenue have correlation of 0.7 over the last 8 fiscal quarters; Quarter over Quarter traffic and top line revenue have correlation 0.96. This correct forecast wasn't an accident.
Is Gamestop's traffic up? Is Jimmy John's traffic growing more than Subway's?
Every day some new analysis of cellphone location data purports to measure the exact foot traffic in one or all of these, and every day we emit a collective gasp at the incredible claims.
Let's get this quickly out of the way: Gamestop traffic is trending down; Jimmy John's is down too and that trend has not changed for 3 years straight; it's also worse than Subway's downward trend, except for some bright, but inconsistent, spots in 2019.
• Jimmy John's (blue) vs Subway (pink):
But that's only the beginning of the story.
First, it is a disservice to the reader to claim that any dataset, and in particular cellphone location data, can estimate within a fraction of a percentage point the actual traffic of a company. With extremely detailed geofencing work, taking into account the hours of operation of every single location, and after testing hundreds of normalizations versus the actual revenue data and versus the credit & debit card transaction data of our partner (Consumer Edge), our research team at Advan can come close. We strive to be approximately right, instead of precisely wrong, as Warren Buffett said.
Second, the actual claims we have been hearing are completely off the mark. Gamestop's traffic for example -- and we have nothing against the company, these are the fundamentals talking -- is down. About 6% down year over year in Q2 2019 in fact, and not looking better in Q3. There are no two ways about it (if you insist on exact numbers, then 6.04% down, but remember, this is approximate).
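For reference, a year-over-year figure like the one above is just the percentage change between a quarter and the same quarter one year earlier. Here is a minimal sketch; the function name and the visit counts are hypothetical and chosen only to illustrate the arithmetic, not taken from Advan's data.

```python
def yoy_change_pct(current, year_ago):
    """Year-over-year percentage change between two comparable periods."""
    if year_ago == 0:
        raise ValueError("year-ago value must be non-zero")
    return (current - year_ago) / year_ago * 100.0

# Hypothetical normalized visit counts for the same quarter in two years.
visits_q2_prior_year = 100_000
visits_q2_current_year = 94_000
print(round(yoy_change_pct(visits_q2_current_year, visits_q2_prior_year), 2))
```

Comparing the same quarter across years sidesteps seasonality, but the result is only as good as the normalization behind the visit counts.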
Jimmy John's is doing better than Subway in 2019 (comparatively speaking; Subway has 13x the traffic), but that trend has started running out of steam in the first 2 weeks of October. Here's hoping it's just a small aberration. More worryingly though, if you go back to 2018 and 2017 the 2 chains' traffic is changing at the same rate, and that rate is downward trending in both cases. Not a bullish sign.
So please, do not believe everything you read without confirming how the analysis was performed. Consider placing more weight on analytics performed by the experts in crunching and normalizing location data for financial performance. Data is good; but incorrect and misleading data is worse than no data.
Advan is the leader in the Big Data geolocation space, enabling participants in the financial industry to analyze foot traffic data across multiple sectors, including consumer services, energy, technology, healthcare, REITs, financials and others. Advan derives its datasets using multi-parameter models that analyze cellphone location data crossed with curated geofenced areas.
Top-tier institutional investors, ranging from quantitative hedge funds to fundamental asset managers, have been the main consumers of Advan’s products.