The New York Times has a fun interactive that lets you compete against an algorithm designed to predict which tweets will get the most retweets. The Times also has a story about the algorithm and its implications. The story’s takeaway:
That an algorithm can make these kinds of predictions shows the power of “big data.” It also illustrates a fundamental limitation of big data: Specifically, guessing which tweet gets retweeted is significantly easier than creating one that gets retweeted.
Sure. But predicting which of two phrasings will get more retweets is still plenty difficult, and since part of my job is social media, it’s totally a “skill” I supposedly have.
So I was pleased to beat the algorithm 18 to 13. But even in victory, it’s clear that this is a losing gambit for me over time. First off, there’s a chance I just got lucky, especially since the story says the algorithm guesses right 67% of the time. (I’m not sure how the interactive works: whether everyone gets the same set of examples, whether mine were harder than others’, or what.)
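How likely is “lucky,” though? Here’s a back-of-the-envelope sketch. It leans on an assumption the interactive doesn’t confirm: that the 18-to-13 score counts correct picks across 31 independent tweet pairs, with me at 18 and the algorithm at 13. Under that made-up reading, you can ask how often a pure coin-flipper would score 18 or better, and how often a 67%-accurate algorithm would score as low as 13:

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 31  # hypothetical: treat 18-13 as correct picks over 31 independent pairs

# How often would someone guessing at random score 18 or better?
print(f"P(>=18 right | 50% guesser):   {binom_tail(n, 18, 0.5):.3f}")

# How often would a 67%-accurate algorithm score as low as 13?
print(f"P(<=13 right | 67% algorithm): {1 - binom_tail(n, 14, 0.67):.3f}")
```

On that reading, a pure guesser hits 18 or more about 24% of the time, so luck alone is a live explanation for my score. Of course, if the pairs I saw were unusually hard or easy, all of this goes out the window.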
But in any case, even if I wasn’t lucky, there’s one question that is, to me, the bottom line in the race between humans and machines: which of us do you think will get better faster?
If you run this same kind of test in two years, how much better will I be? Sure, maybe I’ll have improved a bit (although maybe not). But with more data at its disposal, faster processing power, new statistical techniques, etc., the room for an algorithm to improve is far greater.
The same holds for many other kinds of forecasting. That humans still beat algorithms at predicting, say, geopolitical events (I’m making this up) is interesting. But even as we get better, our progress is incremental, linear. It’s the algorithm that’s poised to win “most improved.”