Artificial intelligence needs your data, all of it
But there's a catch. In order for A.I. to work its miracles, it's going to need data. Massive amounts of data.
And I'm predicting that we'll willingly give that data. In fact, we're already starting to.
Do you use Siri, Google Now, Cortana or Alexa? They work by recording your voice, uploading the recording to the cloud, then processing the words and sending back the answer. After you've got your answer, you forget about the query. But your recorded voice, the text extracted from it, and the entire context of the back-and-forth conversations you had are still doing work in the service of the A.I. that makes virtual assistants work. Everything you say to your virtual assistant is funneled into the data-crunching A.I. engines and retained for analysis.
In fact, the artificial intelligence boom is as much about the availability of massive data sets as it is about intelligent software. The bigger the data sets, the smarter the A.I.
One important area of A.I. innovation is: How do you get enough data?
Here's how Andy Rubin wants to get it.
Andy Rubin's Free Dash Cam
Remember Andy Rubin? He is the co-founder and former CEO of Android, which Google got its hands on by acquiring his company in 2005. He ran the Android group at Google for years before heading up its robots division and then finally leaving Google less than a year and a half ago.
Rubin now runs an incubator and consulting firm called Playground Global. He's using that company to work on a variety of projects. One of these is reportedly a dashcam that will be given away for free. In exchange for the free dashcam, users would allow the video and other data to be uploaded and used to feed a massive A.I. system, a "real-time visual map of the world."
That's an incredible vision for multiple reasons, and one that has to be taken seriously because Rubin is someone with a track record of bringing his visions to reality on a massive scale.
First, video is the biggest kind of user data. A single user driving around is likely to generate at least 4 gigabytes of data per hour. There are 253 million cars in the U.S. If only 1 percent of these cars are driving with one of Rubin's dashcams at any given time, that's more than 10 petabytes of data in the U.S. alone. Per hour! That's impossible to process now, but by the time this scheme gets off the ground, it could be possible.
Second, a free dashcam is something everyone will want, so such a scheme would put these cameras in a huge number of cars.
Third, because Rubin is talking about an A.I.-generated real-time map, it's probably an idea comparable to Google's Ground Truth, which takes data from satellites, StreetView cars and many other sources and combines them into a coherent 3D, information-rich picture of the world. (If you choose the "Earth" view in Google Maps and zoom all the way in, you can see that it's not a satellite photo, but something that looks like a kind of digital clay.)
Imagine a StreetView and Ground Truth type system that updates the information on a street every time a car drives down. You could theoretically get real-time weather reports, real-time traffic reports, counts on the number of pedestrians, information on whether lines are forming in front of businesses, available parking spots and much more.
A.I. could do all that, but it needs the data. And users will willingly give it up.
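As a rough check on the dashcam data-volume figure given earlier in this section, the arithmetic can be sketched in a few lines of Python. The per-car rate, the number of cars and the 1 percent share come from the article; the decimal conversion of 1 petabyte to 1,000,000 gigabytes and the function name are assumptions made for illustration.

```python
# Back-of-the-envelope check of the hourly dashcam data estimate.
GB_PER_CAR_HOUR = 4         # from the article: at least 4 GB per hour of driving
CARS_IN_US = 253_000_000    # from the article
PERCENT_ACTIVE = 1          # from the article: 1 percent on the road with a dashcam
GB_PER_PB = 1_000_000       # assumption: decimal units (1 PB = 1,000,000 GB)


def hourly_petabytes(cars, percent_active, gb_per_car_hour):
    """Return the estimated data volume, in petabytes, generated per hour."""
    active_cars = cars * percent_active / 100
    return active_cars * gb_per_car_hour / GB_PER_PB


estimate = hourly_petabytes(CARS_IN_US, PERCENT_ACTIVE, GB_PER_CAR_HOUR)
print(f"{estimate:.2f} PB per hour")
```

Under these assumptions the result lands just above 10 petabytes per hour, which matches the "more than 10 petabytes" figure. Because 4 gigabytes per hour is a floor, the real volume could be higher.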
Microsoft's SwiftKey
We learned earlier this month that Microsoft plans to acquire the UK-based startup SwiftKey, which makes a keyboard app for Android and iOS used by some 300 million people.
The average user may see SwiftKey as a small thing -- a handy keyboard that lets you either type every letter, type until SwiftKey guesses the word you're intending to type, or write by swiping your finger across the letters. In reality, SwiftKey is a marvel of big-data A.I.
SwiftKey uses a neural network system to predict the next word you'll type. It's not just a guess based on probability. It actually tries to understand the context of the sentence.
The brainy software and massive computers behind SwiftKey are hungry for data. They need to know everything every user types every time. In fact, that's a necessary component of what makes SwiftKey so good -- especially if you opt into their cloud-based personalization.
Google's Smart Reply
Google last year rolled out a new feature for the mobile version of its Inbox email app. Called SmartReply, the system offers short, canned replies to your email. When you choose one, it is inserted into the reply email, and then you can send it.
SmartReply works, in principle, like SwiftKey. But while SwiftKey predicts what you'll type based on what you're actually typing, SmartReply predicts the words or even complete sentences you'll type based on the email you got.
For example, my brother recently sent me an email talking about how he might like to place a camera on some land he owns some two hours from his house. We had been knocking around ideas about the camera. Google's SmartReply suggested three responses: "Sounds like a plan," "I like that idea" and "I agree." Any of these replies might be good ones. SmartReply also sometimes generates three responses that completely miss the mark.
I won't go into the details, in part because I don't understand them. (When Google engineer Anjuli Kannan explained how SmartReply works to a crowd of professionals at the recent Virtual Assistant Summit in San Francisco, I could tell they didn't understand it, either.) But the technology behind SmartReply is monstrously advanced and powerful, despite the fact that its output tends to be stuff like "got it, thanks!" and its purpose is to save you two seconds.
That SmartReply works at all depends on Google harvesting terabytes of email messages and replies, which the company promises no human ever reads.
Why we'll all offer up our data to A.I.
Andy Rubin's dashcam, Microsoft's SwiftKey and Google's SmartReply are examples of cases in which large numbers of people allow their data to be harvested to feed the A.I. systems that need it. In exchange, people get useful and free tools.
But there's an even better reason to feed the A.I. beast -- saving and improving human lives.
Air pollution is estimated to kill some 5.5 million people a year. A new app called AirTick emerged this month from Nanyang Technological University in Singapore. The app uses smartphone pictures to track air pollution.
Smartphone photos can be tagged with time and location. By harvesting thousands of photos a day from major cities, the AirTick app can train A.I. software to estimate the amount of smog from the photos. Over time, the A.I. plus the smartphone photo information should enable the system to maintain real-time, neighborhood-by-neighborhood estimates of air quality. That could allow timely alerts for people to go inside when the air quality gets really bad and also provide evidence for citizens to demand cleaner air, say, in factory towns where the air may be especially unhealthful.
Another research project, out of the University of California at Berkeley, last week published a free app called MyShake that can detect earthquakes. It uses the motion sensors in smartphones to constantly monitor the phones' every movement. The app can tell whether motion is caused by an earthquake or by something else.
It's like having millions of seismographs all over the place, rather than dozens or hundreds. Eventually, the system should be able to detect earthquakes faster than current systems.
And yet another new app, Aipoly, came out recently for iOS. It helps visually impaired people identify everyday objects. To use it, you simply snap a picture. Artificial intelligence in the cloud analyzes the smartphone photo, figures out what the object is, then sends the answer back.
For example, let's say a blind user is shopping for a birthday present at Toys 'R' Us. The user points the camera at a box, and Aipoly tells the user that it's a Star Wars Lego set. Or while shopping for fruit, the app could tell the difference between a lemon and a lime.
The app works because volunteer users who are not visually impaired snap pictures of random objects and identify them for the system.
Aipoly doesn't work perfectly. But it could if it had enough data.
These three examples show how simply granting permission for organizations to harvest all the data from your phone’s sensors enables you to help save lives and provide an enormous benefit to the visually impaired.
Artificial intelligence can do amazing things, if given massive amounts of data. Whether we're motivated by naked self-interest or the spirit of the greater good, we'll willingly give up our data. All of it.