3 Steps Necessary for Voice Recognition on Your Phone

Voice control of your devices is the “next big thing.” It seems the heavy hitters are all betting that people want and need to talk to their devices. In fact, the Big 3 are all planning intelligent TVs to vie for your input, alongside your phone and perhaps even your home, as AT&T indicated this week (Home Automation). With all this talk about talking to your gadgets, you have to wonder where these thin LCD screens and handheld machines plan to store all that intelligence. It turns out the Cloud is uniquely situated to host the intelligence needed to make sense of your request.

What Are We Talking About?

The space is not new; getting your device to recognize speech was hard enough on your home PC. It demanded steep system resources for a satisfying experience and then promptly failed to deliver that experience. The programs failed to keep up with dictation or, worse, misinterpreted your words, oftentimes to great hilarity.

And now we want our handsets to take over this task. It takes clever programming, a capable device, and a host of online services to make sense of the request. The service must fully integrate into our lives and be available everywhere we need it. But above all it must be useful. Today this is accomplished by relying on the distributed nature of the service and attempting to juggle 3 things:

  1. Contextual Cognition
  2. Scalability of Intellect
  3. Extensibility

Contextual Cognition

At first it may seem like this new flavor of service is about speech recognition. To be fair, a lot of work has gone into speech recognition these past 20 years, but once you’ve decoded the sound you’re left with deciphering the meaning. Computers struggle with context because they’re unable to process the subtleties of mood, tone, or even colloquial meaning.

The trend is to process this data in the Cloud rather than on the device, whose own processing power is rapidly becoming irrelevant. Spare cycles on remote servers will process the request and hand back instructions to the client in terms that it can understand.
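
To make that hand-off concrete, here is a minimal sketch of the idea in Python. It is only an illustration: the function names, the keyword matching, and the JSON fields are all assumptions made for this example, not how any real service works. The "cloud" half turns recognized text into a small instruction, and the "device" half simply carries it out.

```python
import json

# Hypothetical cloud-side step: turn recognized text into an instruction
# the handset can act on. The action names and JSON fields are assumptions.
def interpret(transcript: str) -> str:
    words = transcript.lower().split()
    if "call" in words:
        # Everything after the word "call" is treated as the contact name.
        target = " ".join(words[words.index("call") + 1:])
        instruction = {"action": "dial", "contact": target}
    elif "remind" in words:
        instruction = {"action": "create_reminder", "text": transcript}
    else:
        instruction = {"action": "web_search", "query": transcript}
    return json.dumps(instruction)

# Hypothetical device-side step: the handset does no interpretation of its
# own; it only follows the instruction it was handed.
def handle_on_device(payload: str) -> str:
    instruction = json.loads(payload)
    action = instruction["action"]
    if action == "dial":
        return f"Dialing {instruction['contact']}"
    if action == "create_reminder":
        return f"Reminder saved: {instruction['text']}"
    return f"Searching for: {instruction['query']}"

print(handle_on_device(interpret("Call Mom")))
```

The point of the split is that the hard part, working out what you meant, lives on the server, while the phone only needs to understand a handful of simple instructions.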

Scalability of Intellect

Humans can only process so much information at once. I can attest to the fact that a brain can easily overload with something as simple as explaining to a 10-year-old why it’s important to use soap in the shower. We’re easily flummoxed by emotional, verbal, and even visual overload, to the point that learning and acquiring new skills become a challenge.

Computers are very good at being patient, waiting for the next wave of requests, and scaling to meet the demand. They don’t get upset at tone or get impatient when they make mistakes. Unlike humans, they don’t forget, and adding more memories doesn’t push out older ones.


Extensibility

As we live and grow, our minds expand without our brains getting any bigger. If we discover we don’t have a particular skill, we learn it. These new AI services in the Cloud will grow with us in the same way. If the next great service delivers beignets to your dorm room, that service can easily be extended to your smartphone by teaching the system what to do when you ask for a particular food.
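
As a rough illustration of "teaching the system," the sketch below keeps a simple table of skills in Python. Everything in it is hypothetical: the `teach` and `ask` functions, the keyword lookup, and the beignet handler are invented for this example and stand in for whatever a real service would use.

```python
from typing import Callable, Dict

# Hypothetical table of skills: a spoken keyword mapped to the function
# that knows how to handle requests containing it.
handlers: Dict[str, Callable[[str], str]] = {}

def teach(keyword: str, handler: Callable[[str], str]) -> None:
    """Add a new skill without changing any existing code."""
    handlers[keyword.lower()] = handler

def ask(request: str) -> str:
    """Route a request to the first skill whose keyword appears in it."""
    for word in request.lower().split():
        if word in handlers:
            return handlers[word](request)
    return "Sorry, I don't know how to help with that yet."

# A made-up delivery service plugging itself in.
def order_beignets(request: str) -> str:
    return "Beignet order placed for delivery."

teach("beignets", order_beignets)
print(ask("Please send beignets to my dorm"))
```

Because the new skill is registered in the Cloud rather than installed on the handset, every phone that talks to the service would pick it up at once.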

As each user teaches the system something new, that knowledge is added to the collective mind and remembered for future reference. We’re simply crowdsourcing our intellect, just as Google did with its 411 service, which helped train its speech recognition engines.

Get Your Head in the Clouds

Your brain will live in the cloud and collect millions of bits of data related to you. Your smartphone will tease out the meaning of your thoughts, recognize your needs, and let you get on with the work of life. This is a tool like all other tools, no different than carrying an address book or calculator.

The difference is that this address book will remind you to stop and buy milk. But don’t worry; we’re not becoming more machine. Perhaps one day the tool will come alive, but when it does it’ll probably want the same things we want: a good cup of coffee and a reminder to watch American Idol when it gets home!

How do you see voice recognition developing? What would you like to see developed that hasn’t come along yet? We look forward to your comments.

Jeff Morgan, Lead Product Marketing Manager, AT&T