The Voice-First Revolution Takes Shape for Restaurants

In 2007, Apple launched the first iPhone, a smartphone in every sense of the word. With each new generation since, it has continued to turn novel concepts into mainstream technologies, many of which have left their mark on the quick-service restaurant industry.

Perhaps the most impactful innovation made mainstream by the iPhone is the type-n-swipe user interface. By simplifying interaction between human and machine, the type-n-swipe UI quickly found its fit in retail and quick-service businesses in the form of touch-first self-service kiosks, point of sale (POS) terminals and mobile ordering apps. It made viewing menus, placing an order and checking out quick and intuitive, to the point that people now accept, if not prefer, human-machine interaction over human interaction at their favorite stores and restaurants.

However, technology never stands still, and touch-first is about to be displaced by the enhanced capabilities of a voice-first revolution.

Voice is the new norm

Much as smartphones led the charge for touch-first, virtual assistant technology has brought voice-first to mass consumer acceptance. More specifically, the AI technologies behind natural language understanding and automatic speech recognition have gained mainstream acceptance in the form of Siri, Alexa and the Google Assistant. Just a few years back, conversing with technology may have felt awkward, clunky and embarrassing, but today, with the proliferation of voice in mobile devices, cars, smart speakers and PCs, voice communication with devices has become the new norm. To drive home the point of mainstream acceptance, consider the fact that pop music artist Ed Sheeran recently released a duet with Amazon’s Alexa. Within the home and on mobile devices, voice is becoming the preferred method for interacting with technology and is now pervasive in our culture. With consumer acceptance well underway, the voice-first revolution has already begun branching out into the world of enterprise.

Can I take your order?

One particular business segment seems to be finding its voice before others: quick-service restaurants. It’s no secret that quick-serves have been leveraging new technologies that streamline store operations, increase orders and enhance customer loyalty. The type-n-swipe POS, order-ahead mobile apps and self-service kiosks are largely responsible for this. However, heavyweights like McDonald’s and Sonic have recently started trials with AI-powered voice ordering. The early trials have focused on the drive-thru experience, but this model can be quickly adapted to mobile order-ahead and in-store kiosks. But what is it about the voice-first experience that has quick-serves so excited?

Quick and natural interaction

While keyboard, mouse and touchscreen are all amazing inventions, none of them is as natural and universal as human conversation. However you slice it, hunting and pecking on a mobile phone app can hardly be thought of as efficient or natural. A well-designed voice user interface can address choice overload and streamline the entire ordering process. Take pizza, a relatively simple food to order. Swiping and tapping out a pizza order is time consuming because it involves hunting through dozens of size, crust style, sauce and toppings options to craft the perfect pie. Saying the same order takes only seconds: “Large, hand-tossed, pepperoni and mushroom with red sauce, well done.” Quick service should be just that: quick. Voice cuts through pages of menus and dozens of ordering options, streamlines the ordering process and, most importantly, removes the need for tiny virtual keyboards, making it possible to place orders safely when hands are on a steering wheel.
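To make the pizza example concrete, here is a minimal sketch of how a transcribed order might be turned into structured fields. It assumes speech recognition has already produced the text, and it uses simple keyword matching rather than a real natural language understanding engine. The `MENU` vocabulary and the `parse_order` function are hypothetical names invented for illustration, not part of any vendor's product.

```python
import re

# Hypothetical menu vocabulary (an assumption for illustration only);
# a real menu would be far larger and maintained elsewhere.
MENU = {
    "size": ["small", "medium", "large"],
    "crust": ["hand-tossed", "thin", "deep dish"],
    "sauce": ["red sauce", "white sauce"],
    "toppings": ["pepperoni", "mushroom", "sausage", "onion"],
    "bake": ["well done"],
}


def parse_order(utterance):
    """Map a transcribed order onto menu fields by whole-word keyword matching."""
    text = utterance.lower()
    order = {}
    for field, options in MENU.items():
        # Keep every option that appears as a whole word or phrase.
        found = [
            option
            for option in options
            if re.search(r"\b" + re.escape(option) + r"\b", text)
        ]
        if field == "toppings":
            order[field] = found  # a pizza can carry several toppings
        else:
            order[field] = found[0] if found else None  # single-choice field
    return order


spoken = "Large, hand-tossed, pepperoni and mushroom with red sauce, well done."
print(parse_order(spoken))
```

A single spoken sentence fills all five fields at once, which is the step a touch interface spreads across several screens. A production system would also need to handle synonyms, plurals and corrections, which this sketch ignores.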

Sensory Inc. adapted its natural language technology to enable voice ordering in a local coffee shop. A recently published video reveals genuine excitement when people are given a voice-first ordering experience.

Voice is rich with data

A typical “type-n-swipe” approach captures what the customer wants, but nothing more. The same order placed with voice contains actionable data that can be leveraged in real time. For example, cutting-edge machine learning techniques analyze voice to provide quick-serves with important biometric, demographic and customer satisfaction information.

Demographic information such as age and gender is quickly generated as customers speak their orders, and this information can be used to fine-tune recommendations to the customer during the order process. Voice information such as masculine versus feminine or baby boomer versus millennial enables recommendations more likely to resonate with each specific customer, unlocking a level of menu personalization that could never be achieved with touch.

Various emotional states can also be detected during the ordering process. Voice analysis reveals whether the customer is having a positive, neutral or negative experience. This information provides overall satisfaction results that are more effective than online surveys or star ratings. When flagged in real time, a customer having a negative experience can be given special attention. Over time, restaurants can use analytical information from emotions and demographics to better position and target advertisements, special offers or visually displayed options. The vast potential and range of use cases for voice analytics are among the driving factors that make voice such an attractive UX technology.
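As a rough illustration of the real-time flagging described above, the sketch below assumes that some upstream voice-analysis model has already produced a sentiment score between -1 and 1 for each order. That model, the score range, the thresholds and every name in the code (`OrderEvent`, `classify`, `orders_needing_attention`) are assumptions made for this example, not a description of an actual product.

```python
from dataclasses import dataclass

# Assumed cutoffs on a hypothetical sentiment scale of -1 (negative) to 1 (positive).
NEGATIVE_THRESHOLD = -0.3
POSITIVE_THRESHOLD = 0.3


@dataclass
class OrderEvent:
    order_id: str
    sentiment: float  # assumed to come from an upstream voice-analysis model


def classify(score):
    """Bucket a sentiment score into positive, neutral or negative."""
    if score <= NEGATIVE_THRESHOLD:
        return "negative"
    if score >= POSITIVE_THRESHOLD:
        return "positive"
    return "neutral"


def orders_needing_attention(events):
    """Return the IDs of orders whose customers appear to be having a negative experience."""
    return [event.order_id for event in events if classify(event.sentiment) == "negative"]


events = [
    OrderEvent("A101", 0.6),
    OrderEvent("A102", -0.7),
    OrderEvent("A103", 0.0),
]
print(orders_needing_attention(events))
```

Staff could be alerted as soon as an order ID appears in that list, while the same classified events, aggregated over time, would feed the longer-term satisfaction analysis.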

For voice, the time is now

While it seems like just yesterday that the quick-service industry started adopting kiosks for engaging with customers, the voice-first revolution is quickly taking shape and is in fact being openly welcomed across several industries. “We believe natural voice recognition is the future,” said J. Patrick Doyle, former CEO and president at Domino’s, in an earlier press release about the pizza chain’s use of artificial intelligence. Uber CEO Dara Khosrowshahi commented in a recent interview, “As we move over to more of a mobile device centric world … I think the interaction model with devices is going to be much more voice-based.” Beyond executive buy-in for the technology, consumers are already immersed in voice interactions with devices, and according to industry experts, more than 50 percent of searches are expected to be made via voice by 2020. That projection suggests people prefer the convenience of speaking to their devices over tapping and typing. Companies need to be experimenting with voice now. Implementing a voice strategy today helps ensure they won’t be rendered obsolete by tech-savvier disruptors in the future.

Todd F. Mozer is the president, CEO and chairman of Sensory, Inc., a leading supplier of speech and vision technologies. Mozer founded Sensory in 1994 and has successfully raised venture capital to fund its growth to profitability. At Sensory, he has been involved in both corporate and product line acquisitions, and has worked on incorporating speech recognition into the products of companies such as Motorola, Sony, Toshiba, JVC, Mattel, Hasbro, Uniden, Sega, Samsung, Blue Ant Wireless and many other leaders in consumer electronics.
