It’s the stuff of science fiction. The crew of the Starship Enterprise talks to the replicator in “Star Trek.” Joaquin Phoenix’s character falls for his chatty operating system, Samantha, in “Her.”
But what we’ve seen on the screen is looking more like reality. Talk to your smartphone, Google Glass, even the thermostat – and they talk back.
“It’s actually happening,” said Greg Sullivan, a director of product marketing for Windows Phone at Microsoft.
The gadgets around us are going beyond understanding simple commands and taking part in conversation, albeit one that’s often stilted and programmed.
Ask “How are you, Siri?”
“Excellent!” the iPhone digital assistant will respond.
Talking technology is shifting from novel to useful, and it’s likely that everything from washing machines to driverless cars will be commanded by voice rather than buttons.
Yet as science fiction becomes fact, human users continue to struggle with robo-conversations. It can be awkward talking to a machine and so much can get lost in translation.
“I feel quite shy doing it in front of people,” said Nina Hale, CEO of the Minneapolis-based digital marketing agency Nina Hale Inc. “I’m always kind of turning my back.”
Talking to technology isn’t just embarrassing. Most of the programs are dogged by glitches, as well.
Anyone who’s tried dictating a text knows the perils: The microphone picks up background noise, homonyms cause trouble, and tough-to-pronounce names are a lost cause. As someone who works in search engine marketing, however, Hale is invested in learning how computers process language.
Researchers have been exploring those same topics since the 1950s, said professor Ray Mooney, director of the Artificial Intelligence Laboratory at the University of Texas.
“Really understanding language is hard,” he said. “The human mind evolved over millions of years. It’s hard for us as engineers to reconstruct it in a matter of decades.”
Progress has been incremental, Mooney said, but most of us are finally starting to notice it because the hardware – especially those little smartphone computers in our pockets – is powerful enough to do speech recognition and more natural language processing.
Voice-activated devices usually respond to simple commands, but multiple questions or complicated grammar tends to make programs like Siri go haywire.
That’s because speech – no matter if it’s with a person or a machine – comes with expectations.
The brain treats all voices the same, according to Clifford Nass and Scott Brave, authors of “Wired for Speech.” Even if it’s a computer talking, a human listener will assign gender and personality traits to the voice and make assumptions about trustworthiness based on what they hear.
And if the voice doesn’t deliver?
“Socially inept interfaces suffer the same fate as socially inept individuals: They are ineffective, criticized and shunned,” Nass and Brave wrote.
People tend to like human-sounding interfaces, as long as they’re judged to be competent.
Microsoft embraced that idea with Cortana, the company’s new talking digital assistant. Her answers are colloquial, even if she’s being evasive:
“Do you like the Vikings, Cortana?”
“Y’know, with questions like that, I don’t form opinions,” she’ll say.
While Microsoft is making Cortana as human as possible, Honeywell decided to keep its voice-activated, Internet-connected thermostat more machinelike. While the gadget responds to conversational commands, its voice is mechanical.
“People wanted the device to be a device,” said Tony Uttley, general manager of home comfort and energy systems at Honeywell. “They didn’t want it to sound more human. They wanted to make sure it was still a thing.”
But Honeywell crowdsourced queries and found that people didn’t just want to ask the thermostat to turn the temperature up or down a few degrees. They said things like “I’m hot” or “It’s cold.” Others wanted to know the outdoor temperature or the time of day, so the company has been updating the gadget’s spoken repertoire.