By Char Adams
New York Times, Sept. 24, 2021
For people like me, the voice technology that is a part of so many people’s everyday lives can feel all but useless. Telling Alexa to play a song or asking Siri for directions can be almost impossible whenever prolonged (“Aaaaaaaaa-lexa”) or chopped (“Hey … Si … ri!”) sounds cause the devices to misunderstand my commands or stop listening altogether.
According to the National Institute on Deafness and Other Communication Disorders, about 7.5 million people in the United States also “have trouble using their voices” because of disorders like stuttering or speech-altering conditions caused by cerebral palsy.
Voice assistants could radically improve our lives. Their inaccessibility could even be dangerous for those with mobility disabilities, who may rely on voice assistants to call for help. Instead, they often fail to understand us.
“My speech is slow, and I slur some words,” said Dagmar Munn, a retired wellness instructor who has amyotrophic lateral sclerosis. She uses a walker with wheels and has dysarthria, in which weakened muscles lead to impaired speech. She said she had trouble using Alexa and Google Assistant, technologies that, as her condition progresses, she may rely on even more for help with tasks like adjusting the temperature in her home and turning on the lights.
“Although I am careful to enunciate and carefully pronounce a command, the device stops listening by my second word. I just can’t speak fast enough to satisfy the preset listening time,” Ms. Munn said. “The novelty quickly wore off when I really needed the device to respond.”
Companies have usually engineered voice technology to cater to uninterrupted speech from “the average North American English voice,” said Frank Rudzicz, an associate professor at the University of Toronto who studies speech, language and artificial intelligence. As a result, diverse speech patterns sometimes sound foreign to voice-enabled devices.
To interpret speech, voice assistants typically convert voice commands into text and compare that text to recognizable words in a database. Many databases historically have not contained reference data collected from those with different speech patterns like slurred sounds and word repetitions. Mr. Rudzicz said that many companies have tried to “reach 80 percent of people with 20 percent of the effort,” using a “default voice.”
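The matching step described above can be illustrated with a toy sketch. This is not any company's actual pipeline; it simply uses Python's standard-library fuzzy matching against a small, hypothetical vocabulary of commands to show why speech with repetitions can fall outside what a system trained on a "default voice" recognizes.

```python
import difflib

# A tiny stand-in for the reference databases described above.
# Real systems use vastly larger models and vocabularies.
COMMANDS = ["play music", "turn on the lights", "set the temperature"]

def match_command(transcribed: str, cutoff: float = 0.8):
    """Return the closest known command, or None if nothing is similar enough."""
    hits = difflib.get_close_matches(transcribed.lower(), COMMANDS,
                                     n=1, cutoff=cutoff)
    return hits[0] if hits else None

# Fluent speech matches cleanly:
print(match_command("play music"))            # -> "play music"

# But repeated sounds push the text below the similarity cutoff,
# so the command is not recognized at all:
print(match_command("p-p-play mu-mu-music"))  # -> None
```

If the database never contains examples of slurred or repeated speech, even a generous similarity threshold leaves such commands unmatched, which is the gap the article describes.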
In other words, companies have rarely prioritized those of us whose speech doesn’t match what engineers assume to be the norm.
As the national conversation about disability rights and accessibility has grown, some of those companies — including Google, Apple and Amazon — have finally begun to re-engineer existing products to try to make them work for people like me.
Apple has collected more than 28,000 audio clips from people who stutter in hopes of improving Siri’s voice recognition systems. Amazon has collaborated with Voiceitt, an app that learns individual speech patterns, to make Alexa more accessible. Microsoft has put $25 million toward inclusive technology. And Google has worked with speech engineers, speech language pathologists and a pair of A.L.S. organizations to start a project to train its existing software to recognize diverse speech patterns.
Julie Cattiau, a product manager on Google’s artificial intelligence team, told me that ultimately, the company hopes to equip Google Assistant to tailor itself to an individual’s speech. “For example, people who have A.L.S. often have speech impairments and mobility impairments as the disease progresses,” she said. “So it would be helpful for them to be able to use the technology to turn the lights on and off or change the temperature without having to move around the house.”
Muratcan Cicek, a Ph.D. candidate at the University of California, Santa Cruz, with cerebral palsy, has a severe speech disorder, cannot walk and has limited control of his arms and hands. He said he tried for years to use Microsoft Cortana and Google Assistant, but they couldn’t understand his speech. After joining Google’s project, he said he was able to use a prototype of the improved Google Assistant.
Despite Mr. Cicek’s success, Ms. Cattiau said that Google’s improved voice technology still has a long way to go before it is ready to be released to the public.
These unfinished efforts — announced in 2019, three years after Google Assistant debuted — demonstrate voice technology’s most pressing problem: Accessibility is rarely part of its original design.
Mr. Rudzicz said that it’s more difficult to alter software after its creation than to develop it with differing abilities in mind in the first place.
When companies don’t prioritize accessibility from the outset, they neglect possible customers and undermine the potential of their diversity efforts.
“We represent a customer base with buying power, a segment that these companies are ignoring,” Ms. Munn said. “I don’t need special handicap tracking devices. I just want the normal devices to understand me better.”
Companies should ensure that voice technology accounts for diverse speech patterns from the moment it meets the market. And disabled communities must be a part of the development process from conception to engineering to devices’ release.
At the very least, all companies must provide the option to extend the listening time of voice assistants — as some have done — so people with speech impediments can speak as slowly or quickly as needed to issue a clear command.
With the right changes, “everything can be voice-enabled,” said Sara Smolley, one of the founders of Voiceitt. “That’s where the power is and where the voice revolution and voice technology is going.”
Disabled communities must be included in that voice revolution. Our voice-enabled world should no longer leave people behind.
Char Adams (@CiCiAdams_) is a reporter covering race and social justice issues for NBCBLK.