{"id":4384,"date":"2020-06-24T11:59:46","date_gmt":"2020-06-24T15:59:46","guid":{"rendered":"https:\/\/blog.mozilla.org\/ux\/?p=4384"},"modified":"2020-06-29T15:51:35","modified_gmt":"2020-06-29T19:51:35","slug":"designing-for-voice","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/ux\/2020\/06\/designing-for-voice\/","title":{"rendered":"Designing for voice"},"content":{"rendered":"<p>In the future people will use their voice to access the internet as often as they use a screen. We\u2019re already in the early stages of this trend: As of 2016 Google reported <a href=\"https:\/\/searchengineland.com\/google-reveals-20-percent-queries-voice-queries-249917\">20% of searches on mobile devices used voice<\/a>, last year <a href=\"https:\/\/techcrunch.com\/2020\/02\/17\/smart-speaker-sales-reached-new-record-of-146-9m-in-2019-up-70-from-2018\/\">smart speaker sales topped 146 million units \u2014 a 70% jump from 2018<\/a>, and I\u2019m willing to bet your mom or dad has adopted voice to make a phone call or dictate a text message.<\/p>\n<p>I&#8217;ve been exploring voice interactions as the design lead for Mozilla\u2019s Emerging Technologies team for the past two years. In that time we\u2019ve developed <a href=\"https:\/\/www.theverge.com\/2018\/10\/11\/17961564\/pocket-redesign-listening-amazon-polly\">Pocket Listen<\/a> (a Text-to-Speech platform capable of converting any published web article into audio) and <a href=\"https:\/\/voice.mozilla.org\/firefox-voice\/\">Firefox Voice<\/a> (an experiment in accessing the internet with voice in the browser). This blog post is an introduction to designing for voice, based on the lessons our team learned researching and developing these projects. Luckily, if you&#8217;re a designer transitioning to working with voice, and you already have a solid design process in place, you\u2019ll find many of your skills transfer seamlessly. 
But, some things are very different, so let\u2019s dive in.<\/p>\n<h2>The benefits of voice<\/h2>\n<p>As with any design it\u2019s best to ground the work in the value it can bring people.<\/p>\n<p>The <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Learn\/Accessibility\/What_is_accessibility\">accessibility benefits<\/a> to a person with a physical impairment should be clear, but voice has the opportunity to aid an even larger population. Small screens are hard to read with aging eyes, typing on a virtual keyboard can be difficult, and understanding complex technology is always a challenge. Voice is emerging as a tool to overcome these limitations, turning cumbersome tasks into simple verbal interactions.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-4385 aligncenter\" src=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-600x338.jpg\" alt=\"How voice technology can improve the user experience?\" width=\"600\" height=\"338\" srcset=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-600x338.jpg 600w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-300x169.jpg 300w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-768x432.jpg 768w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-1536x864.jpg 1536w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-2048x1152.jpg 2048w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/how-voice-improves-ux-1000x563.jpg 1000w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>As designers, we\u2019re often tasked with creating efficient and effortless interactions. Watch someone play music on a smart speaker and you\u2019ll see how quickly thought turns to action when friction is removed. They don\u2019t have to find and unlock their phone, launch an app, scroll through a list of songs and tap. Requesting a song happens in an instant with voice. 
A quote from one of our survey respondents summed it up perfectly:<\/p>\n<blockquote><p><em>\u201cBeing able to talk without thinking. It&#8217;s essentially effortless information ingestion.\u201d<\/em><\/p><\/blockquote>\n<h2>When is voice valuable?<\/h2>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-extra-large wp-image-4406\" src=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-1000x373.jpg\" alt=\"When and where voice is likely to be used\" width=\"1000\" height=\"373\" srcset=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-1000x373.jpg 1000w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-300x112.jpg 300w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-600x224.jpg 600w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-768x286.jpg 768w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-1536x572.jpg 1536w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/when-where-voice-2048x763.jpg 2048w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p>Talking out loud to a device isn\u2019t always appropriate or socially acceptable. We see this over and over again in research and real-world usage. People are generally uncomfortable talking to devices in public. 
The more private, the better.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-4399 aligncenter\" src=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-600x300.png\" alt=\"Graph showing Home, In the car, and At a friends house being the top 3 places people are comfortable using voice.\" width=\"600\" height=\"300\" srcset=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-600x300.png 600w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-300x150.png 300w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-768x384.png 768w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-1536x768.png 1536w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-2048x1024.png 2048w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/comfortable_using_voice-1000x500.png 1000w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>Hands-free and multi-tasking also drive voice usage \u2014 cooking, washing the dishes, or driving in a car. These situations present opportunities to use voice because our hands or eyes are otherwise occupied.<\/p>\n<p>But, voice isn\u2019t just used for giving commands. Text-to-Speech can generate content from anything written, including articles. 
It\u2019s a technology we successfully used to build and deploy <a href=\"https:\/\/help.getpocket.com\/article\/1081-listening-to-articles-in-pocket-with-text-to-speech\">Pocket Listen<\/a>, which allows you to listen to articles you\u2019d saved for later.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-4389 aligncenter\" src=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-600x400.jpg\" alt=\"Pocket Listen usage Feb 2020, United Kingdom\" width=\"600\" height=\"400\" srcset=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-600x400.jpg 600w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-300x200.jpg 300w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-768x511.jpg 768w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-1536x1023.jpg 1536w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage-1000x666.jpg 1000w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/pocket-listen-usage.jpg 1922w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>In the graph above you\u2019ll see that people primarily use Pocket Listen while commuting. By creating a new format to deliver the content, we\u2019ve expanded when and where the product provides value.<\/p>\n<h2>Why is designing for voice hard?<\/h2>\n<p>Now that you know \u2018why\u2019 and \u2018when\u2019 voice is valuable, let\u2019s talk about what makes it hard. 
These are the pitfalls to watch for when building a voice product.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-4402 size-extra-large\" src=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-1000x563.jpg\" alt=\"What\u2019s hard about designing for voice?\" width=\"1000\" height=\"563\" srcset=\"https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-1000x563.jpg 1000w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-300x169.jpg 300w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-600x338.jpg 600w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-768x432.jpg 768w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-1536x864.jpg 1536w, https:\/\/blog.mozilla.org\/ux\/files\/2020\/06\/voice-is-hard-2-2048x1152.jpg 2048w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p>Voice is still a new technology, and, as such, it can feel open ended. There\u2019s a wide variety of uses and devices it works well with. It can be incorporated using input (Speech-to-Text) or output (Text-to-Speech), with a screen or without a screen. You may be designing with a \u201c<a href=\"https:\/\/developer.amazon.com\/blogs\/appstore\/post\/f07f0036-656a-4ec9-978b-5455dcdad353\/how-to-shift-from-screen-first-to-voice-first-design\">Voice first mindset<\/a>\u201d as Amazon recommends for the Echo Show, or the entire experience might unfold while the phone is buried in someone&#8217;s pocket.<\/p>\n<p>In many ways, this kind of divergence is familiar if you\u2019ve worked with mobile apps or <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Learn\/CSS\/CSS_layout\/Responsive_Design\">responsive design<\/a>. Personally, the biggest adjustment for me has been the infinite nature of voice. The limited real estate of a screen imposes constraints on the number and types of interactions available. 
With voice, there\u2019s often no interface to guide an action, and speech is more personal than a screen, so requests and utterances vary greatly by personality and culture.<\/p>\n<p>In a voice user interface, a person can ask anything, and they can ask it a hundred different ways. A list is a great example: on a screen it\u2019s easy to display a handful of options. In a voice interface, listing more than two options quickly breaks down. The user can\u2019t remember the first choice or the exact phrasing they need to say to make a selection.<\/p>\n<p>Which brings us to discovery \u2014 often cited as the biggest challenge facing voice designers and developers. It\u2019s difficult for a user to know what features are available, what they can say, and how they have to say it. Teaching a system\u2019s capabilities becomes essential, but it is difficult in practice. Even when you teach a few key phrases early in the experience, human recall of proper voice commands and syntax is limited. People rarely remember more than a few phrases.<\/p>\n<h2>The exciting future of voice<\/h2>\n<p>It\u2019s still early days for voice interactions, and while the challenges are real, so are the opportunities. Voice brings the potential to deliver valuable new experiences that improve our connections to each other and the vast knowledge available on the internet. These are just a few examples of what I look forward to seeing more of:<\/p>\n<ul>\n<li>More design tools for voice. Already <a href=\"https:\/\/www.adobe.com\/products\/xd.html\">Adobe XD<\/a> allows you to quickly build a prototype with voice interactions.<\/li>\n<li>New ways to communicate and break down barriers between people, cultures and languages. <a href=\"https:\/\/www.youtube.com\/watch?v=nHUizVXnUSo&amp;feature=emb_title\">Google Translate shows the vast potential for voice to bring us closer together<\/a>.<\/li>\n<li>Voice also holds the promise to free us from our screens. 
This comes out repeatedly in our research with quotes like this:<\/li>\n<\/ul>\n<blockquote><p><em>\u201cI like that my voice is the interface. When the assistant works well, it lets me do what I wanted to do quickly, without unlocking my phone, opening an app \/ going on my computer, loading a site, etc.\u201d<\/em><\/p><\/blockquote>\n<p>As you can see, we\u2019re at the beginning of an exciting journey into voice. Hopefully this intro has motivated you to dig deeper and ask how voice can play a role in one of your projects. If you want to explore more, have questions, or just want to chat, feel free to get in touch.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the future people will use their voice to access the internet as often as they use a screen. We\u2019re already in the early stages of this trend: As of &hellip; <a class=\"go\" href=\"https:\/\/blog.mozilla.org\/ux\/2020\/06\/designing-for-voice\/\">Read more<\/a><\/p>\n","protected":false},"author":1530,"featured_media":4391,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[318406,9594],"tags":[221,311824,440708,322056,440710],"coauthors":[440705],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/posts\/4384"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/users\/1530"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/comments?post=4384"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/posts\/4384\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/media\/4391"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/media?parent=438
4"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/categories?post=4384"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/tags?post=4384"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blog.mozilla.org\/ux\/wp-json\/wp\/v2\/coauthors?post=4384"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}