
Google Voice Search
Article by John S. Rhodes
Abstract
Google Voice Search allows
you to make a telephone call to Google with a search query and get the
results on a web page. The purpose of this article is to briefly describe
the user experience and investigate the usability implications of this
tool.
First Contact
Yesterday (20-May-2002) Dave
Winer pointed to the Google
Voice Search on Scripting News. I
followed Dave's link and was surprised by the idea: You can search Google
using a telephone and get the results in your web browser. As
you can see in Figure 1 below, the instructions are quite simple.

Figure 1
I immediately started trying to reach Google's automated voice search
system. The first several attempts were failures; I couldn't make it
through and was frustrated. However, by being persistent, I made it
through. I tried "Star Wars", "Chewbacca" and "Rhodes"
for my searches. The "Star Wars" and "Rhodes" searches
went well, but "Chewbacca" failed.
The experience was surreal. Here is how the first search went:
- I called and a pleasant female voice said, "Say your search
query."
- I said "star wars" and the female voice said, "OK,
searching."
- Then the female voice said, "Your results are ready."
- Moments later the pop-up window (Step 3 in the instructions)
refreshed.
- I saw a Google search on "star wars" and my jaw dropped.
The time from when the voice said "Say your search query"
until it said "Your results are ready" was only about 3 seconds.
Very fast, very amazing.
Deeper Thoughts
I started thinking about how Google is able to tie my phone to my web
browser. How are the two linked? I don't have the answer, but I have a
guess and it isn't magic. Instead, I think that only a limited number of
people can reach the Google Voice Search at any one time. When a query is
released into the system the page is refreshed for anyone watching it.
However, when you are on the phone firing off your searches (quickly,
remember!) it just seems like the page is refreshing for you. Nice
cognitive illusion, if that is what they are doing. If not, I really want
to understand the magic.
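To make the guess above concrete, here is a minimal Python sketch of a single results page shared by everyone who is watching. It is only an illustration of my hypothesis, not a description of how Google actually built the system; the class name `SharedResultsPage` and its methods are invented for this example.

```python
class SharedResultsPage:
    """Hypothetical model of the guess: one results page shared by all viewers."""

    def __init__(self):
        self.latest_query = None  # most recent spoken query, from any caller
        self.version = 0          # bumped every time a new query arrives

    def submit_query(self, query):
        """Called when the (hypothetical) voice system recognizes a query."""
        self.latest_query = query
        self.version += 1

    def poll(self, last_seen_version):
        """Called on each browser refresh; returns the new query if one arrived."""
        if self.version > last_seen_version:
            return self.version, self.latest_query
        return last_seen_version, None


page = SharedResultsPage()
page.submit_query("star wars")     # caller A speaks a query
version_a, seen_a = page.poll(0)   # caller A's pop-up window refreshes
version_b, seen_b = page.poll(0)   # an unrelated viewer B refreshes too
```

In this sketch both viewers receive the same query, because nothing ties a caller to a particular browser. If that is what is happening, it would explain why the page only seems to refresh for you.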
On a different topic, some people have indicated that single search words yield results based
on more than one word. As you can see in Figure 2, it seems like Google
does get confused this way.

Figure 2
As I stated previously, it
seems that the search results page is updated constantly based on the
queries being entered into the system. If this is true, and you can watch
what other people are doing, then this fits right into the Voyeur
Web framework. The idea of the Voyeur Web is that people can watch
what other people are doing on the web. As I expected, the Voyeur Web
aspect of the Google Voice Search makes
some people uncomfortable. I'm not really sure if this is a concern
here since we don't know who submitted the query. Indeed, at least a few
people think that it is fun
to watch what others are doing.
Some Questions
I have some questions for
Google about this tool. Suppose that users didn't see the results of their
queries. What if the results were delivered back through a voice interface,
for example? How does Google plan on handling the sponsored links? Will
they tell users verbally that the top links are sponsored? Right now this
isn't an issue since the results are highlighted. But, if the results
cannot be seen, only heard, what are they going to do?
I'm curious if this tool can
be or will be tied to the Google API.
I would assume that a lot of developers would drool over an opportunity to
tap into Google via a voice interface. Perhaps I am slow and someone has
already done something like this. I'd like to know more about it, if it
has been done already.
Craig Saila stated that homonyms
and acronyms caused the Google Voice Search to barf. What does Google
plan to do about this? I could see how the engine could get better at dealing
with acronyms, but I have a hard time understanding how they will deal with
homonyms. For example, if I type the word "crews", that is quite different from
"cruise", and Google would have no problem handling them.
However, when I say either word, they sound the same. How will Google
choose? Or, will they list results for both words? This is an interesting
question, I think, because there are a
lot of homonyms.
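One of the options raised above, listing results for both words, could be sketched roughly as follows. This is a toy illustration under my own assumptions: the `HOMOPHONES` table, the function names, and the `fake_search` stand-in are all hypothetical, and a real system would need a full pronunciation dictionary rather than a hand-written table.

```python
# Hypothetical table of words that sound alike (illustration only).
HOMOPHONES = {
    "cruise": ["cruise", "crews"],
    "crews": ["cruise", "crews"],
}


def candidate_queries(recognized_word):
    """Return every spelling that sounds like the recognized word."""
    return HOMOPHONES.get(recognized_word, [recognized_word])


def merged_results(recognized_word, search):
    """Run `search` once per candidate spelling and group results by spelling."""
    return {
        spelling: search(spelling)
        for spelling in candidate_queries(recognized_word)
    }


def fake_search(term):
    # Stand-in for a real search backend (assumption for this example).
    return [f"result for {term}"]


grouped = merged_results("crews", fake_search)
```

Here `grouped` holds one result list per spelling, so the page could show both sets and let the user pick. Whether that is better than guessing a single spelling is exactly the open question.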
I'm curious as to why Google
decided to use a female voice. There is nothing wrong with the voice or
the fact that it is female. Did they just flip a coin? Did they decide to
use a female voice because research indicates that female voices are more
pleasant?
Closed Loop
The folks at Google are smart.
However, I wonder how much research actually drives their decision making.
This is very much an open experiment on the web and the people will let
them know if they like it or not. It is applied research to the core. With
this in mind, I wonder if we should revisit the research on voice
interface design and usability. This is a hot technology if you consider
how it might be used with cell phones and mobile devices. I love it when
research leads to technology and technology drives research. The loop is
closed at both ends and everyone wins.
The Google Voice Search is
too easy to use. No menus. No keys to push. And, it is fast. Results in
less than three seconds. Their experiment is so simple. Get people to call
and give them results on the web. Close the loop. Make it easy and people
will use it and talk about it.
Imagine using this search
interface from your vehicle. Imagine that you need directions to a party
but you only have a phone number. Well, with the Google Voice Search, the
problem might be solved. First, you call Google and give them a telephone
number as a search query. When you do this, Google will try to find that
number and match it to Yahoo! Maps and MapQuest, as Figure 3
illustrates.

Figure 3
Next, once Google finds the
results, you could choose to get the directions read to you using
text-to-speech conversion. In other words, Google tells you how to get to
the party using the directions supplied by MapQuest. The loop is closed.
You supply one piece of information, Google does the rest, and you get to
the party on time. Perhaps you are even able to get the directions and the
map sent to your email account, which might be available via your phone.
If you imagine this scenario with a wireless PDA, things get really
interesting and many loops are closed. Maybe this is a revenue stream for
Google: You give them your cell phone number or (gasp!) your credit card
number and they deliver the results.
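The party scenario can be sketched as a small pipeline: phone number in, spoken directions out. Everything here is hypothetical and stubbed for illustration. The `DIRECTORY` table, the address, and the `directions_to` function stand in for a real phone listing and a real mapping service, and the final string is what would be handed to a text-to-speech engine.

```python
import re

# Hypothetical directory; a real lookup would query a phone listing service.
DIRECTORY = {"5551234567": "123 Example St, Springfield"}


def normalize_phone(spoken):
    """Keep only the digits of a recognized phone number."""
    return re.sub(r"\D", "", spoken)


def lookup_address(phone):
    """Return the address listed for a phone number, or None if unknown."""
    return DIRECTORY.get(normalize_phone(phone))


def directions_to(address):
    # Stand-in for a mapping service; returns placeholder steps.
    return [f"Start driving toward {address}", "You have arrived"]


def spoken_directions(phone):
    """Close the loop: phone number in, text for a speech engine out."""
    address = lookup_address(phone)
    if address is None:
        return "Sorry, no address was found for that number."
    return ". ".join(directions_to(address)) + "."


message = spoken_directions("(555) 123-4567")
```

The point of the sketch is the shape of the loop: the caller supplies one piece of information and each later step is filled in by a service.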
If you think the above
scenario is a fantasy, you're wrong. Google
is working with BMW to do this kind of thing. I thought that I had thought
of this first, but I didn't. BMW and Google are working together so that search
terms can be spoken into the car's speakerphone, and search results are
sent to a built-in screen in the car or to a user's mobile phone or PDA. This
stuff is coming, folks. Did you expect Google to be on the edge like this?
Google isn't stupid. Not at
all. They have big
plans and they are on a rampage. Expect more and more good stuff to
come out of the Google Laboratory.
With more and more ideas being presented, more loops will close. Of
course, I think that we're going to see a lot more from Google. Stay
tuned!
p.s. People I know seem to
like Google Sets and Google
Glossary better than Google Voice Search. They think those tools are
more immediately practical.
Comments?
Please send them to me: john@webword.com
I want to know what you think about this article.