July, 2007

Jul 07

The Graphical Keyboard User Interface

WIMPy and the Terminal



The history of user interfaces can be summarized very briefly as two distinct eras: the command line, followed by the graphical user interface. Interactions on the command line are very fast, but the set of possible commands is not discoverable. GUIs are essentially the opposite on both counts: interactions are slower, but possible commands are given visual affordances, and icons attempt to convey those commands through metaphors.

The GUI is largely considered superior to the command line interfaces that predated it, but that isn’t entirely true. For instance, while I was in college a majority of students preferred Pine (screenshot) over graphical email clients like Outlook. A group of students in a human-computer interaction class I took did an in-depth analysis of the usability of each application. They found that across a wide variety of metrics, like simplicity, system response time, and (most critically) overall time on task, Pine knocked Outlook’s toolbar-customizing-dialog-popping-drag-and-drop socks off.


Instead of trying to conclude which is superior, a GUI or a keyboard-based interface, it is important to note the specific tradeoffs each interface currently makes in terms of output bandwidth and input bandwidth.

Modern graphical user interfaces are clearly higher bandwidth than text-based command line interfaces in terms of output, but consider the bandwidth of input:

Standard GUIs, with their drop down menus, check buttons, and tree-lists just cannot compare to the range of options that a text interface gives effortlessly. In just five alphanumeric characters, you can choose one out of 100,000,000 possible sequences. And choosing any one sequence is just as fast as any other sequence (typing five characters takes roughly 1 second). I challenge you to come up with a non text-based interface that can do as well. (Command Line for the Common Man: The Command Line Comeback)
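As a rough check on the quoted figure, here is a small sketch that counts five-character sequences. The quote does not say which alphabet it has in mind, so both alphabets below are assumptions: 36 symbols if case is ignored, 62 if it is not. The quoted 100,000,000 falls between the two results, so it reads best as an order-of-magnitude estimate.

```python
import string


def sequence_count(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` symbols drawn from an alphabet."""
    return alphabet_size ** length


# Assumed alphabets; the quoted passage does not specify one.
CASE_INSENSITIVE = len(string.ascii_lowercase + string.digits)  # 26 letters + 10 digits
CASE_SENSITIVE = len(string.ascii_letters + string.digits)      # 52 letters + 10 digits

if __name__ == "__main__":
    for name, size in [("case-insensitive", CASE_INSENSITIVE),
                       ("case-sensitive", CASE_SENSITIVE)]:
        print(f"{name}: {sequence_count(size, 5):,} five-character sequences")
```

For comparison, a menu would need tens of millions of entries to offer the same number of choices, which is the point the quote is making about input bandwidth.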


Graphical user interfaces often provide keyboard shortcuts to serve as accelerators. These shortcuts are not interfaces in themselves, however; they simply serve as hooks into various parts of the GUI. For instance, control-D in Firefox simply pops up the bookmark creation dialog box, and suddenly the user has to go back to using the mouse (or awkward tabbing) to complete the task.

The Best of Both Worlds

Over the last six months, I’ve been thinking a lot about the work of two designers: Nicholas Jitkoff (Blacktree, creators of Quicksilver) and Aza Raskin (Humanized, creators of Enso). Both have designed user interfaces that occupy the middle ground between command line interfaces and graphical user interfaces, and both applications are a joy to use.


Unfortunately, these types of hybrid keyboard/GUI user interfaces have gone largely unexplored by interaction designers. Aside from feed and label navigation in Google Reader, I don’t know of many other applications that currently use these types of incredibly streamlined graphical interfaces, designed solely for keyboard input.

How Firefox Could Potentially Leverage Graphical Keyboard User Interfaces

Here are some ideas I’ve had about how several different Firefox features could be designed using a graphical keyboard user interface. Please note that these are all only conceptual mockups, and we currently have no official plans to implement these features for Firefox 3 (although we may at some point release a prototype extension through Mozilla Labs). If you are an extension developer and are interested in contributing to a project like this, please email me or leave a note in the comments.

All of these mockups show interfaces that are entirely keyboard driven. A keyboard shortcut launches the UI, and the UI is later dismissed by either selecting an item using the arrow keys and hitting enter, or by hitting escape. These interfaces are all modal, and when invoked they occupy large amounts of space on the screen.
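The interaction model described above can be sketched as a small state machine. This is only an illustration, not Firefox code: the `ModalPicker` class and the key names (`"shortcut"`, `"up"`, `"down"`, `"enter"`, `"escape"`) are hypothetical stand-ins for whatever a real implementation would receive from the toolkit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModalPicker:
    """Hypothetical model of a modal list driven entirely by the keyboard."""
    items: List[str]
    index: int = 0
    is_open: bool = False
    result: Optional[str] = None

    def handle_key(self, key: str) -> None:
        if not self.is_open:
            # Only the launch shortcut (for example control+k) opens the UI.
            if key == "shortcut":
                self.is_open = True
                self.index = 0
                self.result = None
            return
        last = max(len(self.items) - 1, 0)
        if key == "down":
            self.index = min(self.index + 1, last)
        elif key == "up":
            self.index = max(self.index - 1, 0)
        elif key == "enter":
            # Selecting an item dismisses the UI and records the choice.
            if self.items:
                self.result = self.items[self.index]
            self.is_open = False
        elif key == "escape":
            # Escape dismisses the UI without choosing anything.
            self.is_open = False
        # Any other key is ignored while the UI is open, since it is modal.


if __name__ == "__main__":
    picker = ModalPicker(["Google", "Wikipedia", "Amazon"])
    for key in ["shortcut", "down", "down", "up", "enter"]:
        picker.handle_key(key)
    print(picker.result)
```

Because the picker is modal, it has exactly two exits, which keeps the keyboard handling small enough to reason about at a glance.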

For each of these mockups you can click through for a larger version.

Searching the Web (control+k, or alt-alt)

Of all of these mockups, I think keyboard-based Web search would be the most useful. This mockup also features some favicon upscaling code I wrote for another Mozilla Labs project.


In addition to Web search, the “Bookmarks and History” search will likely be more efficient than the current WIMPy ways of accessing bookmarks in Firefox:

The move back to language started with web search engines in general, with Google placing the capstone when its name became the household verb for “typing to find what you want”. In fact, Googling is almost always faster than wading through my bookmark menu (which says there is something wrong with using menus as a mechanism for accessing bookmarks). (Command Line for the Common Man: The Command Line Comeback)
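To make "typing to find what you want" concrete for bookmarks and history, here is a minimal sketch of type-to-filter matching. It assumes a simple rule (every word typed must appear in the page title or URL, ignoring case); the `Page` type, the `filter_pages` function, and the sample entries are all hypothetical, and a real implementation would likely also rank results by how often and how recently each page was visited.

```python
from typing import List, NamedTuple


class Page(NamedTuple):
    title: str
    url: str


def filter_pages(pages: List[Page], query: str, limit: int = 5) -> List[Page]:
    """Return up to `limit` pages whose title or URL contains every query word.

    An empty query matches everything, so the first `limit` pages are returned.
    """
    words = query.lower().split()
    matches = [
        page for page in pages
        if all(w in page.title.lower() or w in page.url.lower() for w in words)
    ]
    return matches[:limit]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    pages = [
        Page("Mozilla Labs", "https://example.org/labs"),
        Page("Pine mail client", "https://example.org/pine"),
        Page("Quicksilver tips", "https://example.org/quicksilver"),
    ]
    for page in filter_pages(pages, "pine mail"):
        print(page.title)
```

The list would be re-filtered on every keystroke, so a few characters are usually enough to narrow hundreds of bookmarks down to one.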

Switching Tabs (control+tab)


Navigating Recent History


Tagging Pages


Acting on Microformatted Content



-Just because the command line predated the graphical user interface doesn’t mean interfaces based on windows, icons, menus and pointers are always superior to interfaces based around using the keyboard for input.

-Designing interfaces based solely around the mouse and standard GUI widgets, and adding keyboard accelerators as an afterthought, does not always produce the most effective and streamlined user interfaces for advanced users.

-Interaction designers should consider designing keyboard-based graphical user interfaces, to take advantage of both high bandwidth input and high bandwidth output at the same time.
