Don’t Talk about Users

… it might sound counterintuitive for a project so intently focused on usability and creating a mainstream browser, but in all seriousness I believe that Mozilla needs to (caveat: “for the most part”) stop talking about users.

Last week I attended a workshop about usability in open source projects. This gave me a chance to meet with designers from a number of other projects and organizations including Ubuntu, KDE, Red Hat and OpenOffice.org, as well as members of the HCI research community. We each wrote a position paper prior to attending, and the papers that Andy Ko from the University of Washington and I wrote ended up complementing each other well. Andy articulates the problem, and I provide a solution.

First, the problem:

100 contentious Firefox bug reports were analyzed for distinct uses of the word “user.” The results show that developers use authoritative words (such as allow, educate, and require) to describe what software does for users. Most statements involved confident speculation about what users need, expect and do, whereas a minority of statements demanded evidence for such unsubstantiated claims. The results also show that when describing users, developers describe them in general terms, rather than referring to particular use cases or user populations. These results suggest that, at least in the broader Firefox developer community, developers rely largely on stereotype and instinct to understand user needs. [From How do Open Source Developers Talk About Users]

Now this sounds pretty bad: we are confidently proclaiming speculative statements that lack any form of evidence. It’s perhaps not quite as dire as it initially sounds, since while we are falling back to stereotype (generally “users who are not developers”) and instinct (“we should make this easier”), at least the community is debating changes with some notion of users in mind. However, when you invoke a hypothetical user for use in debate (often someone’s mother), you can pretty much project whatever you want onto that hypothetical user in an attempt to win an argument. Stereotype and instinct lack precision, and they make interface design seem arbitrary and subjective.

Here’s a proposed solution to this problem which I co-authored with Daniel Schwartz (you can also read it in the standardized academic conference format). Instead of talking about users, Mozilla should instead focus debate on core user experience principles.

Using a Distributed Heuristic Evaluation to Improve the Usability of Open Source Software

Building tools to enable a large scale distributed Heuristic Evaluation of usability issues can potentially reshape how open source communities view usability, and educate a new generation of user experience designers. We are exploring adding the ability to perform a distributed Heuristic Evaluation into the Bugzilla instance used to develop Firefox. Ideally this approach will build more of a culture around HCI principles, and will create a framework and vocabulary that will cross pollinate to other open source projects.

Introduction

When contemplating how to increase the influence of user experience design in open source communities, a common approach is to attempt to “increase the involvement and visibility of UX professionals” [1]. However, this paper asks a different question: how can we convert current developers in open source projects to have a skill set equivalent to what academia or a corporation would consider to be that of a formally trained user experience professional? The approach we propose consists of embedding HCI concepts and practices directly into the tools that control an open source community’s workflow.

Beyond controlling an open source community’s process and workflow, tools also indirectly shape the community’s values and ideals. This is important, because if social currency in the community is inherently linked to one’s ability to make the software better, the concept of better must be expanded to encompass “easier to use.”

Quantitatively Measuring Usability

Measurements like the time it takes an application to load, the amount of memory used, or the load on the CPU are all trivial to calculate, and wonderfully quantitative. One of the reasons open source communities tend to discount usability (both in practice and in artifacts like the severity descriptions in Bugzilla) is an inaccurate view that usability is an amorphous and subjective thing that simply can’t be scientifically quantified and measured. However, measuring an application’s usability is an area where previous HCI research can make a strong and very significant contribution to open source development.

The usability inspection technique of Heuristic Evaluation, which was introduced by Jakob Nielsen [2,3,4], has emerged as one of the most common ways for professional user experience designers to evaluate the usability of a software application. Heuristic Evaluations are extremely useful because they formally quantify the usability of a software application against a set of well defined and irrefutable principles. Usability violations can be quantified individually: either an interface supports undo or it does not; either an interface is internally consistent or it is not; and so on. Usability violations can also be quantified in aggregate: the software application currently has 731 known usability issues. Additionally, by building the tracking system on a set of agreed upon principles, much of the debate on the level of “there is no right or wrong with UI / every user is entitled to their personal opinion / all that matters is the ability to customize” which is currently found in open source software development communities may be significantly reduced. Usability heuristics will help ground these debates, just as currently no one in an open source community argues in favor of data loss, or in favor of crashing.

Injecting HCI Principles into Bugzilla

Adapting an open source community’s bug tracker to capture usability issues defined by a set of specific heuristics can reshape the way developers think about usability. Just as open source development communities currently have a shared vocabulary to describe good and bad with concepts such as performance, data loss, and crashing, usability heuristics can introduce additional concepts, like consistency, jargon, and feedback. All of these concepts, covering both the underlying implementation as well as the user interface, can now have an equal potential to impact the software application at any level of severity, from trivial to critical.

Modifying a bug tracking system to track a Heuristic Evaluation of software is reasonably straightforward. Each issue needs to be able to be associated with the specific usability heuristic being violated (for example: “using the term POSTDATA in a dialog is technical jargon”). We plan to utilize Bugzilla’s keyword functionality, similar to how current bugs can be flagged as violating implementation level heuristics, like data loss. Since the evaluations will be performed by contributors who likely will not have any additional interface design training, it is important that each heuristic is very clearly defined with specific examples and detailed explanations. Additionally, allowing contributors to view all of the bugs in the software marked as the same type of issue, both current and resolved, serves as an effective way for them to further learn about the heuristic.
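
As a rough illustration of the keyword approach described above, here is a minimal Python sketch. It does not use Bugzilla's real data model or API: the `Bug` class, the helper functions, the sample bugs, and the `ux-jargon` keyword name are all assumptions made up for this example (`ux-error-prevention` and `dataloss` are borrowed from the discussion on this page). It only shows the two operations the paragraph relies on: listing every bug that carries a given heuristic keyword, resolved or not, and counting open violations per heuristic.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Bug:
    """Hypothetical stand-in for a bug record; not Bugzilla's actual schema."""
    bug_id: int
    summary: str
    status: str  # "NEW" or "RESOLVED" in this sketch
    keywords: set[str] = field(default_factory=set)


def bugs_with_keyword(bugs, keyword):
    """Return every bug tagged with `keyword`, open or resolved.

    Reading through these is how a contributor could learn what a
    heuristic looks like in practice.
    """
    return [bug for bug in bugs if keyword in bug.keywords]


def open_violations_by_keyword(bugs, prefix="ux-"):
    """Count unresolved bugs per usability keyword (assumed 'ux-' prefix)."""
    counts = Counter()
    for bug in bugs:
        if bug.status != "RESOLVED":
            counts.update(k for k in bug.keywords if k.startswith(prefix))
    return counts


# Illustrative data only; the ids and summaries are invented.
BUGS = [
    Bug(1, "Dialog uses the term POSTDATA", "NEW", {"ux-jargon"}),
    Bug(2, "Form contents lost after a failed submission", "NEW",
        {"ux-error-prevention", "dataloss"}),
    Bug(3, "Error page shows a raw protocol name", "RESOLVED", {"ux-jargon"}),
]

if __name__ == "__main__":
    for bug in bugs_with_keyword(BUGS, "ux-jargon"):
        print(bug.bug_id, bug.status, bug.summary)
    print(dict(open_violations_by_keyword(BUGS)))
```

A single bug can carry both an implementation keyword and a usability keyword, which is the point of treating the two kinds of heuristics the same way.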

We are now working to embed the functionality needed for a distributed Heuristic Evaluation into the Bugzilla instance used to develop Firefox and Thunderbird. These specific modifications may spread to a variety of other open source projects, as Bugzilla is currently used by communities including the Linux Kernel, GNOME, KDE, Apache, OpenOffice.org and Eclipse [5]. Ideally embedding HCI principles into development tools will also embed the ideals into the community. Similar to other forms of bugs, there will be a social incentive for contributors to locate and classify violations, and there will be a social incentive for other contributors to resolve them. As open source contributors travel between different communities and projects, the usability heuristics will also see a similar cross pollination between open source communities. A shared vocabulary will emerge across open source projects, allowing for clearer communication and debate.

Side Effects of Distributed Heuristic Evaluation

Today the process of Heuristic Evaluation is normally completed in corporations and academia by a small number of designers, who are extremely well practiced at identifying usability issues. However, it is worth noting two important aspects of the Heuristic Evaluation method from when it was first introduced:

Education – First, the method of Heuristic Evaluation has its roots not in the functional purpose of evaluating usability, but rather in the even more basic purpose of teaching usability. We see this in Nielsen’s 1989 SIGCHI Bulletin article, Teaching User Interface Design Based on Usability Engineering [2], where Heuristic Evaluation was introduced as part of the curriculum for a master’s degree in Computer Science. This is still true today: the road to becoming a good user experience designer begins with mastering the identification of well defined heuristics.

Power in Numbers – The second important aspect of Heuristic Evaluation is that the number of evaluators was quickly found to play a major role in how successful it was. Nielsen wrote in 1990 that “evaluators were mostly quite bad at doing such heuristic evaluations… they only found between 20 and 51% of the usability problems in the interfaces they evaluated. On the other hand, we could aggregate the evaluations from several evaluators to a single evaluation and such aggregates do rather well” [3]. For large open source projects, Bugzilla instances often have thousands to hundreds of thousands of users, which provides a very large pool of potential evaluators.
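
To make the arithmetic behind “power in numbers” concrete, here is a small Python sketch. It rests on a simplifying assumption that is not taken from the cited paper: every evaluator independently finds any given problem with the same probability. Under that assumption, the chance that at least one of n evaluators finds a problem is 1 - (1 - p)^n. The function name `expected_coverage` is made up for this example, and real evaluators are neither independent nor equally skilled, so treat the numbers as a rough guide only.

```python
def expected_coverage(p_single: float, n_evaluators: int) -> float:
    """Chance that at least one of n evaluators finds a given problem.

    Assumes (as a simplification) that evaluators are independent and
    each finds the problem with the same probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_evaluators < 0:
        raise ValueError("n_evaluators must not be negative")
    return 1.0 - (1.0 - p_single) ** n_evaluators


if __name__ == "__main__":
    # 0.20 and 0.51 are the low and high single-evaluator rates quoted above.
    for p in (0.20, 0.51):
        for n in (1, 3, 5, 10):
            print(f"p={p:.2f} n={n:2d} coverage={expected_coverage(p, n):.1%}")
```

With five evaluators this simple model gives roughly 67% coverage at the low end (p = 0.20) and roughly 97% at the high end (p = 0.51), which is consistent in spirit with the observation that aggregates “do rather well.”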

Conclusion

In open source software development, the educational and distributed aspects of a Heuristic Evaluation are critically important. While the majority of open source projects currently lack user interface designers capable of performing a perfect Heuristic Evaluation in isolation, that’s irrelevant. The collaborative nature of open source projects allows for a group of contributors to effectively compete with a formally trained user experience professional by aggregating their abilities. And similar to all of the other ways in which people contribute to open source projects, there is a mutually advantageous feedback loop: the more effort a contributor puts into improving the software, the more they are able to increase their own skill set. Hours spent performing heuristic evaluations and brainstorming ways to address usability issues will allow a new generation of user experience designers to emerge in open source communities, just as rock star software developers are currently forged there.

References

1. Schwartz, D. and Gunn, A. 2009. Integrating user experience into free/libre open source software: CHI 2009 special interest group. CHI EA ’09. ACM, New York, NY, 2739-2742.
2. Nielsen, J. and Molich, R. 1989. Teaching user interface design based on usability engineering. SIGCHI Bull. 21, 1 (Aug. 1989), 45-48.
3. Nielsen, J. and Molich, R. 1990. Heuristic evaluation of user interfaces. CHI ’90. ACM, New York, NY, 249-256.
4. Nielsen, J. 1994. Enhancing the explanatory power of usability heuristics. CHI ’94. ACM, New York, NY, 152-158.
5. Bugzilla Installation List, http://www.bugzilla.org/installation-list

Next Steps

Getting this set up for bugzilla.mozilla.org is now being covered in bug 561262. The initial set includes 17 core principles, which we may expand or modify over time as we apply these to all of the user experience bugs that we are currently tracking. The Firefox UX team will be pushing on these usability heuristics pretty heavily during the development of Firefox 4. Also some other significant open source projects are looking into using a similar approach for their bug databases, so hopefully this shared vocabulary for usability will begin to cross pollinate across the open source community.

10 comments

  1. Robert O'Callahan

    “Stereotype and instinct” sound like loaded terms to me. You could just as easily say “experience”, which sounds a lot better…

  2. Really sorry to everyone that lost a comment due to reCaptcha being configured incorrectly, everything should be working now. (and at some point Firefox should be caching form submissions in history kind of like sent email, serious dataloss and it’s time for some ux-error-prevention!)

  3. A couple of thoughts that I had while reading this… don’t know if they make sense as questions.

    First, a lot of this stuff reminds me of stuff I’ve read in justification of the changes made by Microsoft to the Office 2007 UI. It’s only anecdotal, but pretty much everyone I’ve heard talk about it hates the new ribbon stuff and even after using it for some time finds it harder. I’m hoping this kind of work doesn’t usually end up with that result (or maybe that it’s actually not as bad as I’ve heard). Apparently interfaces that you “explore” are good, but surely it’s not good when you spend a long time exploring the interface when you want to do something.

    Also, looking at the descriptions, there seem to be assumptions that the designer/developer knows what the task/function is. Without hunting for specific examples, I think in some of the contentious bugs I’ve seen, the argument has been about what the task is (or tasks are).

    Finally, your spreadsheet mentions “user” 33 times, and your solution 12 times, so you’re not doing a very good job of avoiding talking about users…

  4. OK, this is nice and all, but completely unhelpful to me. I have learned from this article that using stereotypes is bad, and at the same time that usability isn’t completely random (and probably individual?) but I still don’t know what I can do better personally, just that there is some project about adding something to bugzilla that may be able to help find when I’m doing something wrong, but nothing about how I can do it right. :(

  5. >usability isn’t completely random
    >(and probably individual?)

    Usability is no more individual than topcrash or dataloss are individual. If someone files a bug based on one of the 17 ux-* keywords now active in our instance of bugzilla (and the principle is actually in fact being violated), there will be a direct correlation between fixing the issue and improving the product’s user interface.

    >I still don’t know what I can do better personally

    I’ll be devoting a large number of posts in the future to explaining each of the new keywords, and providing examples of them being violated in Firefox and other applications. Things people can personally do include trying to file as many bugs as possible based on these keywords, fixing issues that have been identified, and using these specific keywords when debating if a particular interface is good or bad.

  6. >You could just as easily say “experience”
    >which sounds a lot better…

    Sounds better, but has the same underlying problems. Experience is vague, hard to reproduce, and hard to communicate. To be clear I’m not saying that we are creating bad interfaces while discussions are based on hypothetical users and experience, I’m saying that these discussions lack precision. Additionally they do not frame debate in a way that can cross pollinate across other open source projects (unless one of us with lots of experience literally goes over there). In reality the vast majority of user experience designers don’t bother to mention these specific keywords in discussions with other UX designers, everyone has them internalized and for someone who has spent a lot of time doing design they sound super obvious. But there are a lot of advantages to quantifying the obvious with a core vocabulary so we can finally put an end to the notion that usability is about personal preference, or as subjective as modern art.

  7. Hi Alex,

    A small suggestion to improve the feedback for plugins such as Adobe Flash. Why not use an image with a touch of humor in case of error? –> http://www.flickr.com/photos/ftosete/4553369959/

    from Madrid, cordial greetings

  8. This is really great; I think introducing relatively concrete UX concepts and vocabulary into our development environment is likely to help improve what we produce significantly. Thanks, Alex!

  9. I like it. Would Bugzilla users at commercial software shops get to use these categories? What would you do with all the false positive usability issues? As in, what do you do when developers submit usability bugs that really don’t seem to be problems at all? One solution might be to allow developers to vote on those usability issues that they think are most severe.

  10. I think surveys are a great way to collect data. For Firefox, I think surveys are maybe published.