The need to test GNOME Shell
For the past couple of weeks I’ve been working with Jon McCann, Jeremy Perry, and Owen Taylor on developing a usability testing plan for GNOME Shell. It’s a work-in-progress, and I wanted to make a quick posting about the effort and where it’s going.
As you may have learned from earlier posts on various blog planets, or from Marina Zhurakhinskaya’s GNOME Journal article last November, GNOME Shell defines the user experience for GNOME 3, both to address today’s more fully-networked computing environment and to reach a wider audience. The way technology affects people’s lives has changed a great deal since GNOME 2 was originally designed, with the introduction of new and disruptive technologies such as micro-blogging, large-scale social networking, and the increasingly widespread availability and usage of web-enabled mobile phones. With so many streams of content and information to manage, it’s hard to keep focused and on task on the desktop. GNOME Shell is designed to account for these changes, to make desktop computing delightful and comfortable in spite of the amount of information we need to process in the course of our work on the computer.
The GNOME Shell design is based upon a few assumptions about users’ desktop actions. While many of these assumptions are informed by seasoned, long-time GNOME hackers and designers working on it such as Owen and Jon, it’s important to define those assumptions and test them to make sure that they are valid. We wouldn’t want GNOME 3 and GNOME Shell to be built upon a shaky foundation. GNOME Shell is finally at the point where it’s stable and we’ve got enough functionality in place to run some tests on it.
Figuring out a usability testing methodology
‘So, Mo,’ you might say, ‘why don’t you do some usability tests of GNOME Shell with your swanky usability lab kit, like you did for Fedora Community?’ Well, Fedora Community is an application that, at this point in time, focuses quite heavily on supporting a particular domain: making Fedora package maintainership easier. To test how successfully Fedora Community does this, sure, you do usability testing to see how it goes when folks using the application try to complete tasks related to package maintainership. GNOME Shell, however, isn’t an application. It’s an environment in which a wide spectrum of users are meant to complete tasks within any number of different domains – within a variety of different applications. If I can’t successfully write a book report on To Kill A Mockingbird while using GNOME Shell, for example, how much of that is related to GNOME Shell, how much is related to my word processor, and how much is related to my internet browser? Or, to analogize, if I built a highway, how helpful would it be to make sure I can make a trip to Grandma’s house, the supermarket, and the local national park using it?
What are the considerations you should make when building a highway?
- What speeds is the highway meant to support? Make sure turns and ramps are banked appropriately to support these speeds, make sure the angles in the turns are safe to make at those speeds, and that the materials used to build the road can withstand that level of usage.
- What weather conditions might the highway be under? (If it’s in New England, think snow and ice! In Oahu, not so much.) Can the chosen material withstand the weather conditions?
- How is the terrain along the planned route? Can the path of the highway be routed such that it avoids dangerous or difficult-to-navigate paths such as along a cliff face?
Can testing the trip to Grandma’s house on the highway expose issues in the design decisions made in building the highway? Maybe if it happens to snow during the trip, you might uncover some issues, but it would really be based on chance. Would you feel safe on a highway tested based on the success of arbitrary trips along it, or would you rather it be tested in a fashion targeted to expose any issues that might be present in the design decisions made in its construction?
I think usability testing GNOME Shell to see how well folks can download and listen to music and send emails to their Grandpa is like testing a highway’s safety based on a trip to Grandma’s house: you might get lucky and incidentally happen to uncover a flaw in the design, but more likely you’re going to uncover issues that aren’t directly related to the design (the car I’m driving isn’t comfortable to ride in!).
My colleague Ben, who is a seasoned usability professional with over 15 years’ experience, suggested a methodology where we make a catalogue of the various design assumptions followed in GNOME Shell’s design thus far and treat them as research hypotheses we could construct targeted tests for. For example, if we assume users find it easier to search for documents than to browse for them in a file tree, we could construct the following test:
- Assign a user the following task: “You’re trying to create a print-out calendar for next month to plan your schedule. Download the OpenOffice template for a calendar available at some URL. Now, open up the calendar file.”
- Observe whether or not the user browses the file hierarchy to open the document up after saving it. How long does it take them to find it when asked to find it? How many clicks? Do they seem annoyed?
- If the user browsed for the file, ask them to search for it. If they searched for it, ask them to browse for it. Again, note how long it took them to find the file – how many clicks – and their general mood.
- Present the user with a quick questionnaire of 3-4 Likert scale-based questions to assess which method they preferred and how they felt about each method.
This way, the tester does not need to rely on the chance that the user is going to open a file on the file system, and can be prepared to observe in a manner focused on a particular design hypothesis when they do so. In this example, the tester can be focused on whether the user finds searching or browsing more comfortable, so they can filter out other design issues that might just happen to crop up in the process and focus on running the test.
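To make the measurements in that example test a bit more concrete, here is a rough sketch of how a tester might jot down each attempt and compare the two methods afterwards. This is only an illustration, not part of the actual test plan: the `Observation` record, its field names, the 1-to-5 rating scale, and the sample numbers are all made up for the sake of the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """One attempt at finding the file (hypothetical record format)."""
    participant: str
    method: str      # "search" or "browse"
    seconds: float   # how long it took to find the file
    clicks: int      # how many clicks it took
    rating: int      # Likert response, assumed 1 (disliked) to 5 (liked)


def summarize(observations):
    """Group observations by method and average each measure."""
    by_method = {}
    for obs in observations:
        by_method.setdefault(obs.method, []).append(obs)
    return {
        method: {
            "mean_seconds": mean(o.seconds for o in group),
            "mean_clicks": mean(o.clicks for o in group),
            "mean_rating": mean(o.rating for o in group),
        }
        for method, group in by_method.items()
    }


# Invented numbers, purely for illustration.
sample = [
    Observation("p1", "search", 12.0, 3, 4),
    Observation("p1", "browse", 30.0, 7, 2),
    Observation("p2", "search", 18.0, 5, 5),
    Observation("p2", "browse", 20.0, 5, 3),
]

summary = summarize(sample)
for method, stats in summary.items():
    print(method, stats)
```

Averages like these only hint at a trend. With a handful of participants, the tester’s notes on mood and annoyance probably matter at least as much as the numbers.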
This, of course, is not to say more scenario-based and task-oriented usability studies are not useful. I just want a test method that will target specific design assumptions so I can run a series of tests on each assumption to help decision-making, rather than test various tasks and hope that some issues related to the particular design assumptions I’m interested in at the time come up. It will be important to do scenario-based and task-oriented usability studies to complement this work, however, especially to identify potentially problematic design assumptions we didn’t think to test.
Ideally, we’d do a longitudinal study so we could better get at how users interact with GNOME Shell over time and after they’ve become familiar with it. These types of studies necessarily take a long time, however, and a lot of investment, so we want to make sure GNOME Shell is ready before we get into that. I think we probably need more confidence in the hypotheses we’ve set out to test first, so we know we won’t be wasting our test subjects’ and our own time with a longitudinal study at this point.
The current test plan status – and how you can help
Right now, with the help of other folks, I’ve written up 36 design assumptions / hypotheses to test. The next step is to figure out test plans for them. Since testing 36 hypotheses is no small feat, I think I’ll pick the top ten in priority and develop test plans for those first. I’ll be working on that next.
The first cut of the test plan is on the live.gnome.org wiki with those 36 hypotheses. I would love to hear what you think about them and if you have ideas for other GNOME Shell design assumptions you’d like to see tested. I would also love to hear your ideas on how we might devise some tests for each hypothesis. I’ve set up a page for commentary on the wiki, so please feel free to add your comments and suggestions! (Although of course you can feel free to leave your feedback in the comments area of this blog post instead.)