As a general rule, the more detailed and functional your prototype, the more detailed the feedback you’ll get. But you don’t always need detailed feedback.
If you’re looking for high-level feedback on a concept or idea, for example — something like, “This is great” or “This seems unnecessary” — then you can get away with a minimally functional prototype, like a paper prototype or clickable wireframes made in Sketch or Figma (referred to as a low- or mid-fidelity prototype).
On the other hand, if you need to test a complex feature or a bunch of connected screens that would be really difficult to simulate with simple sketches or wireframes, you'll have to increase the functionality and detail. In other words, you’d need to create a prototype with a higher level of fidelity.
Design fidelity refers to the level of functionality and detail of a prototype, which can be categorized as low, mid, or high fidelity.
Achieving the right level of fidelity is a balancing act and depends heavily on how much detail you need in your feedback to keep moving forward with confidence. The trick, then, is to fake reality just enough that the user buys it, while putting as little effort into the prototype as possible. As tempting as it may be to present your testers with something beautiful and polished, building something super detailed too early forces you to make a bunch of assumptions and wastes a lot of time, which is, ultimately, exactly what user testing serves to avoid.
Fortunately, we’ve found users are pretty great at playing pretend. Of course, if your prototype — no matter the level of fidelity — is full of bugs or dead ends, it will pull the user out of the experience and ruin the test.
An easy way to avoid this is to run through your test yourself, start to finish, and to recruit a colleague to test the test ahead of time so any rough spots get smoothed out.
Giving users a realistic scenario provides the context they need to successfully perform a task. This is when you’ll want to consider introducing specific constraints that will help frame their thinking.
For example, let’s say you’re doing a test for Airbnb, and you’d like to improve the user experience for guests booking last-minute vacations.
In practice that looks like this:
“Imagine you’re trying to book a trip under a tight timeline. How might you go about getting a house booked in New York for next weekend?”
In the scenario above, the constraints include: booking type (house), location (New York), timeframe (next weekend).
By this point you’ve already defined what you want to understand and created a realistic scenario. Use these details to guide your script.
For example, if you want to find out “if users are able to create an account successfully,” make sure there’s a question in your script for that. Remember: Each statement should have an accompanying question in the script. If we use the example above, we’d want to make sure there was a question like, “How do you think you would create an account?”
As you create the rest of your script, try to imagine what your user will be experiencing and craft your questions accordingly. A good exercise for this is to walk through your prototype step-by-step. The point here is to anticipate what your participants might find difficult so you can ensure you have questions prepared to tease out why they’re finding it difficult.
Thinking about this in advance will give you confidence going into the session, which puts your participants at ease. Not only that, you’ll ask better follow-up questions, keep things on track, and recover more easily if things go way off the rails.
Follow-up questions
You don’t need to be a journalist to come up with great follow-up questions, you just need to be curious. For example, when a participant says they like or don’t like something, ask them:
If a participant is surprised by something, ask them:
At the start of a new flow or feature, ask participants:
If a participant breezes through something important, stop them and ask:
Once a user has completed a flow, ask them:
Developed in 1986, the System Usability Scale (SUS) is a reliable tool for measuring the usability of your product.
When the SUS is used, participants are asked to score the following 10 items on a scale of one to five, with ‘Strongly Disagree’ (1) on one end and ‘Strongly Agree’ (5) on the other:
1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.
Including the SUS in every usability test has the following benefits:
As for when to ask these questions, we suggest the end of every script.
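If you want to turn those ten responses into a single number, the standard SUS scoring method is to subtract 1 from each odd-numbered response, subtract each even-numbered response from 5, add the results together, and multiply the total by 2.5 to get a score out of 100. Here’s a minimal sketch of that calculation in Python (the function name and example responses are ours, purely for illustration):

```python
def sus_score(responses):
    """Convert ten SUS responses (each 1-5) into the standard 0-100 score."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("Expected ten responses, each between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even-numbered items are negatively worded: lower agreement is better.
            total += 5 - response

    # The raw total falls between 0 and 40, so scale it up to 0-100.
    return total * 2.5


# Example: a fairly positive participant
print(sus_score([5, 2, 4, 1, 4, 2, 5, 1, 4, 2]))  # 85.0
```

As a commonly cited rule of thumb, anything above 68 is considered better than average, which makes the SUS handy for tracking whether your product is getting easier to use over time.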
One of the best pieces of advice we can offer is to do a couple dry runs of the test on anyone you can find, whether a coworker or a friend. Don’t worry about getting someone who matches your perfect user — dry runs are for making sure your script makes sense and nothing feels awkward or confusing.
If your real participants will be joining remotely, a dry run also gives you the opportunity to check that the test makes sense without you sitting right next to the participant.
Finally, use the dry run to time your test. We find between 1 and 1.5 hours to be the sweet spot for getting value while keeping your participant engaged — any longer and everyone in the room starts to get tired. Long tests also create a mountain of recordings to parse after the fact, which can be time consuming and repetitive.
You may discover a problem very early on in your tests and wonder if it’s necessary to watch all five of your testers struggle with the same thing. It’s absolutely not, and here’s why: These aren't scientific tests, but rather opportunities to observe real users interacting with a product you're intimately familiar with.
Whereas a true experiment would require a scientific hypothesis which you either prove or disprove with clean data, user tests are a structured way to discover if something sucks — and sometimes you only need to see it once to know it's worth changing.
Plus, making tweaks on the fly allows you to test things faster and uncover deeper insights. In fact, some of the best insights we’ve gotten were the result of subtle changes we made over the course of several tests.
Here’s how that might look in practice:
Let’s say your first two testers really struggle with finding the “My Account” page. Rather than watching three more people struggle for the sake of data, you could instead choose to fix the problem between tests — by changing the colour of the button, for example — and then ask the next user to perform the same task using the new flow. Should the problem be resolved, you’re then free to address the next problem, and then the next.
We've often done 3-4 changes between tests, resolving issues from previous tests and testing the new iterations. This often gets us much further than painstakingly adhering to the original test.
Some things you might consider tweaking in between tests include:
Of course, there are situations in which you might not want to change the test. Like, for example, if you’re trying to persuade a stakeholder with data (e.g., “We need to allocate resources to fixing XYZ because testing shows five out of five people struggled with it.”).
Also, if you identify a problem but the solution isn’t obvious (or it’s obvious but would require big changes), don’t change the test. Instead, let your testers run the test until you have enough data and you’re confident you know what the problem is and why it exists. Only then will the obvious solution be revealed. (We know, it’s cheesy — but it’s also true.)
We've tested entire cohorts where we changed nothing, either because we didn't yet have a deep enough understanding of the problem or because we needed to demonstrate to a stakeholder just how deep the problem runs. We've also run tests where we changed a lot between sessions, and that proved out the solution within a single cohort. In short, use your judgement, and treat user testing as an adaptable tool rather than a set of rules.