Recently we ran two usability studies to gather comparative benchmarking data: one on an existing design, and one on a new design we wanted to compare it against.
Since the new design had yet to be released, we created two InVision prototypes, one for each design. We then wrote 4 tasks and 5 questions and built two identical studies, one per prototype.
Next, we set about running 100 participants through the prototypes, 50 on each, to see if our new design delivered a better overall experience.
Disappointment Meets Confusion
An hour after launch we had the results back from the two studies, so I set about digging into the reports. It didn’t take me long to see that the average page views per task were higher for our new design.
As I’m sure many of you can attest, it hurts a little when something you’ve put a lot of effort into, and believe is better, proves to be worse. But in the interest of creating a better piece of software, I swallowed my pride and looked at some of the highest page-count participants to see how we’d failed.
I focused on two participants who were large outliers, each having recorded roughly three times as many page views as their next-nearest participant. These two participants alone were enough to elevate the average page count to the point where the new design was outperformed by the old one.
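The effect of those two outliers can be sketched with made-up numbers (these are not the study’s actual counts): a couple of extreme values drag the mean up sharply, while the median barely moves.

```python
# Hypothetical page-view counts per participant, for illustration only.
page_views_old = [6, 7, 5, 8, 6]        # old design: no outliers
page_views_new = [4, 5, 4, 5, 15, 16]   # new design: two ~3x outliers

def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

# The mean is dragged up by the two outliers; the median is robust to them.
print(mean(page_views_old), mean(page_views_new))      # 6.4 vs ~8.17
print(median(page_views_old), median(page_views_new))  # 6 vs 5.0
```

With these numbers the new design looks worse on the mean even though the typical participant viewed fewer pages, which is why a couple of outliers deserve a closer look before trusting the averages.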
As I watched the videos of these two participants I only became more confused.