Looking back, looking forward

2009 has been kind.

Professionally it’s been unsurpassed, despite the recession. Clearleft have grown to double figures, moved into a studio with decent wallspace, produced some great work, run two successful conferences and been humbled to be voted Agency Of The Year in the .net awards.

(Personally, I nominate UXCampLondon, Cardiff v Arsenal away and various ATPs, weddings and zombie crawls as additional highlights.)

As the office winds down, colleagues jet off overseas and lunches linger into the afternoon, thoughts turn to gifts and time off. Since I opt out of the commercial trappings of the season, I’ve chosen this year to make my annual donation to WWF and Reprieve, two fantastic clients I’ve worked with this year. I’ll be spending a unique Christmas on a military base. In lieu of ubiquitous WiFi, it’ll be an opportunity to spend time with family, read, write and get my breath back.

2010 will be a year of abundance – and the first casualty, sadly, will be my carbon footprint. I have three speaking gigs booked so far (South by Southwest, the IA Summit and UX London) and as a punter I’m hoping to grab a seat at Paris’s Content Strategy Forum, Berlin’s UXCampEurope and New York’s Design for Conversion. But of course 2010 is likely to be dominated by the book. Emails are a-flying and chapters are a-forming. More on that soon.

Thanks for sharing this year with me and here’s to the next one! Merry Christmas.

{PS. It’s also the done thing to list your favourite albums of the decade. In no order, I’ll throw out Michigan, Tarot Sport, Change, Turn On The Bright Lights and Leaves Turn Inside You.}

Cennydd Bowles
I blame the designer

[In which Cennydd has a downright sense of humour failure over a silly web comic.]

Here’s an excerpt of a comic that recently did the rounds in the web design community.

You know what? I’m tired of this attitude.

Clients From Hell is admittedly pretty funny. Sometimes clients say stupid things; but hey, so do designers. I’ve said lots of them myself. But this sort of thing is different. It’s not an amusingly misguided email. Rather, it epitomises a harmful arrogance and entitlement that pervades the design community. It carries a bitter subtext that clients are idiots with no design skill, and it’s a designer’s duty to disempower them by any means possible.

And I’m tired of it. Of course clients aren’t skilled designers; that’s why they had the foresight to hire us. But you know what? They know business. They’re as passionate, committed and talented as anyone. Many of them put their livelihoods on the line to make the web happen. And let’s be blunt: they also pay our salaries.

If a web design project goes to hell this way, I usually blame the designer. He wasn’t skillful enough to make the situation work. He didn’t provide the force of argument required, couldn’t handle the politics, or couldn’t convince the client of the value of good design. On the rare occasion when the relationship with a client goes entirely rotten, the designer should end the relationship gracefully rather than passive-aggressively working to rule.

Unconvinced? I suggest you read Scott McCloud’s excellent post about criticism and the equally insightful comment from Mike L:

“The most common misconception about criticism is that one has to be on a similar skill level as the creator in order to have a valid opinion. I read stuff from many different artists from many different disciplines who cannot abide ramblings of people that couldn’t compete with them in some way. If said person is not an artist, their opinion doesn’t matter. But isn’t art, all art about communication? And who is the artist generally trying to communicate with? … My #1 critic is someone who cannot draw at all. He tells me things I can’t see because I overthink them as an artist.”

(Oh, and here’s what ‘pop’ means.)

Cennydd Bowles
Statistical significance & other A/B pitfalls
Photo by snellgrove.

Last week I tossed a coin a hundred times. 49 heads. Then I changed into a red t-shirt and tossed the same coin another hundred times. 51 heads. From this, I conclude that wearing a red shirt gives a 4.1% increase in conversion in throwing heads.

A ridiculous experiment (yes, I really did it) with a ridiculous conclusion, yet I sometimes see similarly unreliable analysis in A/B testing.

It’s logical and laudable that designers should seek data in our quest for verifiability and return on investment. But data must be handled with care, and mathematical rigour isn’t a common part of a designer’s repertoire.

Here’s an example from ABTests.com, a worthwhile project that I feel slightly bad to pick on.

The two versions are subtly different:

  • Version A: Upload button bold, Convert button bold, Convert button has a right arrow
  • Version B: All buttons regular weight, no right arrow on Convert button

Although minor changes can cause major surprises, I wouldn’t expect these small differences to improve the form’s usability. With the caveat that I don’t know the users or product, I’d even speculate that Version B could perform worse since it reduces the priority of the calls to action and removes the signifier of progression.

The designer claims that version B showed a 30.4% conversion improvement in an A/B test. Here’s why this isn’t quite accurate.

The role of chance

Any A/B test is a trial, so called because we’re observing evidence gained by trying something out. I can never truly know that there’s a 50% chance of a coin landing as a head or a tail – I can only run trials and observe the evidence. Similarly, we can never truly know that a design leads to higher conversion – we can only run trials and observe the evidence. If that empirical evidence is strong enough, we conclude that the design is an improvement. If not, we don’t.

To be valid, trials need to be sufficiently large. By tossing my coin 100 or 1000 times I reduce the influence of chance, but even then I’ll still get slightly different results with each trial. Similarly, a design may have 27.5% conversion on Monday, 31.3% on Tuesday and 26.0% on Wednesday. This random variation should always be the first cause considered of any change in observed results.
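To see this variation for yourself without wearing out a coin, here is a minimal Python sketch. It assumes a perfectly fair simulated coin; the function name `toss_trials` and the seed value are my own inventions for illustration, and the exact counts will depend on the seed you choose.

```python
import random


def toss_trials(num_trials, tosses_per_trial, seed=None):
    """Simulate repeated coin-toss trials and return the heads count for each."""
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    results = []
    for _ in range(num_trials):
        # Each toss lands heads with probability 0.5 (an assumed fair coin).
        heads = sum(rng.random() < 0.5 for _ in range(tosses_per_trial))
        results.append(heads)
    return results


# Five trials of 100 tosses each: same coin, same odds, different counts.
for number, heads in enumerate(toss_trials(5, 100, seed=2009), start=1):
    print(f"Trial {number}: {heads} heads out of 100")
```

Nothing about the coin changes between trials, so any spread in the counts is pure chance. Raising `tosses_per_trial` narrows that spread in proportional terms, but never removes it.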

The null hypothesis

Statisticians use something called a null hypothesis to account for this possibility. The null hypothesis for the A/B test above might be something like this:

The difference in conversion between Version A and Version B is caused by random variation.

It’s then the job of the trial to disprove the null hypothesis. If it does, we can adopt the alternative explanation:

The difference in conversion between Version A and Version B is caused by the design differences between the two.

To determine whether we can reject the null hypothesis, we use certain mathematical equations to calculate the likelihood that the observed variation could be caused by chance. These equations are beyond the scope of this post but include Student’s t test, χ-squared and ANOVA (Wikipedia links given for the eager). Here’s a site that does the calculations for you, assuming a standard A/B conversion test with a clear Yes or No outcome.

Statistical significance

If the arithmetic shows that the likelihood of the result being random is very small (usually below 5%), we reject the null hypothesis. In effect we’re saying “it’s very unlikely that this result is down to chance. Instead, it’s probably caused by the change we introduced” – in which case we say the results are statistically significant. Note that we still can’t guarantee that this is the right interpretation – significance is about proof only beyond reasonable doubt.

Running the calculations on the above data shows that the results aren’t statistically significant: the evidence isn’t strong enough to reject the null hypothesis that the difference in conversion is simply down to luck. The main problem is the small sample size (128 and 108 users respectively), so I would advise the designer, Johann, to repeat the test with more users. Assuming the observed conversion rates didn’t change (a big assumption), a sample size of approximately 200 users per variant should be sufficient for significance. He could then either reject the null hypothesis or the results would remain inconclusive, in which case there’s no evidence the design has made a difference. In Johann’s defence, he recently posted that he takes the point about significance, and I’m looking forward to seeing more conclusive data for this intriguing test.
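For anyone who would rather see the arithmetic than trust a calculator site, here is a rough sketch of one common approach: a two-proportion z-test with a pooled standard error. This is not necessarily the method the linked site uses. The conversion counts (40 of 128 and 44 of 108) are not published figures; I have inferred them from the sample sizes and percentages quoted in this post, so treat them as assumptions. The function name is hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both variants share a single underlying rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts: 40/128 is roughly 31.3% (Version A), 44/108 is roughly 40.7% (Version B).
z, p = two_proportion_z_test(40, 128, 44, 108)
print(f"z = {z:.2f}, p = {p:.3f}")
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

With these assumed counts the p-value comes out well above the usual 5% threshold, which matches the conclusion above: the difference could plausibly be luck.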

Percentage confusion

Significance isn’t the only slippery problem A/B tests face. For starters, quoting conversion improvements is always fraught with difficulty. Since conversion is usually measured in percentages (in this example, 31.3% and 40.7%) there are two ways to quote improvements. We can say that conversions increased by:

  • 9.4% – the difference between the two
  • 30.4% – the amount that 40.7% is bigger than 31.3%*

Any percentage improvement quoted in isolation should be challenged: which of these two calculations has been used? It’s dangerously easy to assume the wrong figure without sufficient context.
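The two calculations are easy to mix up, so here is a small sketch that keeps them apart. The helper names are my own, and the counts (40 of 128 and 44 of 108) are inferred from the sample sizes and percentages quoted in this post rather than taken from published data.

```python
def absolute_lift(rate_a, rate_b):
    """Difference between the two rates, in percentage points."""
    return (rate_b - rate_a) * 100


def relative_lift(rate_a, rate_b):
    """How much bigger rate_b is than rate_a, as a percentage of rate_a."""
    return (rate_b - rate_a) / rate_a * 100


# Assumed counts behind the quoted 31.3% and 40.7% conversion rates.
rate_a = 40 / 128
rate_b = 44 / 108

print(f"Absolute: {absolute_lift(rate_a, rate_b):.1f} percentage points")
print(f"Relative: {relative_lift(rate_a, rate_b):.1f}%")
```

Working from unrounded rates gives an absolute difference a shade above the 9.4 you get by subtracting the rounded percentages, which is the rounding caveat in the footnote. Saying "percentage points" for the first figure and "percent" for the second is one way to make clear which calculation is being quoted.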

The A/B death spiral

A/B tests also suffer from a common quantitative problem, in that they tell us what but not why. I’ve written about this previously in What if the design gods forsake us. It’s wise to back up numerical tests with qualitative evaluation (eg. a guerrilla usability test) so we can make informed decisions if data suggests we need to rethink a design.

Even with backup, sometimes A/B tests are simply the wrong tool for the job. They can provide powerful insight in some cases, but in the wrong place they can be a blind alley or, worse, a weapon of disempowerment. Logical positivism and design don’t mix – not everything we do can be empirically verified – yet some businesses fall back on A/B testing in lieu of genuine design thinking. I call this the “A/B death spiral”, and it plays out something like this:

Designer: Here’s a new design for this screen. You’ll see it has a new navigation style, tweaked colour palette and I’ve moved the main interactions to a tabbed area.
Product owner: Wow, those are pretty big changes for such a high-risk screen. I tell you what: let’s test them individually to see which of these changes works and which doesn’t…

As the proverb suggests, sometimes you can’t jump a twenty foot chasm in two ten foot leaps. Cherry-picking only those design elements that are “proven” by an A/B test can be a route to fragmented, incoherent design. It may earn marginally more money in the short term, but it becomes hard to avoid a descent into poor UX and the long-term harm this causes.

Being faithful to data

Given the potential hazards, I’m concerned about the naïveté with which some designers approach quantitative testing. The world of statistics rewards an honest search for the truth, not dilettantism, and I’d advise any designer moving in statistical circles to pick up some basic stats theory, or at least partner with someone knowledgeable.

A flawed A/B test, be it statistically insignificant, misapplied or misquoted, is nothing more than anecdotal evidence. It’s the same crime as making a website red on the feedback of one user. Yet an impatient designer, seeing the example I quoted above, could quickly jump to a false conclusion: “I should remove arrows from continue buttons: it’s 30.4% better.” Perhaps this designer deserves what he gets. It’s likely he’s only really interested in shortcuts to good UX, and linkbait lists of “Twelve ways to make your site more usable.” Since he understands neither the mathematics nor the context of this trial (timescales, userbase, surrounding task) he will inevitably grab the wrong end of the stick. Nonetheless, he is out there.

Don’t let yourself be that designer.

* subject to rounding

Cennydd Bowles
Q&A: getting into user experience

For the past few years I’ve given an annual talk at UCL to students of the HCI with Ergonomics M.Sc. It’s always a pleasure to share my questionable world view with impressionable minds, and I look forward to the sessions in much the same way as one secretly enjoys a visit from a drunken uncle.

In an effort to make this year’s session a little more interactive, I pulled out an old Knowledge Management set piece:

  1. Distribute post-its.
  2. Ask everyone to write one question they wish they knew the answer to (preferably about the topic at hand).
  3. Stick the post-its on the walls. (It’s surprising how much people group them, despite your invitation to use any of the three free walls.)
  4. Ask everyone to read each post-it.
  5. If they too want to find out the answer to a question, tell them to mark the post-it with a question mark. If they think they have an answer, mark it with a tick.

It’s not that surprising to find that a room of similarly qualified students share similar concerns. What’s more interesting is that many of them can also help to answer each other’s questions.

The purpose of this exercise is of course to show that networking and collaborating are valuable, and not just a case of awkward conversation and limp handshakes. However, having made this slightly facile point, I realised that most of the posted questions were damn smart and deserved to be shared more broadly. So here are a few that were particularly interesting, and some proposed answers from me.

Is the graphic design of a site more important than usability when initially attracting users to the site?
I say yes. Research shows users form an opinion on the credibility of a site within milliseconds of visiting it. To form a valid opinion on usability takes use, which may not happen if those impressions are negative. However, the line between the two is of course blurred, and a site can successfully convey usability through layout, visual design and information hierarchy. There are plenty of other factors that have an impact too: load times, content and proposition spring to mind.

How many hours do you work a week?
Define “work”. I’m paid for 37 hours, and most of that is spent on billable client work. But add in commuting, writing articles and conference talks, mentoring, and reading about my field and it would exceed 60. Yes, I’m aware that’s a little unhealthy. Good thing I enjoy it.

What’s the most useless skill you think we’ll learn from this course?
Probably rifling through academic papers to find an authoritative source that proves or disproves a detailed HCI argument. Truth is, not many people in industry will care. It’s more important to judge the problem at hand and make the right design decisions based on context. HCI theory can give a strong advantage here, but you’ll need to state your case with something more real: usually how your client will make more money by following your advice.

How much do you get paid?
Not telling. But here are some approximate London figures: £25,000 is fair for a graduate-level position, rising to £35–40,000 with a couple of years of experience. Senior people should be looking at £60,000 and up (seven years and above, probably managerial responsibility). Freelance rates typically range between £275-£400/day.

What are the best design tools in HCI?
Thinking, conversation, sketching, software. In that order.

Can you be a good UX designer and a good programmer at the same time?
You can be good at both, yes. But who wants to be just good? Deep specialists tend to be better than jacks-of-all-trades, and only extremely rare superheroes can be world class at both. I do, however, strongly recommend that all designers learn to code to a reasonable standard, and that all developers learn the fundamentals of design. Speaking each other’s language is the easiest way to ensure good designer-developer relationships, and one of the easiest ways to become substantially better at your job in a short time.

Do you need to draw well / be arty to be a user experience designer?
Some drawing talent helps, but sketching well is a skill that can be learned and that comes with practice. Its main value is when communicating with clients – a well-crafted sketch can simply convey more information than a poor one. However, it’s more important to develop a designer’s mindset. As Jason Santa Maria says, “sketchbooks are not about being a good artist, they’re about being a good thinker.”

Cennydd Bowles
EuroIA 09 in review

It’s important to accrue tactics to cope with the disruption of travelling. Quick currency conversions, self-conscious squints at unfamiliar coins, departure lounge distractions (ask Alain de Botton). In Scandinavia, I’ve learned to open clearly with “Hello” to announce myself as a foreigner, since the local salutation “Hej” is a homophone with informal English equivalents.

Copenhagen, site of EuroIA 2009, and Malmö, where my evening sofa awaited, share more than greetings, efficiency and cost of living. They are joined by the 7.8km Öresund Bridge, a zoetrope giving glimpses of distant wind turbines in the water.

This sense of mutual destiny – two nations connected by a single structure – feels entirely European. EuroIA was similarly interwoven with shared experiences of linguistically awkward networking and untold cultural unity. The sessions ranged from poor to intriguing (I’m still no fan of the blind review process) but there was something of a BarCamp atmosphere of willing each other to succeed. EuroIA is a gathering of the underdogs, feisty and proud, and it doesn’t have to be the way they write it in the States.

I particularly enjoyed Joe Lamantia‘s peek into the architecture of fun, Sylvie Daumal‘s struggle for acceptance in a hostile environment, and Andrea Resmini‘s intricate analysis of how IA can bridge the real and digital worlds. Perhaps it was a shame that these sessions were book-ended by an American keynote and closer. Their sessions were undoubtedly interesting, but I hope to see a European presence in these elevated slots next year.

My talk The Future Of Wayfinding seemed to be well received. The topic fitted well with the conference theme of Beyond Structure. Topics such as the Semantic Web, ubiquitous computing and what I can only clumsily label ‘unhierarchy’ were prevalent, and I fully expect them to be reflected in next spring’s US circuit.

Next year we visit Paris, capital of a country almost entirely oblivious to user experience work. It seems we Europeans really do pull together in the face of a challenge.

Cennydd Bowles
dConstruct 09 in review

After you build forty or fifty websites there really isn’t any magic in it.

dConstruct’s comfortable niche as the thinking person’s web conference was quickly disrupted by Adam Greenfield’s early remarks. Decrying web and UX design is a risky strategy in a room made largely of web designers and developers, yet it was a thought entirely consistent with our theme of Designing For Tomorrow. The phrase wrapped topics that have been of recent interest to us Clearlefties: ubicomp, gestural interfaces, networked devices and what lies beyond our familiar digital horizons.

Adam led us into a world where information is omnipresent and persistent, where actions stick to identities and the presentation of self is a largely forgotten luxury. A world where objects become services, shared not owned, implies a post-capitalist swing perhaps alluded to by recent economic events. As a recent and voracious reader of Everyware, I was thrilled by Adam’s talk. I’m sure the imminent podcast will reward careful re-evaluation.

Mike Migurski provided a practical counterpoint with a case history of Stamen’s information design work, with subsequent colour commentary by Ben Cerveny. Ben’s dense, rapid idea stream was perhaps a step too far after such an analytical opening; although Stamen’s work is undeniably excellent, many felt a gap between the metaphysics and the design output, and some of Ben’s more elaborate statements seemed hard to grasp.

Brian Fling explored the mobile field with characteristic flair and pace. Focusing on the future lives of the post-millennials native to the digital age, Brian proposed that history will judge the mobile (and the iPhone in particular) as the flying car we have been waiting for. We are living through a second industrial revolution, based on the portable, personal power of bringing people closer through technology.

Next up, an elaborate Gaia theory of sci-fi and interaction from Nathan Shedroff and Chris Noessel. In an entertaining presentation, the over-used Minority Report example was only (multi)touched upon once, and Jurassic Park’s ridiculous UNIX scene was rightly used for cheap laughs. Of particular interest was the pair’s evidence that anthropomorphism can exist at non-visual levels (consider R2D2’s bleeps and Amazon 1-click servant), although, like Ben before, some other claims seemed rather hazier.

Robin Hunicke, known for her work on “the Maslow’s Hierarchy game known as The Sims”, unfortunately alienated her audience with a spoiler (albeit well meaning) for a film still on general release, and struggled to recover favour. Her West Coast bubbliness sat awkwardly at odds with her academic subject matter, which was coincidentally recapped by August De Los Reyes. Any Microsoft speaker knows he has an uphill battle to win over a sceptical audience; fortunately August’s self-deprecating humour was an instant hit. We imbue objects with intelligence (slide rules, other technological tools), so why not emotion too? Heartbroken families insist on the repair, not replacement, of their Roombas – can we conjure similarly powerful dynamics in the systems we create? August closed with Office Labs’ concept video, a surprisingly rousing vision that raised hairs on necks across the Dome.

The stage was set for a wonderful denouement from Russell Davies, who produced a performance straight from the traditions of British music hall. Russell predicted that digital buildings will give us “Blade Runner brought to you by the makers of Cillit Bang”, and that as technology matures the only way we will escape cliché is to redomain, appropriating ideas from other fields. Russell provided a marvellous reminder that, despite the intelligent contributions of the day, as an industry we are prone to hubris. We’d be daft to disregard the marvellous infrastructure our media predecessors have created.

At its best, the fifth dConstruct was simply outstanding. In its rare low points, it disappointed. As such, it’s at a crossroads. The trend has certainly been cerebral, and this year’s theme certainly encouraged abstract exploration. Early feedback says our audience is happy with this, and that the differentiation from other conferences is an important part of dConstruct’s appeal. Yet there’s always a danger of vanishing into pretension, and the conference must of course appeal to 700+ attendees.

I’m sure Clearleft won’t be taking any snap decisions. dConstruct has become part of the fabric of our company and hopefully the annual schedule, and, in line with our chosen theme for the year, we’ll be thinking carefully about what happens next. I’d love to hear your thoughts on the day and your preferred direction for dConstruct 2010.

Photos: Matt Biddulph, FriiSpray, Tom Jenkins.

Cennydd Bowles
Sweating the small stuff

Outrage. Ikea recently switched corporate typeface, moving from Futura to Verdana across all their marketing, including their printed catalogue and ads.

To typography enthusiasts, this is like Mozart announcing a kazoo concerto. Futura is a type classic, skilfully designed by a master craftsman and demonstrating real artistry. It’s excellent for distinctive identity and brand work – so much so that Ikea had practically made it their own until now.

Verdana was created to act as body text on low resolution computer monitors. And it’s well designed for that purpose, but it doesn’t suit print work or any size above petite. At large sizes it looks plain fugly, with characters that appear juvenile at best. Use of Verdana in this way definitely constitutes bad typography.

The slight is all the greater coming from a company that has, to an extent, brought design into the lives of many people who previously believed it was the domain of turtlenecked pseuds.

Ikea’s reason was ostensibly to ensure consistent use of fonts across web and print platforms, and to ensure global compatibility across all languages. A strange choice, given that Verdana has notable deficiencies in its character set. However, it’s possible that Ikea isn’t as naive as we think. My colleague Paul Lloyd hypothesises that the switch is a deliberate ploy to make the company appear less expensive. It’s an old strategy: cheapen the aesthetic and the perception of price goes down. Plausible, at least.

By all means we can point, laugh and lament the lack of design skill at the company. However, some of the outrage has been ridiculous, particularly since we can never truly know the reasons behind the choice. Hell, there’s even a petition to reverse the change.

I believe that if companies make bad design choices that’s their prerogative. If I worked for Ikea, I would have fought tooth and nail to dissuade them from this choice – but no, I won’t sign a petition. Let them eat cake, and if design is as important as we say it is, the market will prove their mistake.

Herein lies my bemusement at the design community’s reaction. Behind the indignation, does any of us really believe that this typographic gaffe will affect Ikea’s sales? Is it really as egregious an error as we make out? Or are we merely acting out the stereotype designers fight so hard to shake off: the aforementioned turtlenecked pseud complaining that their soup isn’t hot enough?

Typography matters. Used well, it can elevate communication in astonishing ways. But, as Aegir points out, there are bigger design challenges facing Ikea and indeed the global manufacturing industry than choice of corporate typeface.

Design is about sweating the big stuff; hopefully even changing the world. Often that involves the small stuff too, but focus solely on the trivia and it’s hard to avoid becoming trivial yourself.

Cennydd Bowles
Lessons from UXCampLondon

Since Saturday’s UXCampLondon I’ve been thinking about what I took from the experience.

One

The devil is in the details. With such a discerning audience, we had to offer something well run and as seamless as possible. We succeeded, thanks to accurate estimation of various factors including no shows, time between sessions, budgets, and the apparently inevitable delay caused by a GPS-less taxi driver. This attention to detail was entirely down to the commitment of our wonderful volunteers, upon whom I relied to orchestrate the minutiae. Delegation was my preferred tactic, as noted by Johanna in her closing notes.

Two

You can’t live blog a conference you’re running.

Three

There’s something about user experience designers. We took an early decision that UXCampLondon would be a one-dayer since the field is generally slightly older, more interested in spending a Sunday with their family than slumming it on an office floor. This upset a few purists (“It’s not a BarCamp if you don’t stay over!”) but was indisputably the right choice.

Many people commented that UXCampLondon had a unique atmosphere: enthusiastic, yet mature and urbane compared with the (admittedly enjoyable) rough bluster of most BarCamps. It further convinced me that user experience folk are my people: highly likeable but intelligent and well balanced; opinionated yet open to alternative views.

Four

Free alcohol cures all ills.

Five

The best lessons are often hidden. In some ways, I didn’t get that much from UXCampLondon because my mind was always elsewhere and I attended few sessions. But that overlooks the other benefits I took from the day. In particular, I got further proof of the growing strength of our community, and further experience in handling difficult situations (we had plenty).

A couple of people have asked if I’m planning a sequel. It’s possible, but not for a while. I’m taking some time off, and I’m sure there are many other people well suited to running UXCampLondon2.

Thanks to our volunteers, our supporters and of course all the attendees for making UXCampLondon a success.

Photos: Rob Enslin and Adam Charnock.

Cennydd Bowles