I’d been looking forward to Rochelle King writing a book about using data to inform design (a few years ago I wrote about using data to inform product decisions, a post that followed a great conversation with Rochelle).
Earlier this year, Rochelle published “Designing with Data: Improving the User Experience with A/B Testing”, together with Elizabeth F. Churchill and Caitlin Tan. The main theme of the book is the authors’ belief that data capture, management, and analysis are the best way to bridge design, user experience, and business relevance:
- Data aware — In the book, King, Churchill and Tan distinguish between three different ways to think about data: data driven, data informed and data aware (see Fig. 1 below). The third mindset, ‘data aware’, is the authors’ own addition: “In a data-aware mindset, you are aware of the fact that there are many types of data to answer many questions.” If you are aware that there are many kinds of problem solving to answer your bigger goals, then you are also aware of all the different kinds of data that might be available to you.
- How much data to collect? — The authors make an important distinction between “small sample research” and “large sample research”. Small sample research tends to be good for identifying usability problems, because “you don’t need to quantify exactly how many in the population will share that confusion to know it’s a problem with your design.” It reminded me of Jakob Nielsen’s point about how the best results come from testing with no more than 5 people. In contrast, collecting data from a large group of participants, i.e. large sample research, can give you more precise quantity and frequency information: how many people feel a certain way, what percentage of users will take this action, etc. A/B tests are one way of collecting data at scale, with the data being “statistically significant” and not just anecdotal. Statistical significance is the likelihood that the difference in conversion rates between a given variation and the baseline is not due to random chance (see the first sketch after this list).
- Running A/B tests: online experiments — The book does a great job of explaining what is required to run A/B tests online successfully, providing tips on how to sample users online and key metrics to measure (Fig. 2).
- Minimum Detectable Effect — There’s an important distinction between statistical significance — which measures whether there’s a difference — and “effect”, which quantifies how big that difference is. The book explains how to determine the “Minimum Detectable Effect” when planning online A/B tests. The Minimum Detectable Effect is the minimum effect we want to observe between our test condition and control condition in order to call the A/B test a success. It can be positive or negative, but you want to see a clear difference in order to be able to call the test a success or a failure (see the second sketch after this list).
- Know what you need to learn — The book covers hypotheses as an important way to figure out what you want to learn through the A/B test and to identify what success will look like. In addition, you can look at learnings beyond the outcomes of your A/B test (see Fig. 3 below).
- Experimentation framework — For me, the most useful section of the book was Chapter 3, in which the authors introduce an experimentation framework that helps you plan your A/B test in a more structured fashion (see Fig. 4 below). They describe three main phases — Definition, Execution and Analysis — which make up the experimentation framework. The ‘Definition’ phase covers the definition of a goal, the articulation of a problem / opportunity and the drafting of a testable hypothesis. The ‘Execution’ phase is all about designing and building the A/B test; “designing to learn”, in other words. In the final ‘Analysis’ phase you get answers from your experiments. These results can be either “positive” and expected or “negative” and unexpected (see Fig. 5–6 below).
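To make the statistical significance point concrete, here is a minimal sketch (my own, not from the book) of a two-proportion z-test in Python, using only the standard library; the conversion counts are made up for illustration:

```python
import math

def two_proportion_z_test(conversions_a, n_a, conversions_b, n_b):
    """Two-sided z-test: is the difference between two conversion rates
    likely to be due to random chance?"""
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    # Pooled conversion rate under the null hypothesis of "no difference".
    p_pool = (conversions_a + conversions_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value via the standard normal CDF (expressed with erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Baseline: 100 conversions out of 1,000 users; variant: 125 out of 1,000.
z, p = two_proportion_z_test(100, 1000, 125, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # z = 1.77, p = 0.077: not significant at 0.05
```

Note that a 25% relative lift still isn’t significant here, which is exactly why sample size matters.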
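And a companion sketch (again my own, with illustrative numbers) showing how the Minimum Detectable Effect drives the sample size you need per variant; this uses the standard two-proportion power calculation rather than anything specific to the book:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, mde, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect an absolute lift of
    `mde` over `baseline_rate` with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p1, p2 = baseline_rate, baseline_rate + mde
    # Sum of the variances of the two Bernoulli outcomes.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)

# Example: 10% baseline conversion; the smallest lift worth acting on is
# 2 percentage points.
print(sample_size_per_variant(0.10, 0.02))  # 3839 users per variant
```

The smaller the effect you need to detect, the more users you need, which is why settling on an MDE up front is part of planning the test, not analysing it.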
Main learning point: “Designing with Data” made me realise again how much thinking and designing needs to happen before running a successful online A/B test. “Successful” in this context means achieving clear learning outcomes. The book provides a comprehensive overview of the key considerations to take into account in order to optimise your learning.
Fig. 1 — Three ways to think about data — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, pp. 3–9
- Data driven — With a purely data driven approach, it’s data that determines the fate of a product; based solely on data outcomes, businesses can optimise continuously for the biggest impact on their key metric. You can be data driven if you’ve done the work of knowing exactly what your goal is, and you have a very precise and unambiguous question that you want answered.
- Data informed — With a data informed approach, you weigh up data alongside a variety of other variables such as strategic considerations, user experience, intuition, resources, regulation and competition. So adopting a data-informed perspective means that you may not be as targeted and directed in what you’re trying to understand. Instead, what you’re trying to do is inform the way you think about the problem and the problem space.
- Data aware — In a data-aware mindset, you are aware of the fact that there are many types of data to answer many questions. If you are aware there are many kinds of problem solving to answer your bigger goals, then you are also aware of all the different kinds of data that might be available to you.
Fig. 2 — Generating a representative sample — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, pp. 45–53
- Cohorts and segments — A cohort is a group of users who have a shared experience. Alternatively, you can segment your user base into different groups based on more stable characteristics such as demographic factors (e.g. gender, age, country of residence), or you may want to group users by their behaviour (e.g. new user, power user); see the sketch after this list.
- New users versus existing users — Data can help you learn more about both your existing and your prospective future users, and determining whether you want to sample from new or existing users is an important consideration in A/B testing. Existing users are people who have prior experience with your product or service. Because of this, they come into the experience with a preconceived notion of how your product or service works. Thus, it’s important to be careful about whether your test is with new or existing users, as these learned habits and behaviours about how your product used to work can bias your A/B test.
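As a small illustration of the cohort/segment distinction, here is a sketch of my own (the field names and thresholds are made up): a cohort groups users by a shared experience such as signup week, while a segment groups them by a more stable characteristic such as engagement level.

```python
from datetime import date

users = [
    {"id": 1, "signup": date(2017, 3, 6), "sessions_last_30d": 42},
    {"id": 2, "signup": date(2017, 9, 4), "sessions_last_30d": 2},
]

def cohort(user):
    """Cohort: a group sharing an experience, here the week of signup."""
    year, week, _ = user["signup"].isocalendar()
    return f"signup-{year}-W{week:02d}"

def segment(user):
    """Segment: a more stable characteristic, here behavioural."""
    return "power user" if user["sessions_last_30d"] >= 20 else "casual user"

for u in users:
    print(u["id"], cohort(u), segment(u))
# 1 signup-2017-W10 power user
# 2 signup-2017-W36 casual user
```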
Fig. 3 — Know what you want to learn — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, p. 67
- If you fail, what did you learn that you will apply to future designs?
- If you succeed, what did you learn that you will apply to future designs?
- How much work are you willing to put into your testing in order to get this learning?
Fig. 4 — Experimentation framework — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, pp. 83–85
- Goal — First you define the goal that you want to achieve; usually this is something that is directly tied to the success of your business. Note that you might also articulate this goal as an ideal user experience that you want to provide. This is often because you believe that delivering that ideal experience will ultimately lead to business success.
- Problem/opportunity area — You’ll then identify an area of focus for achieving that goal, either by addressing a problem that you want to solve for your users or by finding an opportunity area to offer your users something that didn’t exist before or is a new way of satisfying their needs.
- Hypothesis — After that, you’ll create a hypothesis statement, which is a structured way of describing the belief about your users and product that you want to test. You may pursue one hypothesis or many concurrently.
- Test — Next, you’ll create your test by designing the actual experience that represents your idea. You’ll run your test by launching the experience to a subset of your users.
- Results — Finally, you’ll end by getting your users’ reactions to your test and analysing the results. You’ll take these results and make decisions about what to do next (one way to capture these framework elements as a structured record is sketched below).
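One lightweight way to apply the framework is to write its elements down as a structured record before building anything. A sketch under my own assumptions (the field names and example values are mine, not the book’s):

```python
from dataclasses import dataclass

@dataclass
class ExperimentPlan:
    """The framework's elements, written down before any test is built."""
    goal: str                    # outcome tied to business/experience success
    problem_or_opportunity: str  # area of focus for achieving the goal
    hypothesis: str              # the belief about users/product under test
    test_design: str             # the experience that represents your idea
    sample: str                  # which subset of users will see it
    results: str = "pending"     # filled in during the Analysis phase

plan = ExperimentPlan(
    goal="Grow weekly active users",
    problem_or_opportunity="New users struggle to discover relevant content",
    hypothesis="Personalised recommendations on the home screen will "
               "increase week-1 retention for new users",
    test_design="Variant shows a personalised shelf at the top of the home screen",
    sample="10% of new users",
)
print(plan.hypothesis)
```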
Fig. 5 — Expected (“positive”) results — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, pp. 227–228
- How large of an effect will your changes have on users? Will this new experience require any new training or support? Will the new experience slow down the workflow for anyone who has become accustomed to the current experience?
- How much work will it take to maintain?
- Did you take any “shortcuts” in the process of running the test that you need to go back and address before you roll it out to a larger audience (e.g. edge cases or fine-tuning details)?
- Are you planning on doing additional testing and if so, what is the time frame you’ve established for that? If you have other large changes that are planned for the future, then you may not want to roll your first positive test out to users right away.
Fig. 6 — Unexpected and undesirable (“negative”) results — Taken from: Rochelle King, Elizabeth F. Churchill and Caitlin Tan — Designing with Data. O’Reilly 2017, pp. 228–231
- Are they using the feature the way you think they do?
- Do they care about different things than you think they do?
- Are you focusing on something that only appeals to a small segment of your user base, rather than the majority?
Related links for further learning:
- https://www.ted.com/watch/ted-institute/ted-bcg/rochelle-king-the-complex-relationship-between-data-and-design-in-ux
- http://andrewchen.co/know-the-difference-between-data-informed-and-versus-data-driven/
- https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/
- https://vwo.com/ab-split-test-significance-calculator/
- https://www.kissmetrics.com/growth-tools/ab-significance-test/
- https://select-statistics.co.uk/blog/importance-effect-sample-size/
- https://www.optimizely.com/optimization-glossary/statistical-significance/
- https://medium.com/airbnb-engineering/experiment-reporting-framework-4e3fcd29e6c0
- https://medium.com/@Pinterest_Engineering/building-pinterests-a-b-testing-platform-ab4934ace9f4
- https://medium.com/airbnb-engineering/https-medium-com-jonathan-parks-scaling-erf-23fd17c91166