My Work and Side Projects

What I've learned from A/B testing

Key Achievements

I created an experimentation roadmap, coordinating across the marketing, design, engineering, and science teams, and successfully ran 10+ A/B tests across the purchase journey from 2023 to 2024.
I spearheaded the development of an A/B test performance dashboard in partnership with the BI team to scale A/B test analysis and enable self-service analysis across 4 product teams.
I led an SVP-level experimentation sharing session and 2 team-level sessions (200+ attendees in total) to promote a culture of experimentation and best practices, working with Bar Raisers and the Science team.

These are just my personal learnings and experiences. This isn’t advice or a how-to. Every experiment is different, always design and test for your purpose and do your own research! 🙂 Happy testing!

Planning & Design

Define a Clear Hypothesis

Clearly articulate what you're testing and be specific about what you expect to happen. One hypothesis should generally map to one treatment so that the impact can be measured. Example: "Changing the CTA button color from blue to green will increase click-through rate."
What did I learn?
From my experience, I always advocate running a single-treatment test, simply because the traffic to our page is not large enough to produce significant results for multiple treatments, which would require a longer run time to reach statistical significance.
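To make the CTA example above concrete, here is a minimal sketch of how the click-through rates of the two variants could be compared with a two-proportion z-test. The counts are hypothetical and the test choice is an assumption; your science team may prefer a different method.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test comparing two click-through rates."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled rate under the null hypothesis of no difference
    p_pool = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: control (blue button) vs. treatment (green button)
z, p = two_proportion_ztest(480, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With a single treatment, all traffic is split between just two arms, which is exactly why one hypothesis per test reaches significance faster.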

Identify a Primary Metric

Pick one primary success metric (e.g., conversion rate, click-through rate) to avoid formulating a hypothesis after seeing the data. It is okay to have secondary metrics and guardrail metrics to help support your analysis, but we should always define one success metric before the experiment.
What did I learn?
If we set up a guardrail metric, it is important to know that the guardrail metric also needs to hold (i.e., not regress) for us to call the experiment a success.

Segment Your Audience Carefully

Use random assignment and ensure each user sees only one variant (A or B).
If the test targets a specific segment (e.g., customers on a certain device, returning customers, customers with a certain behavior), we should also call that out explicitly.
What did I learn?
I once designed an experiment for a specific group of people because they were the audience I wanted to learn from, and excluding everyone else reduced noise. To determine which group of people to include in your test, I would recommend thinking about size, potential impact, and risk.
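One common way to implement the "each user sees only one variant" rule is deterministic hash-based bucketing. The sketch below is a minimal illustration, not a production assignment service; the experiment name and user IDs are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user: the same user always gets the same variant."""
    # Salting with the experiment name keeps buckets independent across experiments
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# The assignment is stable across calls, sessions, and servers
print(assign_variant("user-123", "cta-color"))
```

Because the assignment is a pure function of the user and experiment IDs, no assignment table needs to be stored, and the split stays roughly 50/50 over a large population.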

Determine Sample Size & Test Duration Ahead of Time

Use power analysis to determine the sample size (i.e., how many customers will fall into the experiment) and test duration, to avoid underpowered tests and premature conclusions. Generally, a smaller sample size requires a longer duration to reach statistical significance. Working with your science team or using tools like Evan Miller's calculator can help.
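For intuition, here is a rough sketch of the standard normal-approximation formula for a two-proportion test, the same kind of calculation tools like Evan Miller's calculator perform. The baseline rate and minimum detectable effect below are made-up inputs.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(p_base, mde_abs, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-sided two-proportion test."""
    p_alt = p_base + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for desired power
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    n = (z_alpha + z_beta) ** 2 * variance / mde_abs ** 2
    return ceil(n)


# Hypothetical inputs: 5% baseline conversion, detect an absolute lift of 1 point
n = sample_size_per_variant(p_base=0.05, mde_abs=0.01)
print(n)
# Dividing n by expected daily traffic per variant gives a minimum test duration
```

Note how the required sample size grows quadratically as the minimum detectable effect shrinks; this is why low-traffic pages need long-running, single-treatment tests.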

Account for Seasonality & External Events

Avoid running tests during holidays, promotions, or outages unless intentionally testing for those effects.
What did I learn?
After consulting with a Bar Raiser and the Science team, I once ran an experiment during a peak promotion period, precisely to take advantage of the traffic the promotion would bring to our page. We noted this caveat in our experimentation document so that we would not mislead others or introduce bias.

Prelaunch

Ensure Instrumentation Accuracy

Double-check your event tracking to prevent data loss or mislabeling.
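One sanity check related to instrumentation, commonly run once traffic starts flowing, is a sample ratio mismatch (SRM) test: if the observed split differs significantly from the planned split, tracking is likely broken. The sketch below is a simple normal-approximation version with hypothetical counts; the 0.001 threshold is an assumption, chosen strict because many metrics are monitored.

```python
from math import sqrt
from statistics import NormalDist


def sample_ratio_mismatch(n_a: int, n_b: int, expected_ratio: float = 0.5) -> bool:
    """Return True if the observed A/B split deviates suspiciously from the plan."""
    total = n_a + n_b
    observed = n_a / total
    se = sqrt(expected_ratio * (1 - expected_ratio) / total)
    z = (observed - expected_ratio) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # A very low p-value suggests broken tracking or biased assignment, not chance
    return p < 0.001


print(sample_ratio_mismatch(10_000, 10_300))
```

If this check fires, pause the analysis and debug the event pipeline before trusting any metric movement.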

Post-experiment

Document and Share Learnings

Include the hypothesis, results, metrics, screenshots, and next steps in a centralized knowledge base.


Any thoughts or feedback? I’d love to hear from you!

