Email A/B Testing: What to Test, in What Order, and How to Read Results

Arjun Mehta
Senior Growth Strategist · Reviewed by the GrowwithBA team
EMAIL & SMS · 5 MIN READ · Updated June 2026
THE SHORT ANSWER

Email A/B testing guide: the test priority ladder, sample-size honesty, metrics that matter post-privacy, and turning wins into a playbook.

Most email 'testing' is coin-flipping with extra steps: tiny samples, opens as the verdict, and no record of what won. Real testing compounds — each result feeds a playbook that makes every future send smarter.

Here's the email testing system: what to test first, how to judge it, and how to keep the learning.

Key takeaways

  • Test in impact order: offer and core message, then subject line and preview text, then structure and CTA, then send timing, then cosmetics.
  • Privacy-era opens are inflated — judge tests on clicks and revenue per recipient wherever possible.
  • Sample size and patience make or break a test: small lists need bigger differences and repeated confirmation.
  • The deliverable is a playbook of confirmed patterns — untracked wins are just anecdotes.

The priority ladder

Test big levers before small ones. First: the offer and core message — what you're actually proposing moves results more than how it's dressed. Second: subject line and preview text as a unit, since they gate everything. Third: email structure — long versus short, single CTA versus multiple, plain-text feel versus designed. Fourth: send timing and cadence for your list's rhythm. Last: button colors and image swaps — the classic first tests that belong dead last because their effects are usually noise.

Judge honestly

Pick the success metric before sending: clicks for engagement tests, conversion or revenue per recipient for offer tests, and opens only for subject-line tests, and even then with skepticism about privacy inflation. Split the list randomly and test one variable at a time. Size samples realistically: expecting a small list to detect a small difference is statistical fiction, so test bigger swings, or run the same test across multiple sends and look for consistency. A result that flips on rerun was never a result. Flows deserve testing too: welcome and cart sequences accumulate volume that one-off campaigns can't.
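To make "split randomly" and "couldn't plausibly be luck" concrete, here is a minimal sketch in Python. It assumes you can export sends and clicks per variant from your email platform. The function names (`assign_variant`, `two_proportion_z_test`) are illustrative, not part of any tool, and the pooled two-proportion z-test is one common way to compare click rates, not the only one.

```python
import hashlib
import math


def assign_variant(email: str, test_name: str) -> str:
    """Assign a recipient to A or B using a hash of test name + address.

    Hashing gives an effectively random but repeatable split: the same
    address always lands in the same variant for a given test.
    """
    key = f"{test_name}:{email.lower()}".encode()
    digest = hashlib.sha256(key).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def two_proportion_z_test(clicks_a, sent_a, clicks_b, sent_b):
    """Return (z, two-sided p-value) for the difference in click rates."""
    rate_a = clicks_a / sent_a
    rate_b = clicks_b / sent_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (sent_a + sent_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if se == 0:
        raise ValueError("No variation in the data; nothing to test.")
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same 3% vs 4% click rates, very different evidence (hypothetical numbers).
_, p_small_list = two_proportion_z_test(15, 500, 20, 500)
_, p_large_list = two_proportion_z_test(600, 20_000, 800, 20_000)
```

With these hypothetical numbers, the 500-per-variant test does not clear a conventional 0.05 threshold even though B "won" by a full point, while the 20,000-per-variant test does. That is the small-list problem in miniature.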

Bank the learning

Keep a simple test log: hypothesis, variants, sample, metric, result, decision. Patterns emerge across entries (your list's preference for direct subjects, short emails, Tuesday sends, whatever the data keeps saying), and those patterns become the house playbook that new campaigns start from instead of re-litigating settled questions. Re-test foundational findings yearly; lists evolve. The compounding is the point: programs that log tests get smarter every quarter, while programs that don't log them run the same experiments forever.
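A spreadsheet is enough for the log, but if you prefer something scriptable, a sketch like the following would work. The six fields mirror the list above. The `run_on` date is an added assumption so the yearly re-test can be automated, and the names `ExperimentLogEntry`, `due_for_retest`, and `to_csv` are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date, timedelta


@dataclass
class ExperimentLogEntry:
    run_on: date              # assumption: not in the original field list
    hypothesis: str
    variants: str
    sample_per_variant: int
    metric: str
    result: str
    decision: str


def due_for_retest(entries, today, max_age_days=365):
    """Return entries older than max_age_days (yearly by default)."""
    cutoff = today - timedelta(days=max_age_days)
    return [entry for entry in entries if entry.run_on < cutoff]


def to_csv(entries):
    """Serialize the log to CSV text so it can live in a shared sheet."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(ExperimentLogEntry)]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Calling `due_for_retest(log, today=date.today())` once a quarter gives you the list of findings old enough to deserve a rerun.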

Common mistakes that quietly kill results

These come straight from audits we run every week. If any of them stings, you’re in good company — and the fix is usually faster than you think.

No plain-text-feeling sends. Heavily designed emails scream 'marketing.' A short, plain note from the founder converts shockingly well for winbacks and high-AOV nudges. Test one this month.

Discount-only retention. If every email is a coupon, you've taught customers to wait for one. Mix in usage content, restock alerts, reviews, and founder notes — the brands with the best LTV send value 60% of the time.

Ignoring deliverability until it breaks. Sunset unengaged profiles after 120-180 days. A smaller list that opens beats a big list in spam — and once Gmail flags you, the climb back takes months.

Designing for desktop. 60-75% of opens are mobile. If your hero image is the message and it lazy-loads on a slow connection, you said nothing. Lead with text, single column, buttons at least 44px.

FROM THE TRENCHES

One client's abandoned-cart flow converted at 4.1%. We added a second email with three customer reviews and a photo, nothing else. 6.8%. The discount they were planning would have cost more and converted less.

Quick checklist before you ship

  • Segments: at minimum engaged-90, lapsed, VIP by spend
  • Welcome flow: 4+ emails, first one inside 5 minutes of signup
  • Every campaign has one job and one primary CTA
  • Flows audited this quarter — links, products, offers all current
  • Abandoned cart: 3 touches at 1h / 24h / 72h, second one includes social proof
  • Mobile preview checked on an actual phone before send
  • Revenue per recipient tracked, not just open rate
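The abandoned-cart timing in the checklist can be written down as data so it is easy to audit. This is only a sketch: the 1h / 24h / 72h delays and the social-proof second touch come from the checklist, while the `reminder` and `last_call` labels and the function name are placeholders we made up for illustration.

```python
from datetime import datetime, timedelta

# Delays and the social-proof second touch follow the checklist above;
# the other content labels are hypothetical placeholders.
CART_TOUCHES = [
    {"delay": timedelta(hours=1), "content": "reminder"},
    {"delay": timedelta(hours=24), "content": "social_proof"},
    {"delay": timedelta(hours=72), "content": "last_call"},
]


def schedule_cart_touches(abandoned_at: datetime):
    """Return (send_time, content) pairs for one abandoned cart."""
    return [
        (abandoned_at + touch["delay"], touch["content"])
        for touch in CART_TOUCHES
    ]
```

Most email platforms configure this in a visual flow builder rather than in code, so treat the structure as a reference for what the flow should contain.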

Frequently asked questions

What should I test first in email?

The thing furthest from certain with the biggest stakes — usually the offer framing or core value proposition, not the subject line everyone defaults to.

How big a sample do I need for a valid email test?

Enough that the winner's margin couldn't plausibly be luck — thousands per variant for modest differences. Small lists: test bold differences and confirm across repeated sends.
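For a rough number before you send, a standard two-proportion sample-size approximation can be sketched as follows. The 5% significance level and 80% power defaults are conventional assumptions rather than anything specific to email, and `sample_size_per_variant` is an illustrative name.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate recipients needed per variant to detect the difference
    between two rates with a two-sided test."""
    if baseline_rate == expected_rate:
        raise ValueError("Rates must differ to compute a sample size.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = (
        baseline_rate * (1 - baseline_rate)
        + expected_rate * (1 - expected_rate)
    )
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


modest = sample_size_per_variant(0.03, 0.035)  # half-point lift on 3%
bold = sample_size_per_variant(0.03, 0.06)     # doubling the click rate
```

Under these assumptions, detecting a half-point lift on a 3% click rate takes roughly twenty thousand recipients per variant, while detecting a doubling takes fewer than a thousand. That gap is why small lists should test bold differences.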

Are opens useless for testing now?

Not useless — directional, and still the right metric for subject-line tests. Just confirm meaningful decisions with click and revenue data, since privacy proxies inflate opens unevenly.

Arjun Mehta

Senior Growth Strategist at GrowwithBA. 12 years running SEO, paid media, and retention for ecommerce and SaaS brands from $1M to $100M+. Every guide here comes from live client work — not theory.
