Test your FB ads with slot machines 🎰

Lately, there's been a lot of talk about how to test FB ads. Here's my take using multi-armed bandits 🎰

Recently, I've been testing all our new Facebook ads in a testing CBO at ~10%-15% of our budget. The idea is to filter out bad ads, find the good ads and then scale the good ads in a separate "production" campaign.

If you listened to episode 4 of Triple Whale's ad spend podcast you know what I'm talking about.

However, I'm not sure this is the best way to leverage CBOs or do testing.

Think about it. If you have multiple ideas for an ad, how do you test which ad is best?

The logical thing to do is allocate your budget evenly (like 50/50) across all your ideas and then allocate all the budget towards the best-performing idea when you find a winner. This is essentially what A/B testing is.

But the problem with this solution is that you spend a lot of money on inferior ads along the way. You do find the best-performing idea, but every dollar that went to the losers is the price you paid to get there.
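
To put rough numbers on that, here's a quick Python sketch. Everything in it is made up for illustration: the function name, the $1,000 budget, the $1 cost per click and the two conversion rates. In real life you don't know the rates up front (that's why you test), so read this as a back-of-the-envelope look at what an even split costs, not a forecast.

```python
def even_split_cost(budget, cost_per_click, conversion_rates):
    """Expected conversions lost by splitting the budget evenly across ads
    instead of spending all of it on the best ad.

    All inputs are hypothetical; real conversion rates are unknown
    until you have tested.
    """
    total_clicks = budget / cost_per_click
    clicks_per_ad = total_clicks / len(conversion_rates)

    # Even split: every ad gets the same number of clicks.
    even_split = sum(clicks_per_ad * rate for rate in conversion_rates)

    # Hindsight: every click goes to the ad with the highest rate.
    best_only = total_clicks * max(conversion_rates)

    return best_only - even_split


# Two ads, one converting at 2% and one at 4%, tested 50/50.
lost = even_split_cost(budget=1000, cost_per_click=1.0,
                       conversion_rates=[0.02, 0.04])
print(round(lost, 2))  # 10.0 conversions given up out of a possible 40
```

The bigger the gap between your ads and the longer you keep the split even, the more you give up.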

Wouldn't it be better to continuously decrease your spending on the bad-performing ads and increase spending on the best-performing ads?

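This is called the multi-armed bandit (MAB) problem in probability theory. The name comes from a gambler facing a row of slot machines ("one-armed bandits") and deciding which ones to keep playing. The MAB problem applies when you have a fixed resource (like an ad budget) that you want to allocate between competing options (like ads or ad sets).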

Here you have a trade-off between spending your money on acquiring new knowledge (testing ads) versus exploiting existing knowledge (spending money on ads that perform well).
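
If you want to see that trade-off in action, here's a minimal Python sketch of one common bandit strategy, Thompson sampling. To be clear, this is my own toy simulation, not how Facebook or any other platform actually allocates budget. The function name and the conversion rates are hypothetical, and each "round" stands in for one impression that either converts or doesn't.

```python
import random


def thompson_allocate(true_rates, rounds, seed=0):
    """Simulate Thompson sampling over ads with hidden conversion rates.

    Returns how many impressions each ad received. `true_rates` is a
    made-up input used only to simulate conversions; the algorithm
    itself never looks at it when choosing an ad.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    wins = [0] * n    # conversions seen per ad
    losses = [0] * n  # non-conversions seen per ad
    shown = [0] * n   # impressions given to each ad

    for _ in range(rounds):
        # Draw a plausible conversion rate for each ad from its
        # Beta(wins + 1, losses + 1) posterior, then show the ad
        # with the highest draw.
        draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n)]
        ad = max(range(n), key=lambda i: draws[i])
        shown[ad] += 1

        # Simulate whether this impression converted.
        if rng.random() < true_rates[ad]:
            wins[ad] += 1
        else:
            losses[ad] += 1

    return shown


# Hypothetical ads converting at 1% and 10%.
print(thompson_allocate([0.01, 0.10], rounds=5000))
```

Early on, both ads get shown because the algorithm knows very little (exploring). As conversions come in, the stronger ad gets most of the impressions (exploiting), while the weaker one still gets an occasional look in case the early data was misleading.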

It's the same when you A/B test Klaviyo popups. But instead of wasting money on bad ads, you're wasting impressions that could have turned into email signups.

Do you show your two variations 50/50 until you have a winner, or do you make Klaviyo reallocate the weights as it learns?

This is what CBO does too: it keeps shifting budget towards the best-performing ad sets, and the ads inside them, as it learns.

So why filter out ads in a testing CBO just to filter out more ads in your production CBO afterward?

Why not bring all the learning data to the same CBO in the first place?

If you use a similar testing structure, I think it could work better to merge your testing and production CBOs into one and let it work its magic.

It gives you a faster testing process, and it lets the CBO put your full budget where it performs best instead of potentially wasting 10%-15% on testing new ads.

What do you think?