Tara Meyer
May 12, 2026
If you’re running user acquisition (UA) for a mobile game this year, ad creative testing has never been more complex, or more critical. The creative arms race is real. The question is no longer whether you can produce enough creatives, but whether you can actually test them properly and filter out the best ones.
In a recent episode of Tenjin’s ROI 101 podcast, Marketing Director and host Roman sat down with Matej from Two and a Half Gamers to unpack the new challenges facing creative teams. One of the most common of these is volume: there are too many creatives and not enough budget to test them all meaningfully.
Matej brings 12 years of gaming industry experience, learnings from running a 10-person motion design team, and a no-code playable tool called Playable Maker to the conversation. The result is one of the most practical breakdowns of creative testing frameworks we’ve heard in a long time.
The Scale Problem: Why Ad Creative Testing Has Never Been Harder
The numbers alone tell the story. The biggest mobile game studios are no longer producing hundreds of creatives per month but tens of thousands. Roman put it plainly early in the conversation:
“We always discuss this with Mayong, Kingshot, Royal Kingdom, and all the big companies. Some games have tens of thousands of creatives per month. At least in the last 30 days, we’re seeing numbers like 23,000 creatives. That’s a lot.”
Volume at that level brings noise with it, especially when there’s no realistic way to act on so many creatives. Matej is realistic about the limits:
“I don’t think you can test them all properly. You’d need to spend a massive amount of money.”
This problem of ever-increasing volume is one of the most common challenges facing a creative industry leaning into AI and automation right now. There’s a massive gap between what studios can produce with automated workflows and what they can realistically review and evaluate. More ad creatives don’t necessarily translate into better conversions, clicks, or overall performance.
This is exactly why ad creative testing frameworks matter. Matej walks us through his process and shares tips for building your own.
The Boring Test as a Creative Testing Filter
Before any creative even reaches a testing campaign, Matej runs it through what he calls the “Boring Test” or the “So What?” test. It’s a pre-evaluation filter that sits upstream of any formal A/B testing of ad creatives. It’s just as simple as it sounds.
“I look at the creatives my team produces, and after a couple of seconds, if I think ‘ah, I’m bored’ — that’s a flag. If the creative isn’t triggering any emotion, or isn’t sending a clear message, I ask myself: ‘So what? What should I do? What’s the message?'”
Applying this filter to your work saves a lot of time. For Matej:
“This pre-evaluation alone filters out maybe 50% of the creatives, which we then send back for revision.”
At this stage, you’re not looking for perfection. The bar is set at “good enough”, and Matej is specific about what that means:
“‘Good enough’ means maybe 70% polished: done as a concept, not boring, and clearly communicating an emotion or action without making you think ‘so what?'”
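To make the gate concrete, here is a minimal sketch of how a team might encode the “Boring Test” as a step in a review pipeline. The `Creative` record and its fields are invented for illustration; the verdicts themselves still come from a human reviewer, exactly as Matej describes.

```python
from dataclasses import dataclass

@dataclass
class Creative:
    name: str
    triggers_emotion: bool  # reviewer verdict: did it make you feel anything?
    clear_message: bool     # reviewer verdict: is the "what should I do?" obvious?

def passes_boring_test(c: Creative) -> bool:
    """Matej's 'So What?' filter: a creative advances only if it
    triggers an emotion AND communicates a clear message."""
    return c.triggers_emotion and c.clear_message

batch = [
    Creative("noob_vs_pro_v1", triggers_emotion=True, clear_message=True),
    Creative("logo_intro_v2", triggers_emotion=False, clear_message=True),
]

to_test = [c for c in batch if passes_boring_test(c)]
to_revise = [c for c in batch if not passes_boring_test(c)]
print(f"{len(to_test)} advance to testing, {len(to_revise)} go back for revision")
```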
For studios producing thousands of creatives a month, pre-screening of this kind would be nearly impossible to do manually. For teams producing a manageable handful per week, however, it’s a powerful first line of defense against wasted ad spend.
How to Build Creative Testing Frameworks That Actually Work
Once a round of creatives passes the “Boring Test”, the next question is one of structure and strategy. What does your testing protocol look like? How often do you test, and which metrics define success?
The answers vary greatly depending on your channel and category, but these four frameworks are a solid starting point.
Framework 1: IPM Setup (Small-to-Mid Volume)
The IPM setup is the most accessible entry point into structured ad creative testing. Run a dedicated campaign in lower-cost markets — India, the Philippines, Brazil — use IPM (Installs Per Thousand Impressions) as your winning metric, and promote the top performers to your business-as-usual campaign. For teams producing around 10 creatives per week, it’s a clean, manageable system:
“You run a creative testing campaign — let’s say in India, the Philippines, or Brazil — and pick winners based on IPM. Then you take those winners and move them to your business-as-usual campaign. This works well when you’re producing maybe 10 creatives per week.”
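For reference, IPM itself is simple to compute from the numbers any ad network reports. A minimal sketch, assuming you export installs and impressions per creative (the figures below are made up):

```python
def ipm(installs: int, impressions: int) -> float:
    """Installs Per Mille: installs per thousand impressions."""
    if impressions == 0:
        return 0.0
    return installs / impressions * 1000

# Hypothetical export from a testing campaign in a low-cost market
results = {
    "creative_a": {"installs": 420, "impressions": 180_000},
    "creative_b": {"installs": 95,  "impressions": 60_000},
}

ranked = sorted(
    results.items(),
    key=lambda kv: ipm(kv[1]["installs"], kv[1]["impressions"]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: IPM = {ipm(r['installs'], r['impressions']):.2f}")
```

The top of this ranking is what gets promoted to the business-as-usual campaign, with the caveat that follows.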
But here’s where a lot of teams hit a wall. IPM measures install volume, not purchase intent, and those are very different things. A creative that dominates in a Tier 3 testing environment doesn’t always behave the same way when the optimization target shifts:
“The problem is that top creatives from the testing campaign don’t always pick up traffic in the business-as-usual campaign, because the optimization is different.”
Recognizing that gap is what separates teams that test creatives from teams that actually learn from them.
Framework 2: Multi-Step Framework (Mid Volume)
The multi-step framework exists to solve that in-between problem. Instead of moving IPM winners directly into the business-as-usual campaign, you add an intermediate validation layer, such as an app event optimization campaign, that brings you much closer to real-world conditions:
“You take those IPM winners and test them in an app event optimization campaign — for example, optimizing for purchases. This is closer to your business-as-usual setup, since you’re optimizing for payers. This works well when you’re producing 30+ creatives per week.”
Think of it as a two-round knockout. Round one: the IPM campaign, which eliminates the obvious losers quickly and cheaply. Round two: the app event optimization campaign, which stress-tests the survivors against the metric that actually matters: purchase behavior. Only the creatives that perform well in both rounds earn a place in your main campaign.
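Here is a sketch of that two-round knockout as a filter chain. The thresholds are illustrative placeholders, not recommendations; tune them to your own account:

```python
IPM_CUTOFF = 2.0   # round one: illustrative IPM floor
ROAS_FLOOR = 0.10  # round two: illustrative purchase-ROAS floor

def round_one(creatives):
    """IPM campaign: eliminate obvious losers quickly and cheaply."""
    return [c for c in creatives if c["ipm"] >= IPM_CUTOFF]

def round_two(creatives):
    """App event optimization campaign: stress-test survivors on purchases."""
    return [c for c in creatives if c["roas"] >= ROAS_FLOOR]

batch = [
    {"name": "a", "ipm": 2.8, "roas": 0.14},
    {"name": "b", "ipm": 3.1, "roas": 0.04},  # installs well, doesn't monetize
    {"name": "c", "ipm": 1.2, "roas": 0.20},  # never gets past round one
]

winners = round_two(round_one(batch))
print([c["name"] for c in winners])  # only "a" earns a place in the main campaign
```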
This is a more expensive and time-consuming process than the IPM setup alone, but for teams producing 30 or more creatives per week, it’s worth it. The additional validation step significantly reduces the risk of promoting a creative that looks good on paper but fails to deliver ROAS where it counts.
Framework 3: Value-Optimized Campaign (Large Scale)
When you’re producing hundreds or thousands of creatives per month, the multi-step framework starts to break down. It’s simply too slow and too costly to run every creative through a two-stage validation process at that volume. This is where Matej pivots to a value-optimized campaign structure:
“I’ve been coming back to a value-optimized campaign structure. On Facebook, you can have up to 50 creatives per ad set, which gives you a solid range across creative types. You use worldwide or Tier 3/Tier 4 countries as your testing batch.”
The logic here is different from the previous frameworks. Rather than moving creatives through sequential stages, you’re letting the algorithm do the heavy lifting. Facebook’s value optimization is designed to find users most likely to make high-value purchases, which means the testing environment is closely aligned with your goals. You’re not testing for installs and hoping it translates. You’re testing for value from the start.
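Since Facebook caps an ad set at 50 creatives, preparing a large testing pool becomes a simple chunking exercise. A sketch, with hypothetical creative names:

```python
def batch_into_ad_sets(creatives, max_per_ad_set=50):
    """Split a creative pool into ad sets of at most 50, per Facebook's limit."""
    return [
        creatives[i : i + max_per_ad_set]
        for i in range(0, len(creatives), max_per_ad_set)
    ]

pool = [f"creative_{i:04d}" for i in range(230)]  # e.g. a week's output at scale
ad_sets = batch_into_ad_sets(pool)
print(f"{len(ad_sets)} ad sets")  # 5 ad sets: four full, one with the remainder
```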
The worldwide targeting is a deliberate choice too. More markets mean more data, faster, which is exactly what you need when you’re evaluating creatives at scale:
“This worldwide + value-optimized setup has been producing strong results recently. From there, you either move winners to your business-as-usual campaign or scale the testing campaign directly if it’s already delivering.”
That last point is worth highlighting. If a creative is already performing well in the value-optimized testing campaign, there’s an argument for scaling it directly rather than moving it through another layer. Since the campaign is already optimizing for the right signal, the smartest move is often to invest more in it and let it run.
Framework 4: Direct to Business-as-Usual (Very Small Scale)
At the other end of the spectrum, there’s a scenario where a dedicated testing campaign doesn’t make sense at all. For smaller studios or accounts producing just one to three creatives per week, the infrastructure required for a separate testing setup simply isn’t justified:
“For smaller clients or smaller accounts producing one to three creatives per week, I skip the testing campaign entirely. Every new creative goes straight into the main business-as-usual campaign, since there’s not enough volume or budget for a separate testing setup.”
This is a pragmatic decision. Running a separate testing campaign requires budget, time, and enough creative volume to generate statistically meaningful results. If you’re producing three creatives a week, you don’t have any of those things in sufficient quantity. Splitting your limited budget across a testing campaign and a business-as-usual campaign would likely hurt performance in both.
The business-as-usual campaign, which typically targets Tier 1 geos like the US, becomes the testing ground by default. New creatives enter, the algorithm evaluates them against existing assets, and natural selection takes over. It’s a slower feedback loop, but for this scale of operation, it’s the right one.
What’s interesting is that even within the business-as-usual campaign, Matej pushes against conventional platform guidance when the data supports it:
“I’ve found that refreshing creatives two or three times a week — even against Facebook’s best practices — is actually beneficial for performance.”
Facebook’s official guidance typically recommends giving campaigns time to stabilize before making changes. But creative fatigue is real, and at a certain point, letting a tired creative continue running is more damaging than the disruption of refreshing it. Matej’s experience suggests that for mobile games in particular, staying ahead of fatigue is worth the trade-off. It’s a reminder that best practices are a starting point, not a rulebook.
Facebook Creative Testing and Beyond: Why Each Channel Needs Its Own Strategy
One of the most important points in the entire conversation was Matej’s insistence that ad creative testing must be platform-native. Winning on Facebook does not guarantee winning anywhere else — and best practices for A/B testing ad creatives vary significantly by channel.
“Every channel really needs its own testing. If you have a true creative winner, it will likely perform across channels — but you can’t rely on that anymore. The channels behave very differently.”
Facebook creative testing demands a strong hook and strong mid-creative content. The end matters less than you think:
“You don’t necessarily need a strong call to action at the end, because 80% of people won’t finish even a one-minute video — best case, they watch half.”
AppLovin flips the script entirely:
“Everyone watches the whole video, and then they go into the playable. So you need a very strong call to action at the end. And the playable itself has best practices. On AppLovin, for example, players should be able to click at least 10 times and spend at least 60 seconds in the playable. So you’ve got one minute of video plus one minute of playable.”
TikTok is the most demanding channel when it comes to testing ad creatives at speed:
“Creative fatigue can hit in a matter of days, sometimes requiring new creatives every other day. TikTok even has a program where creators produce content for you. That’s potentially 40 new videos per week.”
“My setup there is to add a new ad group per campaign with around eight new videos and rotate continuously. Automation is key here. If you do it manually, you end up spending your entire day just uploading creatives and creating ad groups — and that’s it.”
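In skeleton form, that automation might look like the sketch below. The `upload_video` and `create_ad_group` functions are placeholders, not real TikTok API calls; the batch size of eight is the number Matej mentions.

```python
VIDEOS_PER_AD_GROUP = 8  # Matej's batch size for each new ad group

def upload_video(path: str) -> str:
    """Placeholder: upload via whatever API wrapper or tooling you use."""
    ...

def create_ad_group(campaign_id: str, video_ids: list[str]) -> str:
    """Placeholder: create a new ad group containing the given videos."""
    ...

def rotate(campaign_id: str, new_video_paths: list[str]) -> None:
    """Add fresh ad groups of ~8 videos each instead of uploading one by one."""
    video_ids = [upload_video(p) for p in new_video_paths]
    for i in range(0, len(video_ids), VIDEOS_PER_AD_GROUP):
        create_ad_group(campaign_id, video_ids[i : i + VIDEOS_PER_AD_GROUP])
```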
Google keeps it simple — no separate testing campaign, just business-as-usual ad groups batched by concept, with up to 20 videos per ad group.
But AppLovin’s creative sets represent a notable recent development:
“AppLovin’s creative sets have been streamlined, so you can have a video with multiple playable variants in one creative set and let the algorithm handle the testing. This feature didn’t exist six months ago, so that’s a meaningful improvement.”
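One way to keep the channel-by-channel guidance above actionable is to hold it in a single config your team references. The structure and field names below are invented for illustration; the values come straight from the conversation:

```python
# Channel rules of thumb pulled from the conversation above.
CHANNEL_PLAYBOOK = {
    "facebook": {
        "hook": "strong open and mid-video content; end CTA matters less",
        "watch_through": "most viewers drop before the end",
    },
    "applovin": {
        "hook": "full watch-through, so the end CTA must be strong",
        "playable_min_clicks": 10,
        "playable_min_seconds": 60,
    },
    "tiktok": {
        "refresh_cadence_days": 2,        # fatigue can hit in a matter of days
        "videos_per_new_ad_group": 8,
        "automation": "required at this cadence",
    },
    "google": {
        "separate_testing_campaign": False,  # business-as-usual ad groups only
        "max_videos_per_ad_group": 20,
        "batching": "by concept",
    },
}
```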
How to Test and Iterate Ad Creatives for Better ROAS: The Playable Layer
Playables add another layer of complexity to ad creative testing. They’re harder to produce, to test, and to iterate on. Matej’s approach, however, brings structure to the process:
“You take a winning video, combine it with different playables, and test them against each other — same concept as video-only testing, but now with a playable end card. Whatever wins gets promoted to the next round. You always keep the winning playable as a control creative and test new combinations against it.”
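That “keep the winner as control” loop is a classic champion/challenger pattern. A minimal sketch, assuming a hypothetical `run_head_to_head` step that stands in for a real test campaign:

```python
def run_head_to_head(control, challenger):
    """Placeholder for a real test campaign comparing two
    video + playable combinations; returns the better performer."""
    ...

def champion_challenger(winning_video, control_playable, new_playables):
    """Keep the reigning winner as control and test each new playable
    combination against it, promoting whichever wins the round."""
    champion = (winning_video, control_playable)
    for playable in new_playables:
        challenger = (winning_video, playable)
        champion = run_head_to_head(champion, challenger) or champion
    return champion  # becomes the control creative for the next cycle
```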
The video plus playable end card combination is consistently the best-performing format. But the production challenge is real, and it’s exactly why Matej built Playable Maker:
“Playables are tough to produce and even tougher to test properly. But with templates, you can get there quickly. I found with André that midcore, hardcore, and 4X games are using a 3×3 jigsaw puzzle template — which sounds random, but it works. You can build this in 20 minutes: upload a static image, generate the playable, done.”
He also flagged an emerging trend that challenges the assumption that playables need to mirror actual gameplay:
“If there’s a highly scalable game out there — like a solitaire game or a match-3 — instead of cloning it, you take that game concept and use it as your playable. So you’re leveraging a proven mechanic in your creative without building an entirely new game. And it works.”
The Iteration Loop: Creative Testing Is Never Finished
Once you’ve found your winners, the work isn’t done. Knowing how to test and iterate ad creatives for better ROAS means accepting that iteration is ongoing and never really stops:
“After all the testing and finding your winners, the next step is iteration, and there’s no shortage of things to try: UGC, AI, sound, voiceovers, hooks, anxiety triggers, urgency. And then you start the whole cycle again. It’s a never-ending loop.”
Developing the intuition to navigate that loop requires sustained exposure to a wide variety of creatives. Matej illustrated the point with a sharp example:
“I had one game where we did two videos — one was ‘noob vs. pro’ — and the company wanted their logo and a headline in the first three seconds. I told them: nobody cares about that in the first three seconds. You need to grab attention. We lost 70% of viewers right there. So we tested it, trimmed the first three seconds on Facebook, and the version without the logo significantly outperformed. Unless it’s a Supercell logo, nobody cares.”
The Surprising Winner: Low-Fidelity Ads
Matej closed with what he described as his most genuine surprise in recent creative testing — and it’s a useful reminder that assumptions in this industry have a short shelf life:
“I’ve always said that low-fidelity ads — the ones that look like someone filmed themselves at home talking about a game — tend to perform surprisingly well. I never really understood it at first. I’m used to producing polished content. But I started doing it, and it works. It doesn’t look like an ad. It feels like an influencer post or organic content. Very TikTok-style, very raw. That was genuinely surprising to me, and now it’s become a real part of what I produce.”
It’s a fitting note to end on. In a landscape where studios are producing 23,000 creatives a month and ad creative testing frameworks are evolving every six months, the creatives that cut through are often the ones that feel the least like creatives at all.
The Takeaway
In the end, successful ad creative testing isn’t about chasing volume for its own sake. It’s more about building a system that filters, validates, and iterates with purpose.
Whether you’re working with a handful of creatives or tens of thousands, the teams and creatives that win are the ones that stay disciplined: cutting what doesn’t resonate, testing against meaningful signals, and adapting to each platform’s unique behavior.
As the landscape continues to evolve with automation, AI, and new formats like playables, one principle remains constant: creative testing is not a one-time process, but a continuous loop of learning, refining, and scaling what truly works.
This article is based on Tenjin’s ROI 101 podcast episode featuring Matej from Two and a Half Gamers. Watch the full episode and explore Matej’s Creative Bible and Playable Maker via the links in the episode description.
Marketing Content Manager
Tara Meyer