From A/B Testing to Real-Time Player Personalization!

The answer to the question “is A or B better for my players?” is often “it depends on the players: some like A, others prefer B!” Until now, game designers had to use A/B testing tools to select the “best choice” for their overall audience. Today, AI technologies such as machine learning let game designers offer, in real time, A to the players who prefer A and B to the players who prefer B, providing a better experience to all players.

The key steps in using A/B testing

A/B testing is a useful tool to help game designers and developers make UX, level design or monetization design decisions, based on real players’ behaviour. For instance, should you put the “save me” button in color A or B in order to have more people use this feature?

You set up an A/B testing solution so that some players see the button in color A and other players see it in color B. In the A/B testing tool’s portal, you then have to set up events and reports that display metrics for both cohorts (for instance, the percentage of players clicking the “save me” button when it is made available), so you can make a decision.

This can be a lot of work for a result that is not easy to interpret (16% click-rate with color A, 18% with color B, with a 4% error margin…?).
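One common way to read figures like these is a two-proportion z-test. The sketch below is only an illustration: the cohort size of 1,000 players each is an assumption (the example above gives no sample sizes), and the function name is hypothetical.

```python
import math

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference between two click rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled click rate under the hypothesis that both colors perform the same.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# ASSUMPTION: 1,000 players per cohort; 16% click color A, 18% click color B.
z, p_value = two_proportion_z_test(160, 1000, 180, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With these assumed cohort sizes, the p-value comes out well above the usual 0.05 threshold, so the 2-point gap between the colors could easily be noise.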

An example of A/B testing for difficulty tuning

Now let’s say that you tuned the difficulty of a specific stage in your game to a “default” value, based on your own tests, tests by people in your team, or tests by external testers. But you would like to know whether an easier or harder difficulty would be better for your real players.

What does “better for the players” mean? You have to define your objective: more people winning the stage? Having better retention? Having better monetization? Converting more players to payers? Having longer sessions?

Then, using an A/B testing solution, you set up 3 cohorts (receiving the easier, the default and the harder difficulties) and you code these changes in your game, applied according to the cohort when a player starts the specific stage. You submit the game update, and after a few days (or a few weeks) of play you look at the metrics related to your “objective”, for instance Day7 retention.
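As an illustration of the cohort step, here is a minimal sketch of how a player could be assigned to one of the 3 cohorts. Hashing the player ID is an assumption on our side (A/B testing tools handle this in their own way), and the function and experiment names are hypothetical.

```python
import hashlib

COHORTS = ("easier", "default", "harder")

def assign_cohort(player_id: str, experiment: str = "stage_difficulty") -> str:
    """Map a player to a cohort, always returning the same cohort for the same player."""
    # Hash the experiment name together with the player ID so that different
    # experiments split the audience independently of each other.
    digest = hashlib.sha256(f"{experiment}:{player_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(COHORTS)
    return COHORTS[bucket]

# When the player starts the stage, the game applies the difficulty of their cohort.
print(assign_cohort("player-42"))
```

Because the assignment depends only on the player ID, a player keeps the same difficulty across sessions without the game having to store anything.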

The difficulty of acting on A/B test results

It might be difficult to have results that you can act upon. Let’s suppose that your “default” difficulty is better for 40% of your players, the “easier” for 30% of players and “harder” for the remaining 30% of players:

  • Cohort “default” should show a Day7 retention a bit better than the 2 other cohorts
  • Cohorts “easier” and “harder” will be a bit lower but should both give almost the same Day7 retention

So after all this work, you keep your “default” difficulty…pleasing only 40% of players…
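To see why the cohorts end up so close, here is a small worked example. The 40/30/30 split comes from the scenario above; the two retention figures (40% when a player receives the difficulty that suits them, 30% otherwise) are purely hypothetical numbers chosen for illustration.

```python
# Share of players for whom each difficulty is the best one (from the scenario above).
PREFERENCE_SHARE = {"default": 0.40, "easier": 0.30, "harder": 0.30}

# HYPOTHETICAL Day7 retention figures, for illustration only.
RETENTION_IF_MATCHED = 0.40     # player received the difficulty that suits them
RETENTION_IF_MISMATCHED = 0.30  # player received one of the other difficulties

def cohort_retention(served: str) -> float:
    """Expected Day7 retention of a cohort in which everyone receives `served`."""
    matched = PREFERENCE_SHARE[served]
    return matched * RETENTION_IF_MATCHED + (1 - matched) * RETENTION_IF_MISMATCHED

for option in PREFERENCE_SHARE:
    print(f"{option}: {cohort_retention(option):.1%}")
```

Under these assumptions the “default” cohort reaches about 34% Day7 retention against about 33% for the two others: a one-point gap that is hard to tell apart from noise.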

And what if you want to tune the difficulty of all your stages? Lots of work!

Using Machine Learning for difficulty tuning

Let’s see how we can work with Data Science to improve retention through difficulty tuning.

As in A/B testing, you code the difficulty changes in your game (applied when a player starts a stage), based on 3 “values”: “easier”, “default” and “harder”. You submit your game, and that’s it!

When the game is live, events are sent to the platform about the players’ sessions and behaviors (wins, losses…) for all stages. After a few days of exploration (for each stage played, players receive “easier”, “default” or “harder”), the platform learns from these players (it builds a Machine Learning model), using a “reward” (the equivalent of your objective in A/B testing), for instance increasing retention.

Once the platform has learnt from the players, every time a player starts a stage, the game sends a request to the platform, which returns in real time a personalized value (“easier”, “default” or “harder”) chosen to maximize the reward (increase retention) for the current player.
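The explore-then-personalize loop described above can be sketched with a very simple technique: an epsilon-greedy bandit that keeps one average reward per player segment and per option. This is an assumption made for illustration, not a description of how any particular platform builds its model; the class name, the segment labels and the 0/1 reward are all hypothetical.

```python
import random
from collections import defaultdict

OPTIONS = ("easier", "default", "harder")

class DifficultyPersonalizer:
    """Epsilon-greedy choice of a difficulty per player segment (illustrative only)."""

    def __init__(self, explore_rate=0.1, seed=None):
        self.explore_rate = explore_rate
        self.rng = random.Random(seed)
        # (segment, option) -> [sum of rewards, number of observations]
        self.stats = defaultdict(lambda: [0.0, 0])

    def choose(self, segment):
        """Answer the game's request: which difficulty for a player of this segment?"""
        untried = [o for o in OPTIONS if self.stats[(segment, o)][1] == 0]
        # Explore while an option is still untried, then only occasionally.
        if untried or self.rng.random() < self.explore_rate:
            return self.rng.choice(untried or OPTIONS)
        # Otherwise exploit: return the option with the best average reward so far.
        return max(OPTIONS, key=lambda o: self._mean(segment, o))

    def _mean(self, segment, option):
        total, count = self.stats[(segment, option)]
        return total / count

    def record(self, segment, option, reward):
        """Learn from an event, e.g. reward=1.0 if the player came back on Day7."""
        entry = self.stats[(segment, option)]
        entry[0] += reward
        entry[1] += 1

# Usage: HYPOTHETICAL segments and rewards.
personalizer = DifficultyPersonalizer(explore_rate=0.0, seed=0)
for option, reward in (("easier", 1.0), ("default", 0.0), ("harder", 0.0)):
    personalizer.record("often_loses", option, reward)
print(personalizer.choose("often_loses"))
```

In this toy run the only rewarded option for the “often_loses” segment is “easier”, so that is what the game receives for such a player. A real model would use far richer player data than a single segment label.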

Going beyond A/B testing: the benefits of real-time personalization

The first benefit is time saving for your team: you do not have to do all the work of looking at reports and trying to figure out the best tuning.

The second benefit is an improved player experience: you deliver to each player the tuning that suits them best! This should lead to better retention and monetization.

In the example above, 40% of the players should receive “default”, 30% should receive “easier” and the remaining 30% should receive “harder”. This is impossible to achieve with A/B testing, where all players end up receiving the single option the studio chose.

As you can see, A/B testing is a method that can take a lot of time to set up and use, for very limited results in mobile games. Machine Learning, on the other hand, can bring huge improvements: it is not only much simpler to use, but also goes much further by providing real-time player personalization.

In summary:

| A/B Testing | ML Real-Time Personalization |
| --- | --- |
| Studio programs in game the effect of each option (for instance, difficulty of a stage set to easier, harder or default). | Studio programs in game the effect of each option (for instance, difficulty of a stage set to easier, harder or default). |
| Studio defines in the A/B testing portal an audience group that will receive the values. | Studio defines a % of all players that are used to learn on (“explored players”). |
| Studio has to define or select a “goal metric” (objective) to measure the best results. | The platform manages its own goal (for instance “increase time spent in session” for a difficulty tuning solution). |
| A player in a cohort will always receive one option. | The platform can explore any option on the players, dynamically. |
| After several days of live play, the studio has to check in a portal the effect of each option on the desired objective (might need to create special reports). | The platform will automatically explore and learn from the players. |
| The studio has to choose one option, that might not be the best one for all players, but all players will receive this option. | The platform will indicate to the game in real-time which option is best for the current player in the game. |