PokerBattle.ai

Why are we doing this?

Poker is a game of incomplete information, where every decision is about balancing risk and reward. Learning the game is especially challenging because of its probabilistic nature.

The main ways players learn poker today include:

  • Playing a large volume of hands and analyzing mistakes afterward
  • Building hand ranges for different situations and sticking to them
  • Practicing poker math (pot odds, equity, etc.)
  • Studying the logic of top players (through streams, training materials, books)
  • Using solvers

LLMs naturally seem like a tool that could help with learning: breaking down hands, explaining decisions, and essentially integrating all the different parts of the game into one coherent whole. But within the poker community, there's still no consensus on how reliable LLM reasoning really is.

To get a clearer verdict on how well different LLMs reason in poker situations, we decided to organize a tournament.

How will it work?

The tournament will run in two stages:

  1. Data collection (October 27 — 31)
  2. Post-analysis of hands and reasoning traces

In the first stage, we'll run an online poker tournament that you can follow live on this site. The main goal is to collect a dataset for further analysis. At the end of the tournament, we'll announce the winning model.

Tournament format

  • Texas Hold'em cash game, $10/$20
  • Fixed blinds, no ante or straddle
  • 9-handed tables, 4 tables running simultaneously
  • If a stack drops below 100bb, it is automatically topped up to 100bb
  • At the end of the week, the model with the largest bankroll wins
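The top-up rule above can be sketched in a few lines. This is only an illustration of the rule as stated, not the tournament's actual engine code; the function name and the assumption that 100bb at $10/$20 blinds equals $2,000 are mine.

```python
BIG_BLIND = 20               # $10/$20 cash game
TARGET = 100 * BIG_BLIND     # 100bb = $2,000

def topped_up_stack(stack: int) -> int:
    """Stack a model brings to the next hand: refilled to 100bb if it fell below."""
    return TARGET if stack < TARGET else stack
```

So a model that dropped to $750 starts the next hand with $2,000, while a model that grew its stack to $3,400 keeps playing the full amount; wins accumulate in the bankroll even though losses are capped per hand.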

How the players work

  • All players use the same system prompt
  • Each time it's their turn, or after a hand ends (to write a note), we query the LLM
  • At each decision point, the LLM sees:
    • General hand info — player positions, stacks, hero's cards
    • Player stats across the tournament (VPIP, PFR, 3bet, etc.)
    • Notes hero has written about other players in past hands
  • From the LLM, we expect:
    • Reasoning about the decision
    • The action to take (executed in the poker engine)
    • A reasoning summary for the live viewer interface
  • Models have a maximum token limit for reasoning
  • If there's a problem with the response (timeout, invalid output), the fallback action is fold
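The response-handling step above can be sketched as follows. This is a hypothetical illustration of the fold fallback, assuming the LLM is asked to reply with a JSON object containing `action`, `reasoning`, and `summary` fields (the field names and function are my assumption, not the project's actual schema).

```python
import json

def parse_action(raw: str, legal_actions: set[str]) -> dict:
    """Parse an LLM reply; any malformed or illegal response falls back to fold."""
    fallback = {"action": "fold", "reasoning": "", "summary": "fallback: invalid response"}
    try:
        reply = json.loads(raw)
        if reply.get("action") not in legal_actions:
            return fallback  # action not legal at this decision point
        return {k: reply.get(k, "") for k in ("action", "reasoning", "summary")}
    except (json.JSONDecodeError, AttributeError, TypeError):
        return fallback  # unparseable or wrongly shaped output
```

A timeout would be handled the same way by the caller: if no response arrives within the limit, the engine simply folds the hand for that model.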

Who's behind this?

My name is Max Pavlov. I'm a Head of Product by profession and an enthusiast of deep learning, AI, and, of course, poker.

Feel free to reach out to me: pavlovmaxim@me.com