CS109 Writeup

The Read: Bayesian Inference at the Poker Table

Neil Chandran

Spring 2026

You can explore the website at https://www.theread.dev/. I suggest using the website at least once before reading this write-up.

Most poker equity calculators assume opponents hold random hands and ignore what their actions reveal. The Read is a Bayesian range estimator that models opponent holdings as a Dirichlet-Multinomial distribution and updates the posterior after each action. Equity comes from 10,000-trial Monte Carlo simulation against that range, with a 95% confidence interval no wider than about ±0.98% at n = 10,000. Train mode fits logistic regression on 100+ practice hands to show where fold/call/raise decisions stop tracking equity and pot odds. Everything runs in client-side JavaScript with no external libraries.

Poker is a game of incomplete information. You never see your opponent's cards. You only see what they do, and you have to work backwards from that.

Most equity calculators ignore this entirely. They tell you your odds against a random hand, as if your opponent is equally likely to hold 7-2 offsuit as pocket aces after 3-betting from early position. That is not a probability problem being solved. It is a probability problem being skipped.

I play poker. The question I actually want answered at the table is not "what are my odds against any two cards" but "what are my odds against what this specific person, in this specific spot, is likely to hold." Those are very different questions. The second one requires Bayesian inference. So I built a tool that does it.

The Read is a Bayesian poker range estimator. It models opponent hand distributions as a Dirichlet-Multinomial system, updates the posterior in real time as the opponent acts, and runs a 10,000-trial Monte Carlo equity simulation against that posterior. In Train mode it also fits a logistic regression to your own decisions to find where your reasoning systematically breaks down. Every number on screen is a CS109 concept doing real work.

Background

A hand range is the set of hands your opponent could plausibly hold given everything they have done so far. Thinking in ranges rather than specific hands is the single most important conceptual shift in moving from recreational to serious poker. A player who 3-bets from early position is not equally likely to hold every hand. They are far more likely to hold premium hands. The question is how much more likely, and how that probability shifts as they continue to act.

Existing tools either ignore this entirely (equity calculators assume a random opponent) or solve it opaquely (GTO solvers output correct actions without showing the probability reasoning behind them). Neither teaches you anything. The Read makes the Bayesian update visible.

Methods

The Dirichlet-Multinomial Range Model

The 169 strategically distinct starting hands in Texas Hold'em (13 pairs, 78 suited, 78 offsuit) are grouped into 8 categories: Premium (AA, KK, QQ, JJ, AKs), Strong (TT, 99, AQs, AKo, AQo), Medium, Speculative, Weak Broadway, Suited Gappers, Trash Broadway, and Garbage. Hands in the same category tend to play similarly in the same spots.

The prior over these categories is a Dirichlet distribution with concentration parameters α = [3, 6, 10, 14, 8, 10, 9, 40], initialized from GTO open-raising frequency data. These are pseudo-counts: the model starts believing opponents hold garbage hands most often and premium hands rarely, which matches reality at most tables.
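To make the pseudo-count reading concrete, the sketch below normalizes α into prior mean probabilities. The names CATEGORIES and priorMean are illustrative and are not taken from the app's source.

```javascript
// Category order follows the list above; the names are illustrative labels.
const CATEGORIES = [
  "Premium", "Strong", "Medium", "Speculative",
  "Weak Broadway", "Suited Gappers", "Trash Broadway", "Garbage",
];

// Concentration parameters (pseudo-counts) from the text.
const ALPHA = [3, 6, 10, 14, 8, 10, 9, 40];

// Mean of a Dirichlet(alpha): each pseudo-count divided by the total.
function priorMean(alpha) {
  const total = alpha.reduce((a, b) => a + b, 0);
  return alpha.map((a) => a / total);
}

const prior = priorMean(ALPHA);
CATEGORIES.forEach((name, i) => {
  console.log(`${name}: ${(prior[i] * 100).toFixed(1)}%`);
});
```

Because the pseudo-counts sum to 100, each αᵢ reads directly as a prior percentage: 3% Premium and 40% Garbage.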

When the opponent takes an action, each category's weight is multiplied by a likelihood L(action | category). These likelihoods encode real poker logic. A 3-bet from early position has a near-zero likelihood for garbage hands and a high likelihood for premium ones. A limp has the opposite structure. After k actions the posterior weights are:

αᵢ′ = αᵢ × L(action₁ | catᵢ) × L(action₂ | catᵢ) × ... × L(actionₖ | catᵢ)

// Core Bayesian update: multiply each category weight
// by the likelihood of the observed action given that category
function updatePosterior(weights, action) {
  const likelihoods = LIKELIHOOD_TABLE[action];
  const updated = weights.map((w, i) => w * likelihoods[i]);
  const sum = updated.reduce((a, b) => a + b, 0);
  return updated.map(w => w / sum); // normalize to sum to 1
}

// Called once per opponent action, in sequence
// e.g. updatePosterior(updatePosterior(prior, "Open Raise (EP)"), "3-Bet")

The full posterior after k actions is computed by chaining k calls to this function. Each call is one application of Bayes' theorem.

Multiplying the per-action likelihoods treats actions as conditionally independent given the hand category, which is the Naive Bayes assumption. It is tractable and good enough in practice to produce useful range estimates.
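The code normalizes after every action, while the product formula above does not. The two agree once the final weights are normalized, because each normalizing constant is a common factor across categories. A short derivation for two actions, writing L_{1,i} for L(action₁ | catᵢ) and Z₁ for the first normalizing constant (shorthand introduced here):

```latex
\begin{aligned}
w_i^{(1)} &= \frac{\alpha_i L_{1,i}}{Z_1}, \qquad Z_1 = \sum_j \alpha_j L_{1,j} \\
w_i^{(2)} &= \frac{w_i^{(1)} L_{2,i}}{\sum_j w_j^{(1)} L_{2,j}}
          = \frac{\alpha_i L_{1,i} L_{2,i} / Z_1}{\sum_j \alpha_j L_{1,j} L_{2,j} / Z_1}
          = \frac{\alpha_i'}{\sum_j \alpha_j'}
\end{aligned}
```

The same cancellation applies at every later step, so chained calls to updatePosterior and a single normalization of the product give the same posterior.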

The posterior mean probability for category i is:

E[pᵢ] = αᵢ′ / Σⱼ αⱼ′

This is exactly the expectation of a Dirichlet random variable and it is what gets displayed in the range heatmap.

Worked example: opponent open-raises from early position, then 3-bets after a raise. Starting from the prior, the open-raise from early position multiplies Premium by 0.95 and Garbage by 0.01. The subsequent 3-bet multiplies Premium by 0.90 and Garbage by 0.01 again. After normalization (2.565 + 2.55 out of a total weight of about 6.83 in Table 1), the posterior assigns roughly 75% of probability mass to Premium and Strong combined, up from 9% in the prior. The range has collapsed dramatically. The model now believes this opponent very likely holds a strong hand.

The eight category probabilities compress dramatically from the prior to the posterior after two actions (Figure 1).

[Figure 1: bar chart comparing Prior and Posterior (EP open + 3-bet) probabilities for Prem, Strong, Med, Spec, WkBw, SuGap, TrBw, Garb.]

Figure 1: Prior vs posterior category distribution after early-position open-raise and 3-bet.

Category	Prior α	Weight after EP Open Raise	Weight after 3-Bet	Final Posterior
Premium	3.0	2.85	2.57	37.6%
Strong	6.0	5.10	2.55	37.4%
Medium	10.0	4.00	0.80	11.7%
Speculative	14.0	2.10	0.53	7.7%
Weak Broadway	8.0	1.60	0.16	2.3%
Suited Gappers	10.0	1.00	0.20	2.9%
Trash Broadway	9.0	0.45	0.02	0.3%
Garbage	40.0	0.40	0.004	0.1%

Table 1: Unnormalized weights after each action, and the final posterior. Each weight is the previous weight multiplied by the action likelihood; the Final Posterior column divides the last weights by their sum (about 6.83). Two actions move about 75% of the mass into Premium and Strong and leave Trash Broadway and Garbage below 1% combined.
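The weight columns of Table 1 can be reproduced with a few lines of JavaScript. The per-category likelihoods below are back-derived from the ratios of adjacent weight columns (for example 5.10 / 6.0 = 0.85 for Strong), so they are an assumption about the table's inputs and not a copy of the app's actual likelihood table. Only the Premium and Garbage values are stated in the text.

```javascript
// Prior pseudo-counts from the text.
const ALPHA = [3, 6, 10, 14, 8, 10, 9, 40];

// Hypothetical likelihoods, inferred from the column ratios in Table 1.
const EP_OPEN = [0.95, 0.85, 0.40, 0.15, 0.20, 0.10, 0.05, 0.01];
const THREE_BET = [0.90, 0.50, 0.20, 0.25, 0.10, 0.20, 0.05, 0.01];

// Elementwise product: one unnormalized Bayes update.
const applyAction = (weights, likelihoods) =>
  weights.map((w, i) => w * likelihoods[i]);

// Divide by the total so the weights sum to 1.
const normalize = (weights) => {
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => w / total);
};

const afterOpen = applyAction(ALPHA, EP_OPEN);
const afterThreeBet = applyAction(afterOpen, THREE_BET);
const posterior = normalize(afterThreeBet);

console.log(afterOpen.map((w) => w.toFixed(2)));
console.log(afterThreeBet.map((w) => w.toFixed(3)));
console.log(posterior.map((p) => (p * 100).toFixed(1) + "%"));
```

Under these assumed likelihoods the unnormalized weights after the 3-bet sum to about 6.83, so normalizing gives roughly 37.6% Premium and 37.4% Strong.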

Monte Carlo Equity Estimation

Given the posterior range distribution, equity is estimated by simulation. Each trial samples an opponent hand category from the posterior, samples a specific hand uniformly from within that category (excluding hands blocked by known cards), deals the remaining board cards, evaluates each player's best five-card hand using a full hand ranking implementation, and records a win, split, or loss.
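The first step of each trial, drawing a category from the posterior, is inverse-CDF sampling over eight probabilities. A minimal sketch, assuming the posterior is an array that sums to 1. The name sampleCategory and the injectable rng parameter are illustrative, and the hand sampling, dealing, and ranking steps are omitted.

```javascript
// Draw an index from a discrete distribution by walking the cumulative sum.
// `rng` returns a uniform number in [0, 1); it defaults to Math.random and is
// injectable so the function can be checked deterministically.
function sampleCategory(probs, rng = Math.random) {
  const u = rng();
  let cumulative = 0;
  for (let i = 0; i < probs.length; i++) {
    cumulative += probs[i];
    if (u < cumulative) return i;
  }
  return probs.length - 1; // guard against floating-point shortfall in the sum
}

// Example: empirical frequencies over many draws approach the input probabilities.
const probs = [0.5, 0.3, 0.2];
const counts = [0, 0, 0];
const trials = 10000;
for (let t = 0; t < trials; t++) counts[sampleCategory(probs)]++;
console.log(counts.map((c) => c / trials));
```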

After n = 10,000 trials the equity estimator is:

p̂ = wins / n

This estimator is unbiased for true equity p. Its variance is:

Var(p̂) = p(1 − p) / n
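Both properties follow from treating each trial as an independent Bernoulli(p) indicator. Writing X_t for the win indicator of trial t (notation introduced here):

```latex
\hat{p} = \frac{1}{n}\sum_{t=1}^{n} X_t, \qquad
E[\hat{p}] = \frac{1}{n}\sum_{t=1}^{n} E[X_t] = p, \qquad
\operatorname{Var}(\hat{p}) = \frac{1}{n^2}\sum_{t=1}^{n} \operatorname{Var}(X_t) = \frac{p(1-p)}{n}
```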

By the Central Limit Theorem, p̂ is approximately normal for large n, giving a 95% confidence interval of:

p̂ ± 1.96 × √(p̂(1 − p̂) / n)

At n = 10,000, the 95% confidence interval on equity stays within ±1% for typical values (Figure 2).

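[Figure 2: interval plot labeled 33.4%, p̂ = 34.0%, 34.6%.]

Figure 2: 95% confidence interval on KQo equity vs posterior range (n = 10,000). Plugging p̂ = 0.34 into the formula above gives a half-width of about ±0.93%.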

At n = 10,000 the maximum CI half-width is ±0.98% (reached at p̂ = 0.5), which is precise enough for most real poker decisions.
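The half-width is easy to check numerically. A small sketch of the interval formula above; ciHalfWidth is an illustrative name, not a function from the app.

```javascript
// 95% normal-approximation half-width for a Monte Carlo proportion estimate.
function ciHalfWidth(pHat, n, z = 1.96) {
  return z * Math.sqrt((pHat * (1 - pHat)) / n);
}

const n = 10000;
console.log((ciHalfWidth(0.5, n) * 100).toFixed(2) + "%");  // 0.98%, the worst case
console.log((ciHalfWidth(0.34, n) * 100).toFixed(2) + "%"); // 0.93%
```

The half-width shrinks with √n, so halving it takes four times as many trials.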

The practical importance of the Bayesian update is large. Consider holding KQo against an opponent who has 3-bet from early position. Equity against a random hand: 58%. Equity against the posterior range after the 3-bet: 34%. The pot odds on the call require 33% equity to break even. Without the Bayesian update you call confidently. With it you are right at the indifference point and should probably fold given the uncertainty. Same hand, completely different decision.
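The 33% break-even figure is the standard pot-odds calculation: a call breaks even when equity equals the call amount divided by the final pot. A sketch with hypothetical bet sizes (the write-up does not give the pot and bet for this hand), ignoring split pots and any further betting:

```javascript
// Equity needed for a call to break even: call / (pot before call + call).
function breakEvenEquity(potBeforeCall, callAmount) {
  return callAmount / (potBeforeCall + callAmount);
}

// Expected value of calling, in the same units as the pot:
// win the existing pot with probability `equity`, lose the call otherwise.
function callEV(equity, potBeforeCall, callAmount) {
  return equity * potBeforeCall - (1 - equity) * callAmount;
}

// Hypothetical spot: 100 already in the pot (including the bet), 50 to call.
const pot = 100;
const toCall = 50;
console.log(breakEvenEquity(pot, toCall).toFixed(3)); // 0.333
console.log(callEV(0.58, pot, toCall).toFixed(1));    // 37.0
console.log(callEV(0.34, pot, toCall).toFixed(1));    // 1.0
```

At 34% equity the call is worth about 1 unit in a 150-unit final pot, which is why the text treats it as an indifference point.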

Train mode logs your fold/call/raise decisions across a session and fits a logistic regression to surface systematic leaks in your reasoning (see the Train tab).

Train Mode: Logistic Regression on Decision Data

Train mode generates hand scenarios, runs the Bayesian update and equity simulation automatically, and asks the user to decide: fold, call, or raise. After 100+ hands it fits a logistic regression to the decision data.

Each hand is represented as a feature vector:

x = [equity, pot_odds, position (binary), board_wetness (binary)]

The label is y = 1 if the user made the correct, expected-value-maximizing decision and y = 0 otherwise. The model is trained with gradient descent, minimizing the binary cross-entropy loss:

L(w) = −(1/n) Σᵢ [yᵢ log σ(wᵀxᵢ) + (1 − yᵢ) log(1 − σ(wᵀxᵢ))]

where σ is the sigmoid function:

σ(z) = 1 / (1 + e^(−z))
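Differentiating the loss with the identity σ′(z) = σ(z)(1 − σ(z)) gives the gradient that each descent step uses. The symbol η is notation introduced here for the learning rate:

```latex
\nabla_w L(w) = \frac{1}{n}\sum_{i=1}^{n}\bigl(\sigma(w^\top x_i) - y_i\bigr)\,x_i,
\qquad
w \leftarrow w - \eta\,\nabla_w L(w)
```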

Gradient descent runs for 200 iterations at learning rate 0.1, implemented from scratch in JavaScript.
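A from-scratch version of that training loop fits in a few dozen lines. The sketch below is not the app's code: the function names and the toy data are made up for illustration, and it follows the formulas above literally (no intercept term, no regularization).

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));
const dot = (w, x) => w.reduce((s, wj, j) => s + wj * x[j], 0);

// Mean binary cross-entropy, with probabilities clamped to avoid log(0).
function loss(w, X, y) {
  const eps = 1e-12;
  let total = 0;
  for (let i = 0; i < X.length; i++) {
    const p = Math.min(Math.max(sigmoid(dot(w, X[i])), eps), 1 - eps);
    total += y[i] * Math.log(p) + (1 - y[i]) * Math.log(1 - p);
  }
  return -total / X.length;
}

// Batch gradient descent: w <- w - lr * (1/n) * sum((sigmoid(w.x) - y) * x).
function fitLogistic(X, y, iterations = 200, learningRate = 0.1) {
  let w = new Array(X[0].length).fill(0);
  for (let it = 0; it < iterations; it++) {
    const grad = new Array(w.length).fill(0);
    for (let i = 0; i < X.length; i++) {
      const err = sigmoid(dot(w, X[i])) - y[i];
      for (let j = 0; j < w.length; j++) grad[j] += err * X[i][j];
    }
    w = w.map((wj, j) => wj - (learningRate * grad[j]) / X.length);
  }
  return w;
}

// Toy data: [equity, pot_odds, position, board_wetness]; labels are invented.
const X = [
  [0.62, 0.25, 1, 0], [0.55, 0.33, 0, 1], [0.30, 0.33, 1, 1],
  [0.28, 0.40, 0, 0], [0.70, 0.20, 1, 1], [0.35, 0.45, 0, 1],
];
const y = [1, 1, 0, 0, 1, 0];

const w = fitLogistic(X, y);
const initialLoss = loss(new Array(4).fill(0), X, y);
const finalLoss = loss(w, X, y);
console.log(w.map((v) => v.toFixed(3)));
console.log(initialLoss.toFixed(4), "->", finalLoss.toFixed(4));
```

With real session data, X would hold one row per logged hand and y the correctness labels described above.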

The fitted coefficients (the entries of w, written β below) show where your decisions go wrong. A low β_equity means you are not weighing hand strength enough. A near-zero β_pot_odds means you are not adjusting to the price the pot is offering. Those are common leaks, and the regression puts a number on them.

I ran 100+ hands before submitting. My β_equity was 1.8 (reasonable) and my β_pot_odds was 0.2 (weak). I call too often in spots where the pot odds do not justify it. I did not need a model to suspect this. It is useful to see it confirmed with a number.

Why This Model

The Dirichlet-Multinomial model was chosen over a flat hand-by-hand Bayesian network for two reasons. First, it is computationally tractable in the browser. Second, grouping hands into categories captures the actual structure of how poker players think about ranges. Nobody constructs a prior over all 169 hands independently. They think in categories.

The Monte Carlo approach to equity was chosen over combinatorial enumeration because it handles arbitrary board states and blockers cleanly without case analysis. At 10,000 samples it runs in under a second in JavaScript.

The independence assumption across actions is the main limitation. Correlated action sequences like a limp-reraise carry more information than the product of individual likelihoods suggests. A hidden Markov model over betting sequences would handle this better but would be significantly more complex to implement and explain.

Results

What I notice first in the app: the hand matrix collapsing as you log opponent actions, equity shifting once the range is updated instead of assuming a random hand, and the Train report after 100+ hands with your logistic regression coefficients.

In testing, KQo against an opponent who open-raises from early position and then 3-bets drops from 58% equity (random opponent) to 34% (posterior range). A call that felt fine becomes a fold. The update is not a small tweak. In polarized spots it flips the decision.

Discussion and Future Work

The prior is population-level. It reflects how an average GTO-informed player constructs ranges, not how any specific opponent does. If you have logged enough hands on one player, the next step is to estimate a personalized α vector from their actions.

Train mode uses 100+ samples per session. That is enough to spot big leaks but not enough for tight confidence intervals on the coefficients.

The independence assumption across actions will produce miscalibrated posteriors in some spots. A sequential model over betting actions (each street as a Markov state) would handle that better but is harder to build and explain.

Without a labeled dataset of real hands with known showdowns, it is hard to measure how well calibrated the posteriors are. In testing the posteriors match intuition (a 3-bet from early position really does collapse toward premiums), but intuition is not a confidence interval. Logging hand histories with showdown data would be the way to check accuracy properly.

Usage of Generative AI

I used Claude to help architect the initial structure of the Bayesian update engine and to write a first draft of the gradient descent implementation. I verified, tested, and modified both substantially. The likelihood table values, the hand category structure, and the logistic regression interpretation are my own. The mathematical derivations in this write-up are mine.

References

Piech, C. Probability for Computer Scientists: CS109 Course Reader. Stanford University, 2023.

GTOWizard. Range construction and GTO open-raising frequencies. gtowizard.com

PioSOLVER. GTO poker solver, frequency data. piosolver.com

Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press, 2009.
