Pure Minimax on chess is hopeless. The branching factor is roughly 35 (typical number of legal moves per position), so depth-N search is 35^N positions. At depth 6 that's 1.8 billion positions — minutes per move. At depth 8: 2 trillion. Forever. We need two ideas. First: a way to skip provably-irrelevant work (alpha-beta pruning). Second: a way to stop searching at a fixed depth and call the position "good enough" with a heuristic (the evaluation function). Together they make a chess engine practical.

The evaluation function

At a leaf of our search tree (depth 0, or game over), we need a number: how good is this position for the side to move? For chess, the bedrock is material — the relative value of pieces. Pawn = 1, Knight = 3, Bishop = 3.25, Rook = 5, Queen = 9, King = ∞ (untradeable).

int evaluate(const Board& b) {
    // returns score in centipawns from white's perspective.
    int score = 0;
    for (int f = 0; f < 8; f++)
        for (int r = 0; r < 8; r++) {
            auto* p = b.at({f,r});
            if (!p) continue;
            int v = piece_value(p->letter());      // in centipawns: P=100, N=320, B=330, R=500, Q=900, K=20000
            v += pst_bonus(p->letter(), p->color(), {f,r});      // bonus for good squares
            score += (p->color() == Color::White) ? v : -v;
        }
    return score;
}

"Centipawns" — hundredths of a pawn — gives precision without floats. The piece-square table (PST) adds a small bonus for being on a good square: knights on the rim are bad, central pawns are good, the king should be tucked behind pawns in the middlegame and active in the endgame. The PST is a tiny 8×8 table per piece type, a few hundred bytes, and it's responsible for most of the qualitative jump from "random" to "looks like it's playing chess."
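To make the PST concrete, here is a minimal sketch of one such table, the knight's. The values are illustrative, not tuned, and `knight_pst_bonus` is a hypothetical helper; the real engine would keep one table per piece type behind `pst_bonus`.

```cpp
#include <array>

// Hypothetical knight piece-square table, in centipawns, from White's
// point of view (row 0 = White's back rank). Rim squares are penalised,
// central squares rewarded. Values are illustrative, not tuned.
constexpr std::array<std::array<int, 8>, 8> knight_pst = {{
    {{-50,-40,-30,-30,-30,-30,-40,-50}},
    {{-40,-20,  0,  5,  5,  0,-20,-40}},
    {{-30,  0, 10, 15, 15, 10,  0,-30}},
    {{-30,  5, 15, 20, 20, 15,  5,-30}},
    {{-30,  5, 15, 20, 20, 15,  5,-30}},
    {{-30,  0, 10, 15, 15, 10,  0,-30}},
    {{-40,-20,  0,  5,  5,  0,-20,-40}},
    {{-50,-40,-30,-30,-30,-30,-40,-50}},
}};

// Look up the bonus. Black's table is White's, mirrored vertically.
int knight_pst_bonus(bool white, int file, int rank) {
    int r = white ? rank : 7 - rank;
    return knight_pst[r][file];
}
```

Note the symmetry trick: one table serves both colors, since chess's piece-placement heuristics mirror across the board.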

Alpha-beta pruning

The insight: if while exploring a move you discover its score is so bad that the parent (who's a Maximiser) wouldn't pick it anyway, stop searching its remaining children. The score that's already there proves the parent will reject this branch; further detail is wasted work.

Two parameters threaded through the recursion track this: alpha = best score Maximiser is guaranteed so far, beta = best score Minimiser is guaranteed so far. Whenever alpha ≥ beta, the branch is pruned.

int alphabeta(Board& b, int depth, int alpha, int beta) {
    if (depth == 0)
        return (b.side_to_move() == Color::White) ? evaluate(b) : -evaluate(b);   // negamax needs the score from the side to move's viewpoint; evaluate() is white-relative

    auto moves = b.all_legal_moves();
    if (moves.empty()) {
        return b.in_check(b.side_to_move())
            ? -100000 // being mated is terrible
            : 0;           // stalemate
    }

    int best = -200000;
    for (auto& m : moves) {
        auto u = b.make_move(m);
        int score = -alphabeta(b, depth - 1, -beta, -alpha);
        b.unmake_move(m, std::move(u));
        if (score > best) best = score;
        if (best > alpha) alpha = best;
        if (alpha >= beta) break;     // β-cutoff — pruned
    }
    return best;
}

This is negamax, a tidy single-function form of Minimax that exploits chess's symmetry: instead of a Maximiser-vs-Minimiser dichotomy, both sides maximise from their own viewpoint and we negate the score across the recursive call. Same algorithm, fewer special cases.

Alpha-beta is provably equivalent to Minimax — same answer, same move chosen — but with good move ordering the effective branching factor drops from b to roughly √b: the best case visits about b^(d/2) positions instead of b^d, the square root of plain Minimax's tree. For chess that's the difference between depth 4 and depth 8 in the same wall-clock time.

Move ordering: the secret sauce

The pruning is most effective when good moves come first. If you happen to search the best move first, every subsequent move's lower-bound is already high and they all get cut quickly. If you happen to search the worst move first, you do the most work for the least useful information.

Cheap heuristics that order moves before search:

- The move that looked best in an earlier, shallower search of the same position
- Captures, most valuable victim first, least valuable attacker as the tie-break (MVV-LVA)
- Promotions
- Everything else, in any order

Even rough ordering (just "captures first") is enough to get most of the gain.
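A minimal sketch of MVV-LVA (most valuable victim, least valuable attacker: prefer PxQ over QxQ), with an illustrative Move record standing in for the engine's real move type:

```cpp
#include <algorithm>
#include <vector>

// Illustrative move record; the real engine's move type is assumed to
// expose the captured piece (0 for quiet moves) and the moving piece.
struct Move {
    int victim;    // centipawn value of the captured piece, 0 if quiet
    int attacker;  // centipawn value of the moving piece
};

// MVV-LVA: try the most valuable victim first, and among equal victims
// the least valuable attacker. Quiet moves sort last.
void order_moves(std::vector<Move>& moves) {
    std::stable_sort(moves.begin(), moves.end(),
        [](const Move& a, const Move& b) {
            int sa = a.victim ? 10 * a.victim - a.attacker : -1;
            int sb = b.victim ? 10 * b.victim - b.attacker : -1;
            return sa > sb;
        });
}
```

The `10 * victim - attacker` score is one common way to encode the priority; any monotone scheme with the same ranking works.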

Iterative deepening

Don't search to a fixed depth — search to depth 1, then 2, then 3, …, until time runs out. The previous iteration's best move becomes the first move tried in the next one, which makes pruning incredibly effective. And if the search is interrupted (you hit your time limit mid-iteration), you fall back to the last completed depth's best move.

Move best_move(Board& b, int time_ms) {
    Move best{};                       // fallback if even depth 1 is cut short
    auto deadline = now() + ms(time_ms);
    for (int d = 1; now() < deadline; d++) {
        auto m = search_at_depth(b, d, deadline);   // empty if interrupted by the deadline
        if (m) best = *m;              // keep last completed depth's pick
    }
    return best;
}

It looks wasteful — repeating shallow searches before going deeper — but the previous iteration costs roughly 1/35 of the next, so the overhead is <5%, and the move-ordering benefit on the deeper search is enormous.
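The overhead claim is just a geometric series. A quick sanity check, with `id_overhead` as a throwaway helper:

```cpp
// Back-of-envelope check of the iterative-deepening overhead: with
// branching factor b, the combined cost of all shallower iterations
// is a geometric series worth roughly 1/(b-1) of the final one.
double id_overhead(double b, int depth) {
    double iteration_cost = 1.0;       // cost of the deepest search, normalised
    double shallower = 0.0;
    for (int d = depth - 1; d >= 1; d--) {
        iteration_cost /= b;           // each shallower depth is ~b times cheaper
        shallower += iteration_cost;
    }
    return shallower;                  // as a fraction of the deepest search
}
```

For b = 35 and depth 8 this comes out near 1/34 ≈ 3%, consistent with the "<5%" figure above.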

Quiescence: don't stop in the middle of a fight

Stopping at a fixed depth has one nasty failure mode: you stop right before a recapture. The eval at that frontier sees "I just won a knight!" without seeing "and lost my queen on the next ply." This is the horizon effect.

The fix: at depth 0, don't immediately return. Instead, recurse only on captures until the position is "quiet" (no captures available). This bounded extension cleans up tactical noise at the leaves without significantly increasing search cost.
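A sketch of what this looks like in the same negamax style as alphabeta() above. The names capture_moves() (a captures-only move generator) and evaluate_for_side_to_move() (the static eval from the mover's viewpoint) are assumptions about the engine's API:

```
int quiesce(Board& b, int alpha, int beta) {
    // "Stand pat": the side to move can usually decline all captures,
    // so the static eval is a lower bound on this node's value.
    int stand_pat = evaluate_for_side_to_move(b);
    if (stand_pat >= beta) return stand_pat;
    if (stand_pat > alpha) alpha = stand_pat;

    for (auto& m : b.capture_moves()) {         // captures only
        auto u = b.make_move(m);
        int score = -quiesce(b, -beta, -alpha);
        b.unmake_move(m, std::move(u));
        if (score >= beta) return score;        // beta cutoff, as before
        if (score > alpha) alpha = score;
    }
    return alpha;
}
```

Calling quiesce(b, alpha, beta) instead of evaluate(b) at depth 0 in alphabeta() is the whole change. The recursion terminates because each level removes a piece from the board.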

Transposition tables

Chess has many ways to reach the same position (move 1.Nf3 d5 2.d4 ↔ 1.d4 d5 2.Nf3). A transposition table is a hash map from position → previously computed (score, depth, best move). When you reach a position you've seen at sufficient depth, you reuse the answer.
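A minimal sketch of the table itself. Real engines use a fixed-size array indexed by hash bits with a replacement policy; a hash map keeps the idea visible, and all names here are illustrative:

```cpp
#include <cstdint>
#include <unordered_map>

// How the stored score relates to the true value: exact, or only a
// lower/upper bound (because an alpha-beta cutoff ended the search early).
enum class Bound { Exact, Lower, Upper };

struct TTEntry {
    int   score;     // centipawns
    int   depth;     // depth the score was computed at
    Bound bound;
    // Move best;    // the move to try first next time (omitted here)
};

std::unordered_map<std::uint64_t, TTEntry> table;

// Reuse a stored result only if it was searched at least as deep as we
// need now; a shallower entry still supplies a good first move to try.
bool probe(std::uint64_t key, int depth, TTEntry& out) {
    auto it = table.find(key);
    if (it == table.end() || it->second.depth < depth) return false;
    out = it->second;
    return true;
}
```

The bound flag matters because alpha-beta rarely computes exact scores: a pruned node only proves "at least this good" or "at most this good", and the stored entry must say which.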

The hash itself is computed with Zobrist hashing: XOR together a random 64-bit constant for each (piece, square) pair, plus constants for the side to move, the castling rights, and the en-passant file. make_move incrementally XORs out the changed pieces and XORs in the new ones — updating the hash in O(1).
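A sketch of the scheme (names illustrative; castling, en-passant, and capture handling omitted for brevity):

```cpp
#include <cstdint>
#include <random>

// One random 64-bit constant per (piece, square) pair. XOR builds the
// key, and because XOR is its own inverse, make/unmake can update the
// key incrementally in O(1).
constexpr int PIECE_KINDS = 12;              // 6 piece types x 2 colors
std::uint64_t zobrist[PIECE_KINDS][64];
std::uint64_t zobrist_side_to_move;          // XORed in when it's Black's turn

void init_zobrist(std::uint64_t seed = 2024) {
    std::mt19937_64 rng(seed);
    for (int p = 0; p < PIECE_KINDS; p++)
        for (int sq = 0; sq < 64; sq++)
            zobrist[p][sq] = rng();
    zobrist_side_to_move = rng();
}

// Incremental update for a quiet move: XOR the piece out of its old
// square, in on its new square, and flip the side-to-move key.
std::uint64_t apply_quiet_move(std::uint64_t key, int piece, int from, int to) {
    return key ^ zobrist[piece][from] ^ zobrist[piece][to] ^ zobrist_side_to_move;
}
```

Applying a move and then undoing it XORs every constant twice, restoring the original key exactly — which is what makes transpositions collide on purpose: the key depends only on the position, not on the path to it.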

For our hobby engine, a few hundred MB of transposition table is enough to push effective depth from ~6 to ~9, which is a step up in playing strength visible to a casual human.

Putting it together: how strong?

A clean implementation of: alpha-beta + iterative deepening + move ordering (captures first) + quiescence + a small transposition table + a material+PST evaluator. Maybe 800 lines of C++ on top of last week's engine. Search depth ~7-8 ply in a few seconds per move. Approximate Elo: 1800–2000. That beats most casual humans, and it's a straight line of identifiable, well-known techniques. Each one would have made you a chess-engine pioneer in 1965.

Sixty years of chess engineering has been roughly: search deeper, prune more, evaluate better. We did all three this week.

The AlphaZero leap

What we just built is the architecture of every classical chess engine from 1958 (Bernstein's IBM 704) to 2017 (Stockfish 8). In 2017, AlphaZero replaced the hand-crafted evaluation function with a deep neural network trained by self-play, and replaced fixed-depth alpha-beta with Monte Carlo tree search guided by the network. The idea of tree search of legal moves is unchanged. What changed is that the static evaluator went from "P=100, N=320, central pawns are good" to a 40-million-parameter neural network trained to predict the outcome from the position.

Phase 7 talks about why a transformer is, structurally, doing tree search over its own attention head. The pieces all rhyme.

What's next

Phase 5 is over. Three games, three AIs, all running in a terminal. Now they need a face. Phase 6 covers the bridge between C++ and the platform-native UI layers: Swift on Apple, C# on Windows. The same chess engine, the same minimax tree, will drive a proper macOS app and a proper Windows app — without changing a single line of the engine itself.

Week 44 is The Bridge Header — how C++ talks to Swift and C#.

Quick check

1. Alpha-beta pruning is…
  A. An approximation of minimax — sometimes wrong
  B. Provably equivalent to minimax — same answer, just skips provably-irrelevant subtrees
  C. A neural network

Answer: B. Same chosen move, fewer nodes searched. With good move ordering, the effective branching factor drops to about the square root of the original (√35 ≈ 6 for chess), so the tree is roughly the square root of plain minimax's size.

2. Without a quiescence search, fixed-depth alpha-beta exhibits the…
  A. Off-by-one error
  B. Horizon effect — stopping mid-fight and missing the recapture
  C. Rubber-band problem

Answer: B. Quiescence extends the search through captures only, until the position is 'quiet'. Cleans up tactical noise at the leaves.

3. What does a transposition table store?
  A. Random board positions
  B. Previously analysed positions, indexed by Zobrist hash, so the engine doesn't re-search them
  C. The opening book

Answer: B. Many move orders reach the same position. A transposition table memoises the answer and saves enormous work.