Building Minesweeper with Claude CLI Part 1: Planning

I’ve just closed my laptop after a full day of meetings – roadmap reviews, architecture decisions, a particularly spirited debate over the boundaries of our capabilities. The kind of day where you talk little about code and write even less of it.

I’ve spent over twenty-five years in this industry. I started writing code when hard drives were measured in megabytes (my brother yelled at me for consuming a couple of kilobytes on the brand new Seagate hard drive in his 286), when “the web” meant blinking text on Netscape, and when the height of programmer ambition was getting your *.bas program to draw a circle on screen (ah, GW-BASIC). I’ve lived through the dot-com boom and bust, the rise of mobile, the cloud revolution, and the microservices mania. I’ve hired engineers at every level, led teams on three continents, and reviewed more pull requests than I care to remember.

Then, about two years ago, something shifted. The AI claims started being backed up by things that actually worked. Not perfectly. Not magically. But substantively. Enough to make me genuinely curious. I no longer saw LLMs as glorified typeahead on steroids.

So, I decided to run an experiment. Not a benchmark, not a vendor evaluation – a real experiment, the kind where you sit down with a blank directory and an open mind and see what happens.

The task: build a web-based Minesweeper game from scratch, using Claude Code as my primary thinking partner. The rules I set for myself: apply real software engineering discipline (SOLID principles, clean code practices, test-driven development), don’t hand-hold, and don’t edit Claude’s output to make it look better in the blog. Whatever comes out, comes out.

Minesweeper felt like the perfect test subject. It’s small enough to finish in a sitting, yet rich enough to demand real decisions: game state management, a recursive-style algorithm, component boundaries, timing behaviour, and the subtle UX detail that the first click must never kill you.

This first post is the planning session. Part 2 is the build. Let’s start.


Starting the session

$ claude

This single command still gives me a small pause every time I type it. I started my career on the command prompt. After years of fancy IDEs with all their bells and whistles, we’re back to square one – a terminal, historically the most solitary of developer tools. I’m also old enough to remember when “pair programming” required two chairs and a shared keyboard. Now, one of the pair is a language model running in someone’s data center.

Claude Code drops you into an interactive chat that can read, write, and run shell commands inside your project directory. I started with a blank directory and a single broad question. No scaffolding, no hints. Just the question I’d ask one of my engineering leads on their first day.


Prompt 1 – Kick off the design discussion

> I want to build a web-based Minesweeper game in ReactJS.
  Before we write any code, let's plan the design.
  Apply SOLID design principles and assume we'll want high unit-test coverage.
  Give me a component breakdown, a data model, and the high-level flow.

Claude:

Great starting point. Here is how I would structure it.

Data model
Every cell on the board is a plain object:
{
  isMine: boolean,        // true if this cell hides a mine
  isRevealed: boolean,    // true once the player has opened it
  isFlagged: boolean,     // true when the player has marked it with 🚩
  adjacentMines: number,  // count of mines in the 8 surrounding cells
}

The board itself is a 2-D array: Cell[][].

Game status is one of four literals: idle (before the first click), playing, won, or lost.

Component tree
App
├── GameControls – difficulty selector + "New Game" button
├── GameStatus – mine counter · emoji face · elapsed timer
└── GameBoard
    └── Cell × N – one per board position

Layer separation (SRP / DIP)
constants/gameConfig.js ← pure config, no logic
engine/boardFactory.js ← pure functions: create, seed, count adjacency
engine/gameEngine.js ← pure functions: reveal, flag, win-check
hooks/useGameState.js ← React state, timer, orchestration
components/ ← pure UI — no game logic

Game flow
IDLE ──first click──► PLAYING ──all safe cells revealed──► WON
                      └──mine clicked──────────────────────► LOST

I’ve conducted somewhere north of three hundred technical interviews in my career. One of my favorite questions is deceptively simple. It helps me understand how the candidate thinks through problems. The purpose is not to test their knowledge of the subject but to test their instinct for layering.

The good answers start with separation of concerns. Game logic belongs somewhere that knows nothing about rendering. State coordination belongs somewhere that knows nothing about game rules. UI components belong somewhere that knows nothing about where the state came from. Getting that instinct takes years for many software engineers. Some never get it at all.

Claude got it on the first prompt, unprompted.

The four-layer model it produced (constants, engine, hook, components) is exactly what I’d draw on a whiteboard when coaching a mid-level engineer on how to think about separation of concerns and layering. The decision to split boardFactory from gameEngine is the right approach: one owns the construction of game state, the other owns the transformation of game state. Separate responsibilities, separately testable. That’s not obvious. That’s the kind of thing that comes from experience with code that became unmaintainable because those two concerns were tightly coupled.

I made a note: the four-state game status (idle / playing / won / lost) is the right model. Not three states, not a boolean. idle is important — it’s the state where mines haven’t been placed yet, which enables the safe-first-click guarantee we’ll address shortly.
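
To make the data model concrete, here is a minimal sketch of the status literals and the two smallest factory functions. The names GAME_STATUS, createCell and createEmptyBoard come from the plan; the bodies are an illustrative assumption, not output from the session, and Part 2 may well look different.

```javascript
// Status literals from the plan; the frozen-object shape is an assumption.
const GAME_STATUS = Object.freeze({
  IDLE: 'idle',       // before the first click, no mines placed yet
  PLAYING: 'playing',
  WON: 'won',
  LOST: 'lost',
});

// One cell, with the four fields from the data model.
function createCell() {
  return { isMine: false, isRevealed: false, isFlagged: false, adjacentMines: 0 };
}

// A rows x cols board of fresh cells; every cell is a distinct object.
function createEmptyBoard(rows, cols) {
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => createCell())
  );
}

const board = createEmptyBoard(9, 9);
console.log(board.length, board[0].length); // 9 9
```

Building each row with Array.from and a factory callback matters: filling an array with one shared row or one shared cell object would make every position point at the same object, and a single reveal would appear to open the whole board.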


Prompt 2 – Validate SOLID compliance

> Walk me through how each SOLID principle is satisfied in this design.

Claude:

Single Responsibility. Each module owns exactly one concern:

- boardFactory creates and initialises board state.
- gameEngine enforces game rules (reveal, flag, win detection).
- useGameState wires the two engines to React state and the timer.
- Each component renders one UI region and nothing more.

Open/Closed. Difficulty levels live in constants/gameConfig.js as a plain object:

export const DIFFICULTIES = {
  BEGINNER:     { rows: 9,  cols: 9,  mines: 10, label: 'Beginner' },
  INTERMEDIATE: { rows: 16, cols: 16, mines: 40, label: 'Intermediate' },
  EXPERT:       { rows: 16, cols: 30, mines: 99, label: 'Expert' },
};

Adding a new difficulty requires touching only this config — zero changes to components or engine code.

Liskov Substitution. Both engine modules expose consistent pure-function signatures: (board, ...args) → board. Any function that accepts one can be replaced with another without breaking callers.

Interface Segregation. Each component receives only the props it needs. Cell never sees minesLeft; GameStatus never sees the board array.

Dependency Inversion. UI components depend on the hook abstraction (useGameState), not on engine internals. Engine functions depend on plain JavaScript arrays, not on React or any framework.

Here’s where I had to stop and take a breath.

SOLID gets talked about in job descriptions and conference talks and senior engineering forums with great regularity, but is applied far less. The Open/Closed Principle in particular is one that’s usually described correctly and implemented poorly – developers know they should “close for modification,” but when a new requirement comes in, they often find themselves adding another if statement in the middle of existing code instead of reaching for a configuration object or a strategy pattern.

What Claude produced here is a textbook application of OCP: difficulty lives in config, not in conditional logic. If someone wants to add a “Nightmare” mode with a 30×50 grid, they add one object entry. Not one object entry plus one class, plus a factory, plus a registration mechanism. Just the entry. Right-sized abstraction for a right-sized problem.
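
A small sketch of what that looks like in practice. The three presets are from Claude's config; the NIGHTMARE entry, its mine count, and the difficultyOptions helper are hypothetical, added only to show that a consumer which iterates over the config picks up a new mode without being touched.

```javascript
// Presets from the plan plus one hypothetical addition.
const DIFFICULTIES = {
  BEGINNER: { rows: 9, cols: 9, mines: 10, label: 'Beginner' },
  INTERMEDIATE: { rows: 16, cols: 16, mines: 40, label: 'Intermediate' },
  EXPERT: { rows: 16, cols: 30, mines: 99, label: 'Expert' },
  // The only change needed for a new mode. The mine count is made up.
  NIGHTMARE: { rows: 30, cols: 50, mines: 400, label: 'Nightmare' },
};

// Hypothetical helper: the list a difficulty selector could render from.
function difficultyOptions(difficulties) {
  return Object.entries(difficulties).map(([key, { label, rows, cols }]) => ({
    value: key,
    text: `${label} (${rows}×${cols})`,
  }));
}

console.log(difficultyOptions(DIFFICULTIES).map(option => option.text));
// [ 'Beginner (9×9)', 'Intermediate (16×16)', 'Expert (16×30)', 'Nightmare (30×50)' ]
```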

The Interface Segregation point is subtle and correct. Cell doesn’t need to know how many mines are left; it only needs to know about its own state. GameStatus doesn’t need the board array; it just needs a number and a status string. Giving components exactly what they need and nothing more is one of those things that seems obvious in retrospect and gets violated constantly in practice — especially as components accrete requirements over time and the prop lists balloon.

I’ve reviewed production React codebases where a component three levels deep in the tree was receiving fifteen props, twelve of which it was immediately passing to its children unchanged. That’s not a component, that’s a hallway. The interface segregation principle is what prevents hallways.


Prompt 3 – First-click safety

> Minesweeper traditionally guarantees the first click is always safe.
  How should we handle that in a way that's still testable?

Claude:

Keep mine placement deferred — the board starts empty and mines are seeded only on the first click, with the clicked cell (and its 3×3 neighborhood) excluded from candidate positions.

IDLE board: no mines, all cells hidden

first click at (r, c)

placeMines(board, count, r, c)
│ — builds a safe-zone Set around (r, c)
│ — randomly assigns mines outside the safe zone
│ — recalculates adjacency counts

PLAYING board: mines present, first cell revealed

Because placeMines is a pure function it's trivially testable:

// Test: no mine lands inside the 3×3 safe zone
const board = placeMines(createEmptyBoard(9, 9), 10, 4, 4);
for (let dr = -1; dr <= 1; dr++)
  for (let dc = -1; dc <= 1; dc++)
    expect(board[4 + dr][4 + dc].isMine).toBe(false);

The first-click safety guarantee is one of my favorite small details in game design. It’s the kind of thing that most players never consciously notice but immediately feel when it’s absent – that moment of clicking a blank area of the board and instantly dying, before you’ve even had a chance to think, is genuinely demoralizing. The original Windows Minesweeper in its early versions didn’t guarantee this. Versions of the game that do feel noticeably fairer.

Recognizing that this is a UX detail worth preserving – and then immediately framing its solution in terms of testability – is the response of someone who thought about both the user experience and the engineering simultaneously. The deferred mine placement idea is elegant: the board exists in an IDLE state with no mines at all, and only commits to a mine layout on the first click, with the clicked cell and its eight neighbors guaranteed safe. You can’t lose on your first move. The universe is just.

I particularly liked the instinct to make placeMines a pure function. Functions that depend only on their inputs and produce no side effects are the easiest things in all of software to test. You call them with known inputs, you check the outputs, and you’re done. No mocks, no setup, no teardown, no waiting for timers. Pure functions are the closest thing software engineering has to a free lunch.
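
Here is one way placeMines could be written as a pure function. This is a sketch under assumptions, not the implementation from the session: the signature placeMines(board, count, r, c) is Claude's, while the optional rng parameter, the partial shuffle, and the RangeError are my additions for illustration.

```javascript
// Minimal empty board, matching the cell shape in the data model.
function createEmptyBoard(rows, cols) {
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => ({
      isMine: false, isRevealed: false, isFlagged: false, adjacentMines: 0,
    }))
  );
}

// Returns a NEW board with `count` mines placed outside the 3x3 zone around
// (safeRow, safeCol). `rng` is an assumed extra parameter for testability.
function placeMines(board, count, safeRow, safeCol, rng = Math.random) {
  const rows = board.length;
  const cols = board[0].length;

  // Every position outside the safe zone is a candidate.
  const candidates = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      if (Math.abs(r - safeRow) > 1 || Math.abs(c - safeCol) > 1) {
        candidates.push([r, c]);
      }
    }
  }
  if (count > candidates.length) {
    throw new RangeError('Too many mines for this board');
  }

  // Partial Fisher-Yates shuffle: the first `count` entries become mines.
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(rng() * (candidates.length - i));
    [candidates[i], candidates[j]] = [candidates[j], candidates[i]];
  }
  const mines = new Set(
    candidates.slice(0, count).map(([r, c]) => r * cols + c)
  );
  const isMine = (r, c) =>
    r >= 0 && r < rows && c >= 0 && c < cols && mines.has(r * cols + c);

  // Rebuild every cell with its mine flag and adjacency count.
  return board.map((row, r) =>
    row.map((cell, c) => {
      let adjacentMines = 0;
      for (let dr = -1; dr <= 1; dr++) {
        for (let dc = -1; dc <= 1; dc++) {
          if ((dr !== 0 || dc !== 0) && isMine(r + dr, c + dc)) adjacentMines++;
        }
      }
      return { ...cell, isMine: isMine(r, c), adjacentMines };
    })
  );
}

const seeded = placeMines(createEmptyBoard(9, 9), 10, 4, 4);
console.log(seeded.flat().filter(cell => cell.isMine).length); // 10
```

Passing a stub such as () => 0 as rng makes the layout fully deterministic, which is one alternative to spying on Math.random in tests.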


Prompt 4 – Flood-fill reveal

> Classic Minesweeper auto-reveals empty regions when you click an empty cell.
  What algorithm should we use and how do we keep it pure and testable?

Claude:

Breadth-first search (BFS) — it is iterative, avoids call-stack overflow on large boards, and processes cells in the natural order the player would expect (expanding outward).

revealCell(board, row, col):
  1. Clone the board (immutability).
  2. Mark (row, col) as revealed.
  3. If adjacentMines === 0: push (row, col) onto a queue.
  4. While queue is not empty:
       dequeue (r, c)
       for each of the 8 neighbours (nr, nc):
         if in-bounds AND not revealed AND not flagged AND not a mine:
           reveal (nr, nc)
           if (nr, nc).adjacentMines === 0: enqueue (nr, nc)
  5. Return the new board.

Purity is preserved because we clone before mutating. Testing edge cases:
| Scenario                        | Expected outcome                             |
|---------------------------------|----------------------------------------------|
| Click a numbered cell           | Only that cell is revealed                   |
| Click an empty cell in a corner | All reachable empty + their numbered borders |
| Click next to the only mine     | One cell revealed, mine untouched            |
| Flagged cell in flood path      | Flood stops at flag                          |

Very early in my career, I implemented Minesweeper as a Java applet, a weekend project built on Java AWT. I used recursion for the flood fill. It worked beautifully on small boards and crashed spectacularly on large ones – a StackOverflowError that was easy to diagnose given my naive approach. I had to rewrite it iteratively – a mistake I learned to avoid in future.

BFS versus DFS versus recursion for this problem is exactly the kind of question that comes up in algorithm discussions, and the reasoning matters: recursion is elegant but stack-constrained. DFS works but reveals cells in an order that doesn’t match the player’s spatial intuition. BFS expands outward, which is how a person would naturally imagine the reveal propagating. It’s the right answer for the right reason — not just correctness, but user-perceptible quality.

What caught my eye here is that table of edge cases. That’s not documentation – that’s a test specification. Someone reading that table and not immediately thinking “I should write a test for each of these rows” isn’t thinking like a software engineer. Someone who produces that table unbidden is already thinking like a thorough one. I’ve reviewed PRs where the reviewer had to argue for every edge case test. Here they appeared without asking.
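
Claude's pseudocode translates almost line for line into JavaScript. The sketch below is my reading of it, not code from the session: the early return for already revealed or flagged cells and the boardFromLayout test helper are assumptions added for illustration.

```javascript
const NEIGHBOURS = [
  [-1, -1], [-1, 0], [-1, 1],
  [0, -1],           [0, 1],
  [1, -1],  [1, 0],  [1, 1],
];

// Hypothetical test helper: builds a board from strings where '*' is a mine.
function boardFromLayout(layout) {
  const mineAt = (r, c) => layout[r]?.[c] === '*';
  return layout.map((line, r) =>
    [...line].map((ch, c) => ({
      isMine: ch === '*',
      isRevealed: false,
      isFlagged: false,
      adjacentMines: NEIGHBOURS.filter(([dr, dc]) => mineAt(r + dr, c + dc)).length,
    }))
  );
}

// Iterative BFS reveal. Returns a new board; the input is never mutated.
function revealCell(board, row, col) {
  const start = board[row][col];
  if (start.isRevealed || start.isFlagged) return board; // assumed no-op

  // Step 1: clone rows and cells so mutation stays local to this call.
  const next = board.map(cells => cells.map(cell => ({ ...cell })));
  next[row][col].isRevealed = true;
  if (next[row][col].isMine || next[row][col].adjacentMines !== 0) return next;

  // Steps 3-4: array used as a FIFO queue, read position tracked by `head`.
  const queue = [[row, col]];
  for (let head = 0; head < queue.length; head++) {
    const [r, c] = queue[head];
    for (const [dr, dc] of NEIGHBOURS) {
      const nr = r + dr;
      const nc = c + dc;
      const cell = next[nr]?.[nc]; // undefined when out of bounds
      if (!cell || cell.isRevealed || cell.isFlagged || cell.isMine) continue;
      cell.isRevealed = true;
      if (cell.adjacentMines === 0) queue.push([nr, nc]);
    }
  }
  return next;
}

const demo = boardFromLayout(['...', '...', '..*']);
const opened = revealCell(demo, 0, 0);
console.log(opened.flat().filter(cell => cell.isRevealed).length); // 8
```

Each row of the edge-case table maps onto a one-line layout like the one above, which is what makes the table usable as a test specification.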


Prompt 5 – Timer design

> The status bar needs an elapsed-seconds timer.
  Where should that live and how do we avoid timer leaks?

Claude:

The timer is a game-state concern, not a UI concern — it lives in useGameState.

const timerRef = useRef(null);

useEffect(() => {
  if (gameStatus === GAME_STATUS.PLAYING) {
    timerRef.current = setInterval(() => setTime(t => t + 1), 1000);
  } else {
    clearInterval(timerRef.current);
  }
  return () => clearInterval(timerRef.current);   // cleanup on unmount
}, [gameStatus]);


Rules:
- Timer starts only on transition to PLAYING (first click).
- Timer stops automatically on WON or LOST.
- The useEffect cleanup prevents leaks when the component unmounts mid-game.
- GameStatus receives time as a prop and is a pure presentational component — it cannot influence the timer.

The instinct to put the timer in the component is almost universal among junior developers, and even some senior ones, who are still building their mental model of ReactJS. It feels natural – the timer is displayed in GameStatus, so surely the timer logic belongs there. This is wrong, and wrong in a subtle way that only experience teaches.

The timer is part of the game’s state machine, not part of a display widget. It starts when the game starts. It stops when the game ends. Those decisions are game-rule decisions, not UI decisions. If you put the timer in the component, you’ve created a hidden dependency between the display layer and the game logic layer – and you’ve made GameStatus impossible to test in isolation, because now it has side effects.

Separating it into useGameState and passing time down as a prop means GameStatus becomes trivially testable: render it with time={42} and check that it displays 042. No fake timers needed. No act(() => jest.advanceTimersByTime(...)) ceremony. Just render and assert.
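
With the timer out of the way, the only logic left in GameStatus is formatting. A sketch of that piece, assuming a hypothetical formatCounter helper and assuming the display clamps to three digits:

```javascript
// Hypothetical helper: three-digit display, clamped to the 0..999 range.
// The clamping behaviour is an assumption, not something the plan specifies.
function formatCounter(value) {
  const clamped = Math.min(999, Math.max(0, Math.trunc(value)));
  return String(clamped).padStart(3, '0');
}

console.log(formatCounter(42)); // "042"
```

A function like this can be asserted directly, which is exactly the "render and assert" style of test the prop-based design makes possible.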

This is a pattern I’ve had to explain to some experienced engineers. It’s not intuitive until you’ve been burned by the alternative.


Prompt 6 – Test strategy

> What's the right test strategy to hit high coverage?
  Which parts need the most thorough testing?

Claude:

| Layer          | Test type                      | Priority                              |
|----------------|--------------------------------|---------------------------------------|
| `boardFactory` | Unit (pure functions) | **Critical** — seeds the entire game |
| `gameEngine` | Unit (pure functions) | **Critical** — all rules live here |
| `useGameState` | Hook unit with `renderHook` | **High** — state transitions |
| `Cell` | Component (render+interaction) | **High** — most interactive unit |
| `GameBoard` | Component | **Medium** — layout + delegation |
| `GameControls` | Component | **Medium** — button wiring |
| `GameStatus` | Component | **Medium** — display formatting |
| `App` | Integration | **Low** — smoke + wiring |

Key scenarios to cover in boardFactory:
- Exact mine count after placeMines.
- No mine inside the 3×3 safe zone.
- Correct adjacency counts for known board layouts.

Key scenarios to cover in gameEngine:
- Reveal: single cell, flood-fill, skips flags, skips revealed.
- toggleFlag: add/remove, skips revealed cells.
- revealAllMines: exposes every mine, leaves non-mines hidden.
- checkWinCondition: true only when every non-mine is revealed.

Hook tests will use jest.spyOn to control mine placement deterministically.

I’ve worked in organizations where test coverage was treated as a compliance checkbox – get to 80% line coverage, tick the box, move on. That approach gives you a lot of tests that don’t actually tell you anything meaningful, and misses the critical cases that would have caught the production incident at 2 AM – my wife still recalls the time I was woken up by production alert calls on my phone.

The priority ordering here reflects actual engineering judgment. The engine modules are the most critical because they contain the most logic and are the most independent. If checkWinCondition has a bug, the entire win detection is broken – and that bug will be invisible to every component test because the components are just rendering what they’re given. Testing logic at the lowest, most isolated level is how you find the real bugs.
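
To show how little code that rule needs, and how directly it can be tested, here is a sketch of checkWinCondition as Claude described it: true only when every non-mine cell is revealed. The body and the small cell helper are my assumptions.

```javascript
// A board is won when every cell is either a mine or already revealed.
function checkWinCondition(board) {
  return board.every(row => row.every(cell => cell.isMine || cell.isRevealed));
}

// Hypothetical helper to keep the example boards short.
const cell = (isMine, isRevealed) => ({
  isMine, isRevealed, isFlagged: false, adjacentMines: 0,
});

const wonBoard = [[cell(true, false), cell(false, true)]];
const openBoard = [[cell(true, false), cell(false, false)]];

console.log(checkWinCondition(wonBoard), checkWinCondition(openBoard)); // true false
```

Note that flags play no part in this version: a player wins by opening every safe cell, whether or not the mines are flagged.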

The note about jest.spyOn to control placeMines is also sharp. Math.random() is the enemy of deterministic tests. A test that “usually passes” because it relies on a random mine not landing on a particular cell is not a test — it’s a probability calculation. Injecting a known board via a spy means the test documents exactly what scenario it covers.
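
Stripped of the framework, this is what a spy on Math.random buys you. The sketch below is a framework-free illustration with hypothetical helper names (withFixedRandom, pickCellIndex); in a Jest suite, jest.spyOn(Math, 'random') with a mocked return value plays the same role.

```javascript
// Hypothetical test helper: run `fn` with Math.random replaced by a fixed
// sequence of values, then restore the original.
function withFixedRandom(values, fn) {
  const original = Math.random;
  let calls = 0;
  Math.random = () => values[calls++ % values.length];
  try {
    return fn();
  } finally {
    Math.random = original; // always restore, even if fn throws
  }
}

// Stand-in for any code that picks a cell at random.
function pickCellIndex(cellCount) {
  return Math.floor(Math.random() * cellCount);
}

const picks = withFixedRandom([0, 0.5, 0.99], () => [
  pickCellIndex(81),
  pickCellIndex(81),
  pickCellIndex(81),
]);
console.log(picks); // [ 0, 40, 80 ]
```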


Prompt 7 – Final architecture sign-off

> Summarise the file tree we're committing to before we start coding.

Claude:

game/
├── package.json
├── public/
│   └── index.html
└── src/
    ├── index.jsx
    ├── App.jsx
    ├── App.css
    ├── App.test.jsx
    ├── setupTests.js
    ├── index.css
    │
    ├── constants/
    │   └── gameConfig.js          ← DIFFICULTIES, GAME_STATUS, colours
    │
    ├── engine/
    │   ├── boardFactory.js        ← createCell, createEmptyBoard,
    │   ├── boardFactory.test.js     placeMines, calculateAdjacentMines
    │   ├── gameEngine.js          ← revealCell, toggleFlag,
    │   └── gameEngine.test.js       revealAllMines, checkWinCondition
    │
    ├── hooks/
    │   ├── useGameState.js        ← all game state + timer
    │   └── useGameState.test.js
    │
    └── components/
        ├── Cell/
        │   ├── Cell.jsx
        │   ├── Cell.css
        │   └── Cell.test.jsx
        ├── GameBoard/
        │   ├── GameBoard.jsx
        │   ├── GameBoard.css
        │   └── GameBoard.test.jsx
        ├── GameControls/
        │   ├── GameControls.jsx
        │   ├── GameControls.css
        │   └── GameControls.test.jsx
        └── GameStatus/
            ├── GameStatus.jsx
            ├── GameStatus.css
            └── GameStatus.test.jsx

All engine code is pure JavaScript with zero React imports — it can be ported to any framework without modification. Every public function is exported for individual testing.

I printed this file tree out. Not because I needed to, but because I wanted to look at it the way I’d look at something one of my engineers had produced in a system design session.

It’s clean. The separation of engine from hooks from components is consistent and principled. The naming is clear without being verbose. There are no mystery folders, no utils/ catch-all, no helpers/ where things go to be forgotten. Both are classic symptoms of a god object, a well-known software design antipattern.

Had someone shown it to me without mentioning how it was produced, my response would have been the same – “That’s a well-structured ReactJS project. Who wrote it?”


What we decided

| Decision | Rationale |
|--------------------------------|------------------------------------------------------------|
| Deferred mine placement | Guarantees safe first click; keeps seeding logic pure and testable |
| Immutable board updates | All engine functions return new arrays — easy to test, easy to debug |
| BFS flood-fill | Iterative, no stack overflow risk on expert (16×30) boards |
| Timer in hook, not component | Keeps `GameStatus` a pure display component with no side effects |
| Config-driven difficulties | Adding a new preset requires one object entry, zero code changes |
| `React.memo` on Cell/GameBoard | Avoids re-rendering 480 cells every time the timer ticks |
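
The "immutable board updates" row is worth one concrete illustration. This is a sketch of how toggleFlag might return a new board while reusing everything it did not touch; the body is my assumption, built only from the behaviour listed in the test plan (add/remove, skips revealed cells).

```javascript
// Returns a new board with one cell's flag flipped. Untouched rows and
// cells are reused by reference rather than copied.
function toggleFlag(board, row, col) {
  if (board[row][col].isRevealed) return board; // revealed cells cannot be flagged
  return board.map((cells, r) =>
    r !== row
      ? cells
      : cells.map((cell, c) =>
          c !== col ? cell : { ...cell, isFlagged: !cell.isFlagged }
        )
  );
}

const hidden = () => ({
  isMine: false, isRevealed: false, isFlagged: false, adjacentMines: 0,
});
const before = [[hidden(), hidden()], [hidden(), hidden()]];
const after = toggleFlag(before, 0, 1);

console.log(after[0][1].isFlagged, before[0][1].isFlagged); // true false
console.log(after[1] === before[1]); // true
```

That reference equality on untouched cells is also what gives a memoized Cell component a cheap way to tell that nothing changed.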

Every one of these decisions could be argued differently. You could make a case for mutable board updates and gain some performance on large grids. You could put the timer in the component and argue it’s simpler. You could hardcode the difficulties and save a layer of indirection.

What strikes me is that Claude’s reasoning for each decision was defensible and internally consistent. It wasn’t producing decisions that sounded right but were subtly flawed under scrutiny – it was reasoning from principles and reaching conclusions that would survive a design review. That’s the bar I’d set for my lead engineers. It cleared it without breaking a sweat.


Up next

The planning took roughly thirty minutes of conversation – a luxury I rarely give myself in day-to-day engineering work, where the pressure to ship often compresses “thinking time” into a paragraph of comments in a Teams channel. But the planning session was, I think, the most important part of this experiment. Every implementation decision in Part 2 traces back to something agreed here.

In Part 2 I’ll have Claude implement every file in this plan and run the full test suite.
