Building a Game Boy to Teach an AI Tetris

Sascha



A few months ago I asked myself: do I actually understand how a CPU works, or do I just think I do?

I've been writing software for years. I know what a CPU is, roughly. I've heard the words: registers, opcodes, clock cycles. But if you asked me to explain exactly what happens between the moment a program starts and the first pixel appearing on screen, I'd be hand-waving pretty fast.

So I decided to stop hand-waving.


Why a Game Boy


I wanted something real to dig into. Not a toy CPU invented for a textbook, but an actual piece of hardware with real constraints, real quirks, and real games running on it.

The Game Boy fit. It's a complete computer (CPU, display, audio, input, memory) squeezed into something that runs on two AA batteries. Simple enough that one person can understand the whole thing. Complex enough that it's actually interesting.

It also has extraordinary documentation. The Pan Docs reads like a textbook written by people who actually reverse-engineered the hardware by hand. Every register, every timing quirk, every undocumented behavior is in there.

Here's a rough picture of what you're working with:

  • CPU: Sharp SM83 at ~4.19 MHz. A custom chip, somewhere between a Z80 and an 8080 but not quite either. 8-bit registers, 16-bit address bus, 245 opcodes in the base table plus a 256-opcode extension table reached through the 0xCB prefix.
  • Memory: 64KB address space. 8KB of work RAM, 8KB of video RAM. Most of the rest is wired to the cartridge.
  • PPU: The display chip. Produces a 160×144 image by rendering one scanline at a time, in lockstep with the CPU clock. No framebuffer.
  • Audio: 4 channels (two square waves, a wave channel, a noise generator), all hardware-driven.
  • Cartridge: The game ROM, and sometimes more: extra RAM, a real-time clock, or a Memory Bank Controller that works around the 32KB ROM limit by swapping banks in and out at runtime.

Everything runs in parallel. The CPU doesn't talk to the PPU or the audio chip directly. It reads and writes memory addresses, and the hardware responds. That's the whole model.
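The "lockstep" between the PPU and the CPU clock can be made concrete with a little arithmetic. The figures below (456 clock cycles per scanline, 144 visible scanlines plus 10 lines of vertical blanking) are not stated above; they are the commonly documented timings, and the constant names are only illustrative.

```python
# Rough frame-timing arithmetic, assuming the commonly documented figures.
CPU_HZ = 4_194_304            # the "~4.19 MHz" clock
CYCLES_PER_SCANLINE = 456     # assumption: clock cycles the PPU spends per line
VISIBLE_SCANLINES = 144       # matches the 160x144 display
VBLANK_SCANLINES = 10         # assumption: extra lines while the screen is idle

cycles_per_frame = CYCLES_PER_SCANLINE * (VISIBLE_SCANLINES + VBLANK_SCANLINES)
frames_per_second = CPU_HZ / cycles_per_frame

print(cycles_per_frame)
print(round(frames_per_second, 2))
```

Under those assumptions, an emulator's main loop can run the CPU for one frame's worth of cycles, present the image, and repeat.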

I also had a second reason for choosing this project. I've always wanted to get into reinforcement learning in a hands-on way, and a Game Boy emulator is basically a perfect RL environment out of the box: you have a display, a fixed set of actions (the buttons), and games with clear objectives. Once the emulator runs, you can plug an agent in with very little extra work. The end goal of this series is getting an AI to play Tetris, and building the emulator from scratch is step one.


The Plan


Code:
Phase 1 — Game Boy emulator    ← here
Phase 2 — AI agent playing Tetris



For now, Phase 1.


The Memory Map


The CPU has a 16-bit address bus, so it can address 65,536 bytes, from 0x0000 to 0xFFFF. But most of that space isn't RAM. Different address ranges are physically wired to different pieces of hardware.


Code:
0x0000–0x7FFF   Cartridge ROM
0x8000–0x9FFF   Video RAM (PPU)
0xA000–0xBFFF   External RAM (cartridge)
0xC000–0xDFFF   Work RAM
0xFE00–0xFE9F   OAM (sprite table)
0xFF00–0xFF7F   I/O registers
0xFF80–0xFFFE   High RAM
0xFFFF          Interrupt Enable register



Cartridge ROM (0x0000–0x7FFF): the game itself. The first 16KB (bank 0) is always mapped at 0x0000–0x3FFF. The second 16KB window (0x4000–0x7FFF) can be swapped by the MBC to reach different parts of a larger ROM.
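The bank swapping described here comes down to an address translation. A minimal sketch, assuming 16KB banks and a bank number that some MBC logic has already selected (how that selection happens depends on the MBC type and is not shown; `rom_offset` is a hypothetical helper name):

```python
BANK_SIZE = 0x4000  # 16KB per ROM bank

def rom_offset(addr: int, bank: int) -> int:
    """Translate a CPU address in 0x0000-0x7FFF to an offset into the ROM file."""
    if not 0x0000 <= addr <= 0x7FFF:
        raise ValueError(f"address {addr:#06x} is outside cartridge ROM")
    if addr < BANK_SIZE:
        return addr  # lower window: always bank 0
    # Upper window: the offset within the window, shifted to the selected bank.
    return bank * BANK_SIZE + (addr - BANK_SIZE)
```

With bank 1 selected, the upper window maps straight through, which is why a 32KB game needs no MBC at all.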

Video RAM (0x8000–0x9FFF): where tile graphics and the tile map live. The PPU reads directly from this region to build each scanline. The CPU can write here too, but only when the PPU isn't using it.

External RAM (0xA000–0xBFFF): optional RAM on the cartridge itself. Some games use it for save data. It only exists if the cartridge has the chip; otherwise reads return unreliable values (often 0xFF).

Work RAM (0xC000–0xDFFF): the general-purpose RAM where the game keeps its state. Stack, variables, temporary buffers. The region 0xE000–0xFDFF is an echo of this same memory (a hardware quirk), but in practice almost no game relies on it.
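The echo region is cheap to model: both address ranges index the same backing array. A minimal sketch, where `WorkRam` is a hypothetical class name rather than part of the emulator shown in this post:

```python
class WorkRam:
    """8KB of work RAM, reachable at 0xC000-0xDFFF and echoed at 0xE000-0xFDFF."""

    def __init__(self) -> None:
        self.data = bytearray(0x2000)

    def _index(self, addr: int) -> int:
        if 0xC000 <= addr <= 0xDFFF:
            return addr - 0xC000
        if 0xE000 <= addr <= 0xFDFF:
            return addr - 0xE000  # echo: same bytes, different address
        raise ValueError(f"address {addr:#06x} is not work RAM")

    def read(self, addr: int) -> int:
        return self.data[self._index(addr)]

    def write(self, addr: int, value: int) -> None:
        self.data[self._index(addr)] = value & 0xFF  # keep values to 8 bits
```

A write through either range is visible through the other, because there is only one buffer.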

OAM (0xFE00–0xFE9F): Object Attribute Memory, the sprite table. Each sprite takes 4 bytes (Y position, X position, tile index, flags). The Game Boy can display up to 40 sprites total, but only 10 per scanline.
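To make the 4-byte layout and the 10-per-scanline limit concrete, here is a small sketch. It assumes the entry order Y, X, tile, flags, a stored Y that is offset by 16, and 8-pixel-tall sprites by default; `Sprite`, `parse_oam` and `sprites_on_line` are illustrative names.

```python
from typing import List, NamedTuple

class Sprite(NamedTuple):
    y: int      # stored Y position (assumed to be screen Y + 16)
    x: int      # stored X position
    tile: int   # tile index
    flags: int  # attribute flags

def parse_oam(oam: bytes) -> List[Sprite]:
    """Split the 160-byte sprite table into 40 four-byte entries."""
    if len(oam) != 160:
        raise ValueError("OAM must be exactly 160 bytes")
    return [Sprite(*oam[i:i + 4]) for i in range(0, 160, 4)]

def sprites_on_line(sprites: List[Sprite], line: int,
                    height: int = 8, limit: int = 10) -> List[Sprite]:
    """Return the first `limit` sprites, in OAM order, that cover `line`."""
    visible = []
    for s in sprites:
        top = s.y - 16
        if top <= line < top + height:
            visible.append(s)
            if len(visible) == limit:
                break
    return visible
```

Sprites beyond the tenth on a given line are simply dropped in this sketch, which is the effect the per-scanline limit has on screen.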

I/O registers (0xFF00–0xFF7F): this is where things get interesting. Every piece of hardware exposes its controls here. The joypad, the LCD, the audio channels, the timer, the DMA controller. Writing to 0xFF46 triggers a bulk copy of sprite data into OAM. Bit 7 of 0xFF40 turns the LCD on or off. It's a dense 128 bytes.
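The 0xFF46 write is a good example of a register with a side effect. A minimal sketch, assuming the written byte is the high byte of the source address and that memory is one flat 64KB bytearray (a simplification; the real transfer also takes time, which this ignores):

```python
def oam_dma(memory: bytearray, value: int) -> None:
    """Model a write of `value` to 0xFF46: copy 160 bytes into OAM.

    `memory` is a hypothetical flat 64KB buffer standing in for the whole
    address space. The copy here is instantaneous.
    """
    source = (value & 0xFF) << 8  # e.g. 0xC1 -> 0xC100
    memory[0xFE00:0xFEA0] = memory[source:source + 0xA0]
```

A game would typically keep a sprite buffer in work RAM and trigger this copy once per frame.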

High RAM (0xFF80–0xFFFE): a small fast RAM region, sometimes called HRAM or zero-page. Useful during DMA transfers, when the CPU can't access most of the memory map and HRAM is the only safe place to run code from.

Interrupt Enable (0xFFFF): a single byte. Each bit enables or disables a specific interrupt (VBlank, LCD STAT, Timer, Serial, Joypad). Paired with the IF register at 0xFF0F, which tracks which interrupts are currently pending.
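The pairing of the two registers is a bitwise AND: an interrupt can fire only if its bit is set in both. A sketch, assuming the bit order VBlank (bit 0) through Joypad (bit 4); the function name is illustrative, and the CPU's separate master-enable flag is left out:

```python
# Bit 0 is first in the list; lower bits are serviced first.
INTERRUPTS = ["VBlank", "LCD STAT", "Timer", "Serial", "Joypad"]

def pending_interrupts(ie: int, iflags: int) -> list:
    """Names of interrupts that are both enabled (IE) and requested (IF)."""
    active = ie & iflags & 0x1F  # only the low five bits are meaningful
    return [name for bit, name in enumerate(INTERRUPTS) if active & (1 << bit)]
```

For example, with VBlank and Timer enabled and VBlank, LCD STAT and Timer requested, only VBlank and Timer come back.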

When the CPU writes to 0xFF40, it's not writing to RAM. It's flipping a bit in the PPU's LCD control register. When it reads from 0xFF00, it's checking which buttons are pressed. The address is the interface. No system calls, no drivers, just a read or a write to a specific number.
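As an illustration of "the address is the interface", here is a sketch of what a read from 0xFF00 could compute. It rests on details not covered above, so treat them as assumptions: bits 4 and 5 select the d-pad or the action buttons when written low, a pressed button reads as 0, and the top two bits read as 1. `read_joypad` and the button names are illustrative.

```python
DPAD = ["right", "left", "up", "down"]    # bits 0..3 when bit 4 is low
BUTTONS = ["a", "b", "select", "start"]   # bits 0..3 when bit 5 is low

def read_joypad(select: int, pressed: set) -> int:
    """Value seen at 0xFF00, given the select bits the game last wrote."""
    low = 0x0F  # 1 means "not pressed"
    if not select & 0x10:  # d-pad group selected
        for bit, name in enumerate(DPAD):
            if name in pressed:
                low &= ~(1 << bit)
    if not select & 0x20:  # action-button group selected
        for bit, name in enumerate(BUTTONS):
            if name in pressed:
                low &= ~(1 << bit)
    return 0xC0 | (select & 0x30) | (low & 0x0F)
```

The game writes the select bits, reads the register back, and decodes the low nibble. No driver is involved, only that one address.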

The component that sits in the middle and routes all of this is the MMU. Every single memory access goes through it:


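Code:
class MMU:
    def __init__(self, cartridge):
        self.cartridge = cartridge      # serves 0x0000-0x7FFF
        self.wram = bytearray(0x2000)   # 8KB of work RAM

    def read(self, addr: int) -> int:
        if addr < 0x8000:
            # Cartridge ROM
            return self.cartridge.read(addr)
        elif 0xC000 <= addr <= 0xDFFF:
            # Work RAM, indexed from the start of its region
            return self.wram[addr - 0xC000]
        # ... etc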



This was the first thing that clicked for me when I started building this. The CPU has no idea what's at any given address. It just calls read(addr). Whether that comes back from ROM, from a hardware register, or from the sprite table is entirely the MMU's problem.

That also makes the emulator surprisingly straightforward to structure. Each piece of hardware becomes a Python object. The MMU is a dispatcher. You add components one at a time, wire them up, and the CPU just keeps calling read and write.
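One way to picture the "MMU as dispatcher" idea is a table of address ranges, each pointing at a component with its own read and write. This is a sketch under assumptions, not the structure used in this series: `Ram`, `map` and the 0xFF default for unmapped reads are all illustrative choices.

```python
class Ram:
    """A block of RAM mapped at a base address (hypothetical component)."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.data = bytearray(size)

    def read(self, addr: int) -> int:
        return self.data[addr - self.base]

    def write(self, addr: int, value: int) -> None:
        self.data[addr - self.base] = value & 0xFF


class MMU:
    """Routes each access to whichever component owns the address."""

    def __init__(self) -> None:
        self.regions = []  # list of (start, end, component)

    def map(self, start: int, end: int, component) -> None:
        self.regions.append((start, end, component))

    def _find(self, addr: int):
        for start, end, component in self.regions:
            if start <= addr <= end:
                return component
        return None

    def read(self, addr: int) -> int:
        component = self._find(addr)
        # Assumption: unmapped addresses read as 0xFF.
        return component.read(addr) if component else 0xFF

    def write(self, addr: int, value: int) -> None:
        component = self._find(addr)
        if component:
            component.write(addr, value)  # writes to unmapped space are dropped
```

Adding a new piece of hardware then means writing one class with read and write and registering it for its address range.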


What's Next


The MMU routes everything, but it needs something to route to. The first real component is the cartridge, the ROM file itself.

The first 336 bytes of every Game Boy ROM (0x0000–0x014F) are reserved, and the cartridge header itself sits at 0x0100–0x014F: game title, ROM size, and whether there's a Memory Bank Controller on board. That last part matters because the address space only reserves 32KB for cartridge ROM and most games are bigger than that. The MBC is the workaround, extra circuitry that swaps ROM banks in and out while the game runs.
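As a preview, reading those three fields might look like the sketch below. The offsets are the commonly documented ones and are assumptions as far as this post goes: title at 0x0134–0x0143, cartridge type at 0x0147 (0x00 meaning ROM only), ROM size code at 0x0148 with size = 32KB shifted left by the code. `parse_header` is a hypothetical helper.

```python
def parse_header(rom: bytes) -> dict:
    """Pull a few fields out of a Game Boy ROM header."""
    if len(rom) < 0x150:
        raise ValueError("ROM too short to contain a header")
    # The title is padded with zero bytes; keep only what precedes the first one.
    raw_title = rom[0x0134:0x0144].split(b"\x00")[0]
    cart_type = rom[0x0147]
    rom_size_code = rom[0x0148]
    return {
        "title": raw_title.decode("ascii", errors="replace"),
        "cartridge_type": cart_type,
        "rom_only": cart_type == 0x00,  # no bank switching hardware
        "rom_size_bytes": (32 * 1024) << rom_size_code,
    }
```

For the simplest case mentioned here, the parser should report `rom_only` as true and a ROM size of exactly 32KB.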

Next post: loading a ROM, reading the header, and wiring up the simplest case. No bank switching, ROM only, 32KB.



More in the next post.

 