Data Compression

Fitting more into less

Compression techniques allowed game developers to fit larger games into limited ROM and RAM by encoding data more efficiently, from simple RLE to sophisticated algorithms.

Platforms: Nintendo Entertainment System, Super Nintendo, Sega Mega Drive, Commodore 64, Commodore Amiga. Tags: data, memory, optimisation. Era: 1970–present.

Overview

Data compression makes data smaller for storage, then restores it when needed. For game developers working with kilobytes of ROM, compression wasn't optional — it was essential. Level data, graphics, music, and text all benefited. The techniques ranged from simple run-length encoding to sophisticated dictionary algorithms that squeezed every possible byte. The tradeoff was always the same: less storage, more decompression time and code.

Fast Facts

| Aspect | Detail |
| --- | --- |
| Purpose | Fit more data in limited space |
| Trade-off | Storage space vs decompression time + decoder size |
| Common targets | Graphics, level data, text, music |
| Modern relevance | Download sizes, streaming, GPU texture formats |

Why compression mattered

| Platform | Typical ROM or disk size | Practical ceiling |
| --- | --- | --- |
| NES | 32 KB-1 MB | ROM cost capped many games at 256 KB-512 KB |
| SNES | 1-4 MB | Largest commercial cart was 6 MB (Tales of Phantasia, Star Ocean) |
| Mega Drive | 1-4 MB | Phantasy Star IV shipped 24 Mbit (3 MB) |
| Game Boy | 32 KB-1 MB | Pokémon Gold/Silver: 16 Mbit = 2 MB |
| C64 | 170 KB per disk side | Multi-disk titles: 4-6 disks for big games |
| Amiga | 880 KB per DD floppy | Multi-disk titles: 6-12 disks routine for late-era games |

Every byte saved meant more content. Final Fantasy VI (3 MB SNES) compressed nearly 6 MB of script and graphics into its cartridge.

Common algorithms

Run-Length Encoding (RLE)

Replace repeated values with (count, value) pairs:

| Original | Compressed | Saved |
| --- | --- | --- |
| AAAAAABBCCC | 6A 2B 3C | 5 bytes |

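Simple, fast, and effective for graphics with large solid areas. Data with few runs can grow instead of shrink, because every byte becomes a (count, value) pair. A decoder fits in roughly 30 bytes of 6502 code. See Run-Length Encoding for the deep dive.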
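
As a minimal sketch of the (count, value) scheme described above, the following encoder and decoder work on byte strings. The function names and the 255-byte run cap (so that a count fits in one byte) are assumptions made for illustration, not a format taken from any particular game.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte matches and the count fits a byte.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        count, value = packed[i], packed[i + 1]
        out += bytes([value]) * count
    return bytes(out)


packed = rle_encode(b"AAAAAABBCCC")
print(len(packed))          # 3 pairs, so 6 bytes for an 11-byte input
print(rle_decode(packed))   # the original bytes again
```

Real formats usually add an escape for literal stretches so that run-free data does not double in size; that refinement is left out here to keep the sketch short.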

LZ77 / LZSS / LZ-family

Reference earlier data instead of repeating it. Each token is either a literal byte or a back-reference (offset, length) into recently-decoded data:

Input:    ABCABCABCXYZ
Encoded:  A B C (offset=3, length=6) X Y Z

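Six literal bytes plus one back-reference reproduce the 12-byte input. If the back-reference is stored in two bytes, the encoded form takes 8 bytes, a saving of about a third on this small example. LZ77 is the foundation of DEFLATE (the format behind gzip, zip, and zlib) and of most general-purpose compressors.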
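
The token stream above can be sketched in a few lines. This is an illustrative model only: tokens are kept as a Python list (an int for a literal, an `(offset, length)` tuple for a back-reference) rather than packed into bits, and the names `lz_decode`, `lz_encode`, `window`, and `min_match` are assumptions, not part of any real format.

```python
def lz_decode(tokens):
    """Decode literals (ints) and (offset, length) back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            # Copy one byte at a time: the source may overlap the bytes
            # being written, which is how a short pattern repeats itself.
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(tok)
    return bytes(out)


def lz_encode(data, window=255, min_match=3):
    """Greedy encoder: use the longest match in the window, else a literal."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for off in range(1, min(i, window) + 1):
            length = 0
            while (i + length < len(data)
                   and data[i + length - off] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, off
        if best_len >= min_match:
            tokens.append((best_off, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


tokens = lz_encode(b"ABCABCABCXYZ")
print(tokens)             # three literals, (3, 6), three literals
print(lz_decode(tokens))  # the original 12 bytes
```

The brute-force match search is quadratic and the match length is unbounded, which is fine for a demonstration; shipping encoders cap both and use hash tables or similar to find matches quickly.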

| Variant | Description and use |
| --- | --- |
| LZ77 | Original Lempel-Ziv (1977): sliding-window dictionary |
| LZSS | LZ77 variant with 1-bit flags; common in console games |
| LZ78 / LZW | Dictionary-built variant; used in GIF, early UNIX compress |
| LZ4 / LZO | Modern fast variants; Linux kernel, real-time compression |

Huffman coding

Assign shorter codes to common values, longer codes to rare ones:

| Value | Frequency | Code |
| --- | --- | --- |
| A | 50% | 0 |
| B | 25% | 10 |
| C | 12.5% | 110 |
| D | 12.5% | 111 |

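Given known frequencies, Huffman coding is optimal among codes that spend a whole number of bits on each symbol. With the frequencies above the average is 1.75 bits per symbol, against 2 bits for a fixed-length code. It is often combined with LZ to compress the LZ output further: gzip does this internally, since DEFLATE = LZ77 + Huffman.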
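
A sketch of the usual construction, assuming the frequency table above: repeatedly merge the two lightest subtrees, prefixing "0" to one side and "1" to the other. The function name `huffman_codes` and the dict-per-subtree representation are choices made here for brevity; the exact bit patterns depend on how ties are broken, while the code lengths do not for this table.

```python
import heapq


def huffman_codes(freqs):
    """Return {symbol: bitstring} for a {symbol: weight} table."""
    # Heap entries are (weight, tiebreak, {symbol: code-so-far}); the unique
    # tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w0, _, zeros = heapq.heappop(heap)   # lightest subtree
        w1, _, ones = heapq.heappop(heap)    # next lightest
        merged = {s: "0" + c for s, c in zeros.items()}
        merged.update({s: "1" + c for s, c in ones.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


freqs = {"A": 50, "B": 25, "C": 12.5, "D": 12.5}
codes = huffman_codes(freqs)
print(codes)  # {'A': '0', 'B': '10', 'C': '110', 'D': '111'}

# Average code length, weighted by frequency.
average_bits = sum(freqs[s] / 100 * len(codes[s]) for s in freqs)
print(average_bits)  # 1.75

encoded = "".join(codes[s] for s in "AABACABD")
print(encoded)
```

A table with a single symbol would get an empty code from this sketch; real encoders special-case that.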

Dictionary / lookup compression

Replace repeated multi-byte sequences with a dictionary index:

"the quick brown fox" →
  dictionary: 0=the, 1=quick, 2=brown, 3=fox
  encoded:    [0] [1] [2] [3]

Useful for text-heavy games (Final Fantasy, Phantasy Star) where common words ("the", "and", "Battle!") become single bytes.
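
One hedged way to picture this: keep ASCII bytes (below 0x80) as literals and use byte values 0x80 and up as dictionary indices. That split, the function names, and the word-picking heuristic are all assumptions for the sketch; individual games used their own token layouts.

```python
import re
from collections import Counter


def build_dictionary(text, max_entries=128):
    """Choose the words whose replacement saves the most bytes."""
    counts = Counter(re.findall(r"[A-Za-z]+", text))
    # Replacing a word of length n used c times saves about c * (n - 1) bytes.
    ranked = sorted(counts, key=lambda w: (-counts[w] * (len(w) - 1), w))
    return [w for w in ranked if len(w) > 1][:max_entries]


def dict_encode(text, words):
    """ASCII bytes stay literal; bytes 0x80 and up index the dictionary."""
    index = {w: i for i, w in enumerate(words)}
    out = bytearray()
    # Split into whole words and single non-letter characters.
    for piece in re.findall(r"[A-Za-z]+|[^A-Za-z]", text):
        if piece in index:
            out.append(0x80 + index[piece])
        else:
            out += piece.encode("ascii")
    return bytes(out)


def dict_decode(packed, words):
    """Expand dictionary bytes back into words; pass literals through."""
    return "".join(words[b - 0x80] if b >= 0x80 else chr(b) for b in packed)


script = "the cat and the dog and the bird"
words = build_dictionary(script, max_entries=2)
packed = dict_encode(script, words)
print(words, len(script), len(packed))
print(dict_decode(packed, words))
```

The dictionary itself has to be stored too, so a word only earns its slot if the bytes it saves across the script exceed its own length.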

Delta encoding

Store differences between successive values rather than absolute values. Effective for monotonic-ish data: heightmaps, palette gradients, animation curves.

heights: 100 102 103 105 108 110
deltas:  100  +2  +1  +2  +3  +2

The deltas are small numbers clustered around a few values, so they fit in fewer bits and compress better with RLE or Huffman.
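
The transform is only a few lines in each direction. This sketch uses plain Python lists and hypothetical function names; a real implementation would also fix the delta width (for example a signed byte) and decide what to do when a difference overflows it.

```python
def delta_encode(values):
    """Keep the first value as-is, then store each difference from the previous value."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


heights = [100, 102, 103, 105, 108, 110]
deltas = delta_encode(heights)
print(deltas)                # [100, 2, 1, 2, 3, 2]
print(delta_decode(deltas))  # [100, 102, 103, 105, 108, 110]
```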

Platform-specific schemes

NES

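| Technique | Use |
| --- | --- |
| CHR compression (CHR-RAM boards) | Tile graphics stored compressed in PRG-ROM, decompressed into PPU memory at scene load |
| RLE level data | Repeated tile patterns in nametables |
| Metatiles | 2×2 or 4×4 tile groups treated as a single index; shrinks level data about 4× (2×2) or 16× (4×4), before the cost of the metatile table |
| Custom per-game | Super Mario Bros. 3 uses a bespoke object-based format: levels are lists of placement commands rather than tile grids |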
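
To make the metatile row concrete, here is a sketch that expands a grid of metatile indices into the tile grid a nametable would hold. The `METATILES` table, its tile numbers, and the function name are invented for illustration and do not come from any real game.

```python
# Hypothetical metatile table: each entry is a 2x2 block of tile numbers,
# stored as (top-left, top-right, bottom-left, bottom-right).
METATILES = [
    (0x00, 0x00, 0x00, 0x00),   # 0: empty sky
    (0x10, 0x11, 0x20, 0x21),   # 1: brick
    (0x12, 0x13, 0x22, 0x23),   # 2: ground
]


def expand_metatiles(level, metatiles):
    """Turn a grid of metatile indices into a tile grid twice as wide and tall."""
    tiles = []
    for row in level:
        top, bottom = [], []
        for index in row:
            tl, tr, bl, br = metatiles[index]
            top += [tl, tr]
            bottom += [bl, br]
        tiles += [top, bottom]
    return tiles


level = [
    [0, 0, 1, 0],
    [2, 2, 2, 2],
]
tiles = expand_metatiles(level, METATILES)
stored = sum(len(row) for row in level)
expanded = sum(len(row) for row in tiles)
print(stored, expanded)   # 8 indices describe 32 tiles
```

The 4× figure ignores the metatile table itself, which is shared across every level that uses the same tileset.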

SNES

| Technique | Use |
| --- | --- |
| LZSS variants | Most data; dozens of game-specific dialects |
| Mode 7 compression | Heightmap RLE for backgrounds |
| Custom schemes | Final Fantasy VI's text uses dictionary + variable-length codes |

Mega Drive / Genesis

Sega and its developers shipped multiple proprietary schemes. The names used today were coined by the Sonic hacking community, mostly after the people who reverse-engineered or documented each format:

| Format | Use | Notes |
| --- | --- | --- |
| Kosinski | General data: level layouts, block and chunk mappings, art in later games | LZSS variant; named after Brett Kosinski, who documented it |
| Nemesis | Tile graphics | Statistical encoder using Huffman-like coding for runs |
| Enigma | Tile-map data | Differential RLE for nametables |
| Saxman | Sound and game data | LZSS variant; used in some Sega titles, including Sonic 2 |

These formats are now well-documented by the Sonic Retro reverse-engineering community; tools exist to decompress them on modern systems.

Amiga

| Compressor | Use | Notes |
| --- | --- | --- |
| PowerPacker | Executables and data | Dominated 1990s warez and shareware Amiga distribution |
| LhA (LZH) | General-purpose archives | The ZIP-equivalent of the Amiga era |
| ByteKiller | Demos / 4 KB intros | Tight LZ-style cruncher |
| Imploder | Executables | Decompress-on-load executable packer |

C64

| Compressor | Notes |
| --- | --- |
| Exomizer | The community standard, still actively developed |
| ByteBoozer | Smaller decoder, slightly worse ratio |
| Doynax LZ | Optimised for fast decompression |
| PuCrunch | Older but widely used |

The C64 cruncher landscape is unusually rich — every group had a favourite, and benchmark wars between them ran for years.

Compression targets

| Data type | Approach |
| --- | --- |
| Tile graphics | Pattern-based, RLE, custom per-format |
| Level maps | RLE + dictionary; metatile indirection |
| Music data | Pattern references (already compressed by tracker formats); see MOD Format |
| Text | Huffman coding + dictionary |
| Sprites | Custom per-game; transparent-pixel runs compress well |
| Audio samples | Delta encoding + ADPCM |

Trade-offs

| Factor | Consideration |
| --- | --- |
| Compression ratio | Higher ratio = more storage saved, but typically more CPU + RAM to decompress |
| Decompression speed | Critical for level loading, scene transitions, real-time streams |
| RAM requirement | Decompressor needs working memory (sliding window for LZ, Huffman tree, etc.) |
| Decoder size | Decoder code itself takes ROM. RLE: ~30 bytes; LZSS: ~150 bytes; full Exomizer: ~512 bytes |

Real-time decompression (streaming audio, load-during-play) needs fast algorithms. Static data (level loading at scene change) can use slower, better compression.

Implementation considerations

| Challenge | Solution |
| --- | --- |
| RAM limits | Decompress directly to VRAM/working memory, no intermediate buffer |
| CPU budget | Decompress during load screens or VBlank waits |
| Random access | Store block offsets so individual chunks can be decompressed without a sequential walk |
| DMA conflicts | On consoles, time decompression around video DMA windows |
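
The random-access row can be sketched as follows: compress fixed-size blocks independently and keep a table of where each one starts. Python's `zlib` is used purely as a stand-in for whatever compressor a game would actually ship, and the function names and 4096-byte block size are assumptions.

```python
import zlib


def pack_blocks(data, block_size=4096):
    """Compress fixed-size blocks independently and record where each starts."""
    offsets, body = [], bytearray()
    for start in range(0, len(data), block_size):
        offsets.append(len(body))
        body += zlib.compress(data[start:start + block_size])
    offsets.append(len(body))   # sentinel: end of the last block
    return offsets, bytes(body)


def read_block(offsets, body, n):
    """Decompress only block n, without touching the blocks before it."""
    return zlib.decompress(body[offsets[n]:offsets[n + 1]])


data = bytes(range(256)) * 64            # 16 KB of sample data
offsets, body = pack_blocks(data)
chunk = read_block(offsets, body, 2)     # jump straight to the third block
print(len(offsets) - 1, len(chunk))
```

Independent blocks cost some ratio, because each block starts with an empty history, so block size is a trade between seek granularity and compression.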

Notable examples

| Game | Technique | Achievement |
| --- | --- | --- |
| Sonic the Hedgehog (1991) | Kosinski + Nemesis + Enigma | A different compressor for each data type |
| Super Metroid (1994) | LZSS variant | Massive map in a 3 MB cartridge |
| Kirby's Adventure (1993) | Heavy compression | NES MMC3 game with rich graphics in 768 KB |
| Chrono Trigger (1995) | LZSS + dictionary | Large script and detailed graphics in 4 MB |
| Pokémon Gold/Silver (1999-2000) | Custom compression | Two full regions (Johto and Kanto) in 2 MB |

The demo scene connection

Demo scene coders pushed compression limits:

| Competition | Constraint | Champion compressors |
| --- | --- | --- |
| 64K intro | Entire demo in 64 KB | kkrunchy (Windows) |
| 4K intro | Entire demo in 4 KB | Crinkler, oneKpaq |
| 256-byte intro | Pure code golf | Hand-crafted; standard packers don't fit |

Crinkler is a compressing linker: it takes the place of the normal linker, reorders code and data sections, and packs them with a context-mixing arithmetic coder. On the small code-and-data mix typical of intros it reaches ratios that general-purpose packers do not match.

Techniques developed for demos also reached games: kkrunchy, Farbrausch's executable packer for 64K intros, was used to pack .kkrieger, a 96 KB first-person shooter from the same group.

Modern relevance

| Context | Application |
| --- | --- |
| Download sizes | Steam, Epic, console store budgets |
| Load times | SSD streaming with on-the-fly decompression (LZ4, Zstandard) |
| Texture compression | GPU formats (BC1-7, ASTC, ETC2): fixed-rate compression for direct GPU sampling |
| Asset bundles | Unity asset bundles use LZ4 or LZMA; Unreal pak files use zlib or Oodle |
| Network protocols | HTTP gzip, HTTP/2 HPACK, HTTP/3 QPACK |
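
Fixed-rate means every 4×4 texel block takes the same number of bytes, so both the texture size and the location of any block follow from simple arithmetic. The sketch below uses the block sizes of BC1 (8 bytes) and BC7 (16 bytes). The row-major block order in `block_offset` is an assumption for illustration, since real GPUs usually store blocks in a tiled or swizzled layout.

```python
BLOCK_BYTES = {"BC1": 8, "BC7": 16}   # bytes per 4x4 texel block


def texture_bytes(width, height, fmt):
    """Size of one mip level; dimensions round up to whole 4x4 blocks."""
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BLOCK_BYTES[fmt]


def block_offset(x, y, width, fmt):
    """Byte offset of the block holding texel (x, y), assuming row-major blocks."""
    blocks_x = (width + 3) // 4
    return ((y // 4) * blocks_x + (x // 4)) * BLOCK_BYTES[fmt]


raw = 1024 * 1024 * 4                           # uncompressed RGBA8
print(raw // texture_bytes(1024, 1024, "BC1"))  # 8, i.e. 8:1
print(raw // texture_bytes(1024, 1024, "BC7"))  # 4, i.e. 4:1
print(block_offset(4, 0, 1024, "BC1"))          # 8: the second block in the first row
```

This is the property that variable-rate schemes such as LZ or Huffman lack: with those, finding texel (x, y) would require decoding everything before it.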

The principles persist even as storage grows — users still prefer smaller downloads, faster loads, and lower memory pressure.

Legacy

Compression taught developers to think carefully about data representation. The habit of asking "can this be smaller?" persists. Modern game developers still compress assets, optimise network packets, and minimise memory footprints. The stakes are different — gigabytes instead of kilobytes — but the discipline of fitting content into constraints remains valuable, and the algorithms themselves often trace directly back to 1970s-1980s research.

See Also