Self-Modifying Code
Code that rewrites itself
Self-modifying code changes its own instructions at runtime. On 8-bit systems, treating code as data enabled optimisations that were otherwise out of reach.
Overview
On the 6502, indexed addressing can cost extra cycles: an indexed load takes one more cycle when it crosses a page boundary, and an indexed store always takes five cycles instead of four. What if the address inside the instruction could change instead? Self-modifying code rewrites instructions at runtime, changing addresses, opcodes, or branch targets. It was fast, dangerous, and often the only way to meet a cycle budget on 8-bit hardware.
The problem
Standard indexed addressing:
lda table,x ; 4-5 cycles, X is index
But what if you need to change which table?
The solution
Modify the instruction itself:
load_addr:
lda table ; 4 cycles, absolute addressing
; Elsewhere, change the address:
lda #<new_table
sta load_addr+1 ; Modify low byte
lda #>new_table
sta load_addr+2 ; Modify high byte
Now the instruction at load_addr loads from new_table instead of table.
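The same patching idea can step an address through memory. The following is a minimal sketch, not taken from any particular program: it fills the 1 KB region $0400-$07FF by incrementing the high byte of its own store operand. The labels fill_screen and store are hypothetical names chosen for illustration.

```asm
; Sketch only: fill_screen and store are hypothetical labels.
; Fills $0400-$07FF with $20 (on a C64 this range also covers
; the sprite pointers at $07F8-$07FF).
fill_screen:
        lda #$04
        sta store+2     ; initialise the patched high byte explicitly
        lda #$20        ; value to write
        ldx #$00
store:
        sta $0400,x     ; high byte of this operand is rewritten below
        inx
        bne store       ; 256 bytes per page
        inc store+2     ; patch the operand: move to the next page
        ldy store+2
        cpy #$08        ; pages $04-$07 done?
        bne store
        rts
```

Resetting store+2 at the top makes the routine safe to call more than once. Without it, a second call would start from wherever the previous run left the operand.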
Common uses
Unrolled loops
lda $0400 ; First iteration
sta $d020
lda $0401 ; Second iteration (address modified)
sta $d020
; ...repeat, modifying addresses
Dynamic branch targets
jump_target = *+1
jmp $0000 ; Address modified at runtime
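As a rough sketch of how such a patched jump might be used for dispatch, assuming hypothetical routines set_handler, dispatch, handler_a and handler_b (none of these names come from a real program):

```asm
; Sketch only: all labels here are hypothetical.
example:
        lda #<handler_b
        ldy #>handler_b
        jsr set_handler ; point the JMP at handler_b
        jsr dispatch    ; runs handler_b; its rts returns here
        rts

set_handler:            ; in: A = low byte, Y = high byte of the routine
        sta jump_target
        sty jump_target+1
        rts

dispatch:
jump_target = *+1       ; address of the JMP operand
        jmp handler_a   ; operand is overwritten by set_handler

handler_a:
        rts
handler_b:
        rts
```

A patched absolute JMP takes 3 cycles, against 5 for an indirect JMP through a pointer, which is the usual reason for choosing this form.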
Variable table selection
Switch between data sources without index overhead.
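A minimal sketch of table selection, assuming hypothetical labels select_table, read_entry, table_a and table_b. Patching the base address keeps X free as the element index and avoids the slower (zp),y indirect mode, which takes 5-6 cycles and ties up Y plus two zero-page bytes. The .byte directive is an assumption as well, since data directives vary by assembler.

```asm
; Sketch only: all labels here are hypothetical.
select_table:           ; in: A = low byte, Y = high byte of the table
        sta read_entry+1
        sty read_entry+2
        rts

read_entry:             ; in: X = index, out: A = table entry
        lda table_a,x   ; operand patched by select_table
        rts

table_a: .byte 1, 2, 3, 4
table_b: .byte 5, 6, 7, 8
```

To switch sources, load A and Y with #<table_b and #>table_b and call select_table. Every later call to read_entry then reads from table_b.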
6502 advantages
| Factor | Benefit |
|---|---|
| Von Neumann | Code and data same memory |
| No cache | No stale instruction problems |
| Absolute addressing | Avoids the page-crossing and indexed-store cycle penalties |
Examples
Sprite multiplexer
lda #sprite_y_1 ; Low byte of the target Y register (e.g. $03 for $d003)
sta store_y+1 ; Modify destination
; ...load the Y coordinate into A here...
store_y:
sta $d001 ; Y position (address changes)
Music player
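lda pattern_ptr ; Low byte of the pattern address
sta fetch+1
lda pattern_ptr+1 ; High byte, assuming pattern_ptr is a 16-bit pointer
sta fetch+2
fetch:
lda $0000 ; Address modified each note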
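Once the operand holds the read address, the player can advance it in place rather than keeping a separate pointer. A standalone sketch, assuming hypothetical labels next_byte, fetch and fetch_done:

```asm
; Sketch only: next_byte, fetch and fetch_done are hypothetical labels.
next_byte:              ; out: A = next data byte
fetch:
        lda $ffff       ; operand is the current read address (patched)
        inc fetch+1     ; advance the low byte of the operand
        bne fetch_done
        inc fetch+2     ; low byte wrapped: carry into the high byte
fetch_done:
        rts             ; flags reflect the inc, not the loaded byte
```

The placeholder is written as $ffff rather than $0000 because some assemblers may shorten an operand below $0100 to zero-page form, which would leave only one operand byte to patch.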
Dangers
| Risk | Consequence |
|---|---|
| Timing bugs | Code runs before or midway through a patch, so the wrong instruction executes |
| Debugging | Hard to trace |
| Maintenance | Confusing to read |
| Portability | Platform-specific |
Z80 differences
The Z80 has no instruction prefetch or cache in the modern sense, so SMC works the same way as on the 6502: write to the byte that holds an operand, and the next fetch of that byte sees the new value. The differences are subtler:
- The R register increments during M1 cycles. Code that uses LD A,R for a "random" seed sees the seed shift if SMC changes the instructions executed nearby.
- Refresh cycles put the R register on the address bus during the second half of every M1. This is invisible to RAM-side SMC but can interact with some peripherals if the address decode is loose.
- Multi-byte instructions are not atomic. A LD (HL),n or LD (nn),A that targets the next instruction's operand byte after the M1 fetch but before its operand fetch produces a hybrid execution. A real Z80 does this faithfully; emulators with instruction-level abstraction can miss it.
In practice, SMC on Z80 is just as common as on 6502 — Spectrum games use it constantly.
Famous examples
| Game / demo | Use of SMC |
|---|---|
| Mayhem in Monsterland (C64, 1993) | Sprite renderer self-modifies addresses for speed; Apex's "impossible-on-C64" 50 fps comes partly from this |
| Turrican / Turrican II (C64) | SID music drivers patch their own table addresses each frame |
| Knight Lore (Spectrum, Ultimate Play the Game, 1984) | Filmation engine uses SMC to swap render routines per object class |
| Demoscene FLI routines (C64) | The "stable raster" inner loop modifies VIC register write targets to dodge a single cycle of jitter |
| Sprite multiplexers (C64) | Sprite Y-position writes self-modify the destination register so one sprite slot serves many on-screen sprites |
Emulator and JIT compatibility
SMC is a chronic source of bugs in emulators that translate blocks of guest code before running them:
- JIT-based emulators (some PSX, GBA, and N64 cores) cache compiled translations of guest code. When the guest writes to that code, the cache is stale. Robust JITs invalidate the affected block on write — costly, and easy to get wrong if the write granularity doesn't match the block boundary.
- Cycle-accurate emulators that step instruction-by-instruction handle SMC for free; the next fetch reads from current memory.
- Static recompilers (often used for retro ports) can't handle SMC at all without falling back to interpretation.
This is one reason many authors of 8-bit emulators stay with interpretation despite its host-CPU cost: a cached translation goes stale the moment a game such as Mayhem in Monsterland patches its own code.
Modern perspective
Self-modifying code today:
- Restricted by W^X memory protection: under this policy a page is either Writable or eXecutable, never both at once. Programs that legitimately need SMC (JIT compilers) explicitly toggle protection between phases.
- JIT compilation is the modern descendant — generate code at runtime, then mark the page executable.
- Polymorphic malware uses SMC to evade signature scanners.
- Embedded firmware sometimes still uses SMC where Flash and RAM are unified and tight loops need every cycle.
Debugging tips
| Technique | Purpose |
|---|---|
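| Mark modifications | Comment every patched instruction and every write that patches it |
| Initialise explicitly | Write each patched operand before first use; don't assume the assembled placeholder is still there |
| Test boundaries | Check that patched addresses stay inside the intended range, including page crossings |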