From d90c75703d7a9ca53fded2a40fa35344279ed843 Mon Sep 17 00:00:00 2001 From: "B. Watson" Date: Wed, 6 Jan 2016 04:04:57 -0500 Subject: document the compression before I forget --- Makefile | 2 +- comptitle.s.in | 43 ++++++----- newtitle.s | 21 +----- titlecomp.pl | 2 + titlecompression.txt | 205 +++++++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 237 insertions(+), 36 deletions(-) create mode 100644 titlecompression.txt diff --git a/Makefile b/Makefile index 56a81b4..9cb010b 100644 --- a/Makefile +++ b/Makefile @@ -110,7 +110,7 @@ comptitle.xex: titledata.xex titlecomp.pl comptitle.s.in # to get cc65 to build an init segment (would need a custom linker # script at least). newtitle.xex: newtitle.s ver.dat - cl65 -o newtitle.xex -t none newtitle.s + cl65 -l newtitle.lst -m newtitle.map -o newtitle.xex -t none newtitle.s ver.dat: mkver.pl perl mkver.pl $(VERSION) > ver.dat diff --git a/comptitle.s.in b/comptitle.s.in index 7862b3d..b8ed688 100644 --- a/comptitle.s.in +++ b/comptitle.s.in @@ -1,3 +1,4 @@ +;;; see titlecompression.txt to understand how this works! ; cl65 -o comptitle.xex -t none comptitle.s @@ -37,6 +38,13 @@ table: ; decompression code starts here init: + lda #0 + + ;;; for benchmarking only, remove! + ;sta RTCLOK + ;sta RTCLOK+1 + ;sta RTCLOK+2 + lda #>imgsize sta pagecount lda #>compdata @@ -62,7 +70,22 @@ copyloop: beq notcode ; got a zero-run code. A = number of bytes of zeroes to store. - jmp storezerorun +storezerorun: + sty tmp + tay + dey + lda #0 +szloop: + sta (dstptr),y + jsr incdest + beq done +nohi2: + dey + bpl szloop + + ldy tmp + clc + bcc storedone notcode: lda tmp @@ -83,24 +106,10 @@ storedone: bne copyloop ; always branch (since srcptr never wraps around to 0) done: + ;;; for benchmarking only, remove! + ;.byte $02 ; CIM, drops us to atari800 monitor rts -storezerorun: - sty tmp - tay - dey - lda #0 -szloop: - sta (dstptr),y - jsr incdest - beq done -nohi2: - dey - bpl szloop - - ldy tmp - jmp storedone - incdest: inc dstptr bne iddone diff --git a/newtitle.s b/newtitle.s index 9eef9ed..0d66fa2 100644 --- a/newtitle.s +++ b/newtitle.s @@ -36,6 +36,8 @@ start: lda #0 sta SDMCTL + ; build our display list + ; wait for the next frame, to avoid graphics glitching jsr wait1jiffy @@ -80,24 +82,7 @@ keyok: ; eat the keypress lda #$ff sta CH - - ; restore OS's display list - ;lda #0 - ;sta SDMCTL ; disable screen again - ;lda FR0 - ;sta SDLSTL - ;lda FR0+1 - ;sta SDLSTH - - ;jsr wait1jiffy - - ; switch to normal playfield, enable screen. this is now - ; done in atari_text_setup() in taipan.c, so the title screen - ; can keep displaying. - ;lda #$22 - ;sta SDMCTL - - rts ; return to DOS + rts ; return to DOS (which loads the rest of the game) wait1jiffy: lda RTCLOK+2 diff --git a/titlecomp.pl b/titlecomp.pl index 2e17a92..94d36b1 100644 --- a/titlecomp.pl +++ b/titlecomp.pl @@ -1,5 +1,7 @@ #!/usr/bin/perl -w +### see titlecompression.txt to understand how this works! + $datasize = 0x1700; use bytes; diff --git a/titlecompression.txt b/titlecompression.txt new file mode 100644 index 0000000..9cda223 --- /dev/null +++ b/titlecompression.txt @@ -0,0 +1,205 @@ +The title screen uses a crude form of compression, which I'll call ZRLE +(zero-run length encoding). + +Theory of operation: +-------------------- + +Unlike real RLE, ZRLE only compresses runs of 0-bytes. Non-zero bytes +are stored as-is, not compressed at all. Also unlike real RLE, ZRLE +won't work on arbitrary input data. Why? + +ZRLE relies on some byte values being unused in the input. In other words, +if the file contains every byte value 0 to 255, it can't be compressed +with ZRLE. There needs to be at least one unused value per length of +null run found in the file. + +In other words, if the input looks like: + +01 00 00 02 00 00 03 00 00 00 04 + +...there are 2 lengths of zero-run: 2 and 3. The fact that there are +two runs of length 2 doesn't affect the algorithm (in fact it helps +the compression ratio). + +For each length of zero-run found in the file, an otherwise-unused byte +is assigned as a marker for that length. The above input might be encoded +as: + +01 05 02 05 03 06 04 + +...where 05 represents a run of two zeroes, and 06 represents a run of +three zeroes. + +This means there are no escape bytes, markers, or counts stored in +the file. Each run of zero-bytes collapses into one byte of data, +which we'll call a code value. All bytes that aren't code values, are +just uncompressed data bytes, used as-is. + +If there aren't enough unused values, the encoder will fail, saying it's +out of codes. + +But how can this be decoded? + +The encoder creates a table for the decoder to use, of course. The +table is written to the file comptitle.s, along with the rest of the +decoder code from comptitle.s.in. This means the decoder is specific to +the encoded file, not general-purpose. If we weren't on such a limited +system as the Atari 800, the table would be part of the output file, +not the code (like the "dictionary" in a deflated file, etc). + +Each byte value in the compressed data either represents itself (is a +plain data value) or some number of consecutive zero bytes (is a code +value). The table has an entry for each byte value 0-255 (conceptually +anyway; see below), and if the byte value represents itself, the table +entry will be zero. If it represents a zero-run, the table entry will +be non-zero, and will be the number of bytes in the run. + +For a "run" of one zero-byte, the encoder doesn't bother to encode it as +a run: a plain zero is stored in the output file instead. This doesn't +affect the file size, but does save one encoded value. + +Here's the above example again: + +01 00 00 02 00 00 03 00 00 00 04 + +It encoded to: + +01 05 02 05 03 06 04 + +The table built into the decoder will look like: + +byte value | definition +-----------+----------- +01 | 0 +02 | 0 +03 | 0 +04 | 0 +05 | 2 +06 | 3 + +This tells the decoder that 01, 02, 03 represent themselves (they're +copied from compressed input to decompressed output as-is)... the value +05 represents a run of 2 consecutive zeroes, and 06 represents a run +of 3 zeroes. + +The decoder will look at the input, one byte at a time, and consult +the table to decide what to do with each byte. "Output" here is the +decompressed data, which you can think of as a list of bytes, which +starts out empty. + +input| table lookup | | +data | result | action | output after action is taken +-----+--------------+-----------------+--------------------------------- +01 | 0 | append 01 | 01 + | | | +05 | 2 | append 2 zeroes | 01 00 00 + | | | +02 | 0 | append 02 | 01 00 00 02 + | | | +05 | 2 | append 2 zeroes | 01 00 00 02 00 00 + | | | +03 | 0 | append 03 | 01 00 00 02 00 00 03 + | | | +06 | 3 | append 3 zeroes | 01 00 00 02 00 00 03 00 00 00 + | | | +04 | 0 | append 04 | 01 00 00 02 00 00 03 00 00 00 04 + +As you can see, the final output after the last byte of data is read, +matches the original input exactly. + +In this dumb example, it's pretty obvious that the encoded data plus +the table size is bigger than the original input, so the "compression" +is actually making it bigger. But it serves to illustrate, I hope. + +For Taipan, the title screen (newtitle.png) contains large black areas. +A black pixel is a 0 bit. The title screen data ends up packed 8 pixels +per byte (1 bit per pixel), so those black areas are represented by runs +of consecutive zero bytes. + +Each row of pixels in the image has 256 bits, or 32 bytes. The bytes +are arranged in consecutive order: the leftmost 8 visible pixels are the +first byte of the first row... after enough bytes to fill up the row, the +next byte in the file is the leftmost 8 pixels of the 2nd line, etc etc. + +The algorithm treats the input as a stream of bytes, and doesn't know +nor care how they're arranged on the screen, but if you know how they're +stored you can see that a black area at the right-hand end of one line +is "joined" with a black area at the left-hand end of the next line, +as far as the compression is concerned. + +Specifics: +---------- + +The original image newtitle.png is the authoritative source for the title +screen that ends up being built into the game binary. Its resolution is +256x184 pixels. If this image is ever modified, the Makefile will cause +all the stuff mentioned below to be rebuilt. + +First, a script called newtitle.pl reads the PNG image and creates an +Atari loadable file called titledata.xex. It contains a 6-byte Atari +header followed by the pixel data, one bit per pixel. The pixel data +is 5888 bytes. Formerly, before the compression scheme was invented, +titledata.xex was used as-is in creating the game binary taipan.xex. + +The compression code is written in Perl, in the file titlecomp.pl. It +reads titledata.xex, applies the compression, and writes a file called +comptitle.dat, containing the raw compressed data (minus the table +definition). It also creates an assembly source file comptitle.s, +containing the table definition and the rest of the decoding code. + +If you look at titlecomp.pl, you'll see it takes an optional argument, +forcing it to use a particular byte value as the first code value. This +number is determined by running the script repeatedly with different +arguments, to find the one with the smallest table size. + +Wait, what do I mean by smallest table size? Well, I said the table +contains one entry for each byte value 0 to 255... This isn't really +true. Byte values less than 128 *always* represent themselves, so I +don't bother storing them in the table. Also, the perl script knows the +lowest and highest numbered code values, so it only emits the portion +of the table between these two numbers (and modifies the code to not do +a table lookup, if it's outside the range). If you can read 6502 +assembly, looking at the code will make this much clearer. + +comptitle.s is then assembled, to create comptitle.xex. + +When comptitle.s is assembled, it includes the contents of comptitle.dat +with an ".incbin", so the data ends up in comptitle.xex along with the +table and decoder. This file contains 2 Atari binary load segments: +the data + table + decoder, and an initialization segment telling the +Atari to run the decoder subroutine immediately after loading. + +comptitle.xex ends up being the first part of the complete taipan.xex game +binary, so after its init routine exits, the rest of the game continues +to load. In particular, the next segment is another init routine that +tells the Atari to actually display the decoded title screen data. + +Compression Ratio: +------------------ + +If we want to calculate the compression ratio, the size of comptitle.xex +is the compressed file size, and the size of titledata.xex is the original +file size: + +$ ls -l comptitle.xex titledata.xex +-rw-r--r-- 1 urchlay users 3602 Jan 6 02:08 comptitle.xex +-rw-r--r-- 1 urchlay users 5894 Jan 6 01:55 titledata.xex + +The percentage is: + +$ perl -e 'printf "%.1f\n", 3602/5894*100' +61.1 + +Actually, we have to do the calculations in disk sectors. A standard Atari +floppy disk holds 125 data bytes per sector (or less, but we assume full +sectors because that's how taipan.xex is generated). It's impossible +to read less than a full sector from the drive. So titledata.xex is 48 +sectors, and comptitle.xex is 29. + +What we really care about is loading time, not disk space: the title +screen should be visible as soon as possible after the program starts +loading (to give the user something to look at, while the huge bloated +game code loads). Standard 810 or 1050 drives load around 10 sectors +per second, we saved 19 sectors or around 2 seconds. The decompression +takes 1/5 of a second, so it's a net gain. + -- cgit v1.2.3