Diffstat (limited to 'doc')
-rw-r--r--  doc/Arcinfo         124
-rw-r--r--  doc/DZ.COM          bin 3683 -> 0 bytes
-rw-r--r--  doc/LZ.COM          bin 3310 -> 0 bytes
-rw-r--r--  doc/LZDZ.zip        bin 52115 -> 0 bytes
-rw-r--r--  doc/alf14.atr       bin 92176 -> 0 bytes
-rw-r--r--  doc/alf14_doc.txt   266
-rw-r--r--  doc/fileformat.txt  80
-rw-r--r--  doc/interview.txt   166
-rw-r--r--  doc/review.txt      44
9 files changed, 0 insertions, 680 deletions
diff --git a/doc/Arcinfo b/doc/Arcinfo
deleted file mode 100644
index 6c9d500..0000000
--- a/doc/Arcinfo
+++ /dev/null
@@ -1,124 +0,0 @@
-
-ARC-FILE.INF, created by Keith Petersen, W8SDZ, 21-Sep-86, extracted
-from UNARC.INF by Robert A. Freed.
-
-From: Robert A. Freed
-Subject: Technical Information for ARC files
-Date: June 24, 1986
-
-Note: In the following discussion, UNARC refers to my CP/M-80 program
-for extracting files from MSDOS ARCs. The definitions of the ARC file
-format are based on MSDOS ARC512.EXE.
-
-ARCHIVE FILE FORMAT
--------------------
-
-Component files are stored sequentially within an archive. Each entry
-is preceded by a 29-byte header, which contains the directory
-information. There is no wasted space between entries. (This is in
-contrast to the centralized directory used by Novosielski libraries.
-Although random access to subfiles within an archive can be noticeably
-slower than with libraries, archives do have the advantage of not
-requiring pre-allocation of directory space.)
-
-Archive entries are normally maintained in sorted name order. The
-format of the 29-byte archive header is as follows:
-
-Byte 1: 1A Hex.
- This marks the start of an archive header. If this byte is not found
- when expected, UNARC will scan forward in the file (up to 64K bytes)
- in an attempt to find it (followed by a valid compression version).
- If a valid header is found in this manner, a warning message is
- issued and archive file processing continues. Otherwise, the file is
- assumed to be an invalid archive and processing is aborted. (This is
- compatible with MS-DOS ARC version 5.12). Note that a special
- exception is made at the beginning of an archive file, to accommodate
- "self-unpacking" archives (see below).
-
-Byte 2: Compression version, as follows:
-
- 0 = end of file marker (remaining bytes not present)
- 1 = unpacked (obsolete)
- 2 = unpacked
- 3 = packed
- 4 = squeezed (after packing)
- 5 = crunched (obsolete)
- 6 = crunched (after packing) (obsolete)
- 7 = crunched (after packing, using faster hash algorithm) (obsolete)
- 8 = crunched (after packing, using dynamic LZW variations)
-
-Bytes 3-15: ASCII file name, nul-terminated.
-
-(All of the following numeric values are stored low-byte first.)
-
-Bytes 16-19: Compressed file size in bytes.
-
-Bytes 20-21: File date, in 16-bit MS-DOS format:
- Bits 15:9 = year - 1980
- Bits 8:5 = month of year
- Bits 4:0 = day of month
- (All zero means no date.)
-
-Bytes 22-23: File time, in 16-bit MS-DOS format:
- Bits 15:11 = hour (24-hour clock)
- Bits 10:5 = minute
- Bits 4:0 = second/2 (not displayed by UNARC)
-
-Bytes 24-25: Cyclic redundancy check (CRC) value (see below).
-
-Bytes 26-29: Original (uncompressed) file length in bytes.
- (This field is not present for version 1 entries, byte 2 = 1.
- I.e., in this case the header is only 25 bytes long. Because
- version 1 files are uncompressed, the value normally found in
- this field may be obtained from bytes 16-19.)
-
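The header layout above maps directly onto a fixed-size little-endian record. The following is a minimal Python sketch of reading one header, not code taken from UNARC or ARC. The names `parse_arc_header`, `ARC_HEADER` and `ARC_HEADER_V1` are made up for illustration, and the sketch assumes the buffer starts exactly at a header (it does no forward scanning and ignores the self-unpacking prefix).

```python
import struct

# Byte 1 marker, byte 2 version, 13-byte name, then the numeric fields,
# all stored low-byte first as the text describes.
ARC_HEADER = struct.Struct("<BB13sIHHHI")    # 29 bytes
ARC_HEADER_V1 = struct.Struct("<BB13sIHHH")  # 25 bytes (version 1 only)


def parse_arc_header(buf):
    """Parse one ARC header; return a dict, or None at the end-of-file marker."""
    if len(buf) < 2 or buf[0] != 0x1A:
        raise ValueError("missing 1A hex header marker")
    version = buf[1]
    if version == 0:
        return None  # end of archive, remaining bytes not present
    if version == 1:
        # Version 1 has no original-length field; reuse the stored size.
        _, _, raw_name, packed, date, time, crc = ARC_HEADER_V1.unpack_from(buf)
        original, size = packed, ARC_HEADER_V1.size
    else:
        _, _, raw_name, packed, date, time, crc, original = \
            ARC_HEADER.unpack_from(buf)
        size = ARC_HEADER.size
    return {
        "version": version,
        "name": raw_name.split(b"\0", 1)[0].decode("ascii"),
        "packed_size": packed,
        # Bits 15:9 = year - 1980, 8:5 = month, 4:0 = day; zero means no date.
        "date": None if date == 0 else
                ((date >> 9) + 1980, (date >> 5) & 0x0F, date & 0x1F),
        # Bits 15:11 = hour, 10:5 = minute, 4:0 = second/2.
        "time": (time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
        "crc": crc,
        "original_size": original,
        "header_size": size,
    }
```

The compressed data for the entry would start `header_size` bytes after the marker and run for `packed_size` bytes.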
-
-SELF-UNPACKING ARCHIVES
------------------------
-
-A "self-unpacking" archive is one which can be renamed to a .COM file
-and executed as a program. An example of such a file is the MS-DOS
-program ARC512.COM, which is a standard archive file preceded by a
-three-byte jump instruction. The first entry in this file is a simple
-"bootstrap" program in uncompressed form, which loads the subfile
-ARC.EXE (also uncompressed) into memory and passes control to it. In
-anticipation of a similar scheme for future distribution of UNARC, the
-program permits up to three bytes to precede the first header in an
-archive file (with no error message).
-
-
-CRC COMPUTATION
----------------
-
-Archive files use a 16-bit cyclic redundancy check (CRC) for error
-control. The particular CRC polynomial used is x^16 + x^15 + x^2 + 1,
-which is commonly known as "CRC-16" and is used in many data
-transmission protocols (e.g. DEC DDCMP and IBM BSC), as well as by
-most floppy disk controllers. Note that this differs from the CCITT
-polynomial (x^16 + x^12 + x^5 + 1), which is used by the XMODEM-CRC
-protocol and the public domain CHEK program (although these do not
-adhere strictly to the CCITT standard). The MS-DOS ARC program does
-perform a mathematically sound and accurate CRC calculation. (We
-mention this because it contrasts with some unfortunately popular
-public domain programs we have witnessed, which from time immemorial
-have based their calculation on an obscure magazine article which
-contained a typographical error!)
-
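For reference, a bit-at-a-time version of this CRC can be sketched in a few lines of Python. The text above gives only the polynomial; the zero initial value and the least-significant-bit-first processing (which turns the polynomial constant 8005 hex into its reflected form A001 hex) are assumptions here, taken from the parameter set commonly catalogued as "CRC-16/ARC". The function name `crc16_arc` is illustrative only.

```python
def crc16_arc(data, crc=0):
    """CRC-16 with polynomial x^16 + x^15 + x^2 + 1, processed LSB first.

    Assumes a zero initial value and no final XOR. Pass the previous
    result as `crc` to continue over a file read in several chunks.
    """
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001  # reflected form of 0x8005
            else:
                crc >>= 1
    return crc
```

With these parameters the conventional test string "123456789" gives BB3D hex, which is a quick way to check a port of the routine.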
-Additional note (while we are on the subject of CRC's): The validity
-of using a 16-bit CRC for checking an entire file is somewhat
-questionable. Many people quote the statistics related to these
-functions (e.g. "all two-bit errors, all single burst errors of 16 or
-fewer bits, 99.997% of all single 17-bit burst errors, etc."), without
-realizing that these claims are valid only if the total number of bits
-checked is less than 32767 (which is why they are used in small-packet
-data transmission protocols). I.e., for file sizes in excess of about
-4K bytes, a 16-bit CRC is not really as good as what is often claimed.
-This is not to say that it is bad, but there are more reliable methods
-available (e.g. the 32-bit AUTODIN-II polynomial). (End of lecture!)
-
- Bob Freed
- 62 Miller Road
- Newton Centre, MA 02159
- Telephone (617) 332-3533
-
-
diff --git a/doc/DZ.COM b/doc/DZ.COM
deleted file mode 100644
index 7a91ef0..0000000
--- a/doc/DZ.COM
+++ /dev/null
Binary files differ
diff --git a/doc/LZ.COM b/doc/LZ.COM
deleted file mode 100644
index 40f7f11..0000000
--- a/doc/LZ.COM
+++ /dev/null
Binary files differ
diff --git a/doc/LZDZ.zip b/doc/LZDZ.zip
deleted file mode 100644
index be3d6bc..0000000
--- a/doc/LZDZ.zip
+++ /dev/null
Binary files differ
diff --git a/doc/alf14.atr b/doc/alf14.atr
deleted file mode 100644
index 1801fc9..0000000
--- a/doc/alf14.atr
+++ /dev/null
Binary files differ
diff --git a/doc/alf14_doc.txt b/doc/alf14_doc.txt
deleted file mode 100644
index 832f59d..0000000
--- a/doc/alf14_doc.txt
+++ /dev/null
@@ -1,266 +0,0 @@
- AlfCrunch Documentation Revised 7/10/88
- -----------------------
-
- AlfCrunch is an implementation of the Lempel-Ziv compression
- algorithm. Although it produces files that have the same structure as
- those produced by the Arc program, the two are not compatible. Arc
- cannot uncrunch AlfCrunch files, nor can AlfUnCrunch unarc normal Arc
- files.
-
- The current version of the LZ/DZ files is 1.4. Versions 1.1 through 1.3
- are compatible, but not with 1.0. If you have 1.0, you should discard it
- and use 1.4. The reason for this is that 1.0 used the same header as
- normal Arc crunch. Because of possible confusion over this, the header
- used by AlfCrunch was changed. Since 1.0 had very limited distribution,
- this situation should not often arise. For those who wish to be able to
- detect the AlfCrunch format, the first two bytes of the file will always
- be $1A $0F.
-
- This version fixes an annoying bug in both v1.2 and 1.3. If you had a
-subdirectory entry amongst the filenames you were crunching, LZ would
-stop at the subdir entry. Also the stack errors will now cause a proper
-exit to Dos rather than re-execution.
-
- The enhancement in v1.4 is the addition of time/date support. If you
-are running under Sparta 3.2, LZ will store the Sparta date/time from each
-file into the header. DZ does not use this information; it's just there to
-provide a reference point.
-
- When running either LZ.COM or DZ.COM, Memlo must be under $3000. This
- should not normally be a problem unless you have a lot of handlers
-installed.
- A cartridge may be present, as it only affects the size of the buffer
- available to AlfCrunch. Maximum speed will be achieved without a
- cartridge being present.
-
- A final note
- ------------
-
- Well I think this is about as far as AlfCrunch is going to get for now. I
-don't really believe there are any more features to add without modifying the
-command line parameters. So this version (1.4) will be the last for
-some time to come. Except for bug fixes (few if any I hope) the 1.x line will
-not change. I hope to add command line parameters similar to ARC and maybe
-add the ARC compression methods to finally resolve the compatibility issue.
-
- Alfred
- Programmer's Aid BBS
- (416) 465-4182
-
- Running AlfCrunch
- -----------------
-
- To crunch files, load LZ.COM. The title will be displayed, along
- with the version which should be 1.4. You will then be prompted for
- the output filename. This may be up to 80 characters long,
- including subdirectory names.
-
- If the output file already exists, it is checked to see if it is an
-AlfCrunch file. If the first header is correct, then the new files will be
-appended to it. If the header is wrong the program will print an error
-message and exit to Dos. If the file is shorter than the header length
-(29 bytes), then it is simply opened for normal output, which erases it.
-
- Next you will be prompted for the input filemask. This is what will
- be used to select the files. This may also be up to 80 characters long,
- including any subdirectory names. Wildcards are allowed. If selecting
- all files, the mask must end in *.* .
-
- Finally, you have the option of turning the screen off. Selecting
- this option will speed up the program by 15-20%. Once selected, you will
- not again be prompted for this option. If you do not elect to turn the
- screen off, the program will continue to present this prompt until it is
- selected.
-
- The program will then select files using the mask and compress them,
- displaying the filenames as it progresses. When it has finished, it will
- prompt you for additional input filemasks. You may either enter another
- mask or simply press return to exit back to Dos.
-
- LZ and SpartaDos 3.2
- --------------------
-
- If you are using SpartaDos 3.2, you may invoke LZ.COM and specify
- the output file and input filemask on the command line. The format is:
-
- [Dn:]LZ Dn:[path>]filename[.ext] [Dn:[path>]filename[.ext] ]
-
- The square brackets denote optional parameters which may be omitted.
- The first filename is the output file. The second is the input
- filemask. If you do not specify the input filemask, the program will
- prompt you for it. The program will automatically turn the screen off.
- When it is finished it will prompt you for more input filemasks.
-
- To invoke LZ as part of a batch file, the format is almost identical.
- The lines in the batch file would be:
-
- [Dn:]LZ Dn:[path>]filename[.ext] [Dn:[path>]filename[.ext] ]
- Dn:[path>]filename[.ext] <- Additional
- Dn:[path>]filename[.ext] input masks
-
- The program will read each input filemask, compress the files
- selected and continue until all the input masks have been used. You will
- then be prompted for more input masks. If this is part of a larger batch
- file, leave a single return after the last input mask to force LZ to
- return control back to the batch file. Example:
-
- [Dn:]LZ Dn:[path>]filename[.ext] [Dn:[path>]filename[.ext] ]
- Dn:[path>]filename[.ext]
- Dn:[path>]filename[.ext]
- (single return here)
- [Dn:]LZ Dn:[path>]filename[.ext] [Dn:[path>]filename[.ext] ]
- Dn:[path>]filename[.ext]
- Dn:[path>]filename[.ext]
- (single return here)
-
- At the end of this, you will be left at the Dos prompt. Because of
- the way i/o redirection is handled, an alternative form is available:
-
- [Dn:]LZ
- Dn:[path>]filename[.ext] <- The output file
- Dn:[path>]filename[.ext] <- The input filemask
- Y <- Turn the screen off
- Dn:[path>]filename[.ext] <- Additional
- Dn:[path>]filename[.ext] <- input filemasks
- (single return here)
-
- Notice that the Y was only supplied once. When LZ is run in this
- manner, it behaves exactly as if you were pressing the keys yourself. If
- you turn the screen off, then you need only enter the Y once. If you
- said N, then you would need an N after every input filemask until you
- said Y. Example:
-
- [Dn:]LZ
- Dn:[path>]filename[.ext] <- The output file
- Dn:[path>]filename[.ext] <- The input filemask
- N <- Leave the screen on
- Dn:[path>]filename[.ext] <- Additional mask
- N <- Leave the screen on
- Dn:[path>]filename[.ext] <- Additional mask
- Y <- Screen off now
- Dn:[path>]filename[.ext] <- Additional masks, but no Y
- Dn:[path>]filename[.ext] <- is necessary
- (single return here)
-
- Getting Them Back
- -----------------
-
- To extract the files from an AlfCrunch file, load DZ.COM. The title
- will be displayed, along with the version number.
-
- The first prompt is for the name of the file to uncrunch. This
- filename may be up to 80 characters long, including subdirectory names.
- Wildcards are not allowed.
-
- The next prompt is the output directory. This is the directory where
- the files will be placed when extracted from the crunch file. If the
- directory does not exist, an attempt will be made to create the
- directory. This may involve creating a number of subdirectories to get
- to the last one, so care should be exercised with this feature. If
- errors occur during the directory build stage, an error message will be
- displayed, and the program will return to DOS. You may specify a wildcard to
-only extract certain files or use '*.*' to extract them all. *.* is the default.
-
- Auto directory creation is only available under SpartaDos. Under
- any other Dos, if you specify a subdirectory, you will probably get
-a single file with the name of the first pathname.
-
- Assuming all is well, you again have the option of turning the screen
- off while files are being extracted.
-
- The program will then extract each file and place it in the output
- directory specified. If any errors occur, an error message is printed
- and the program returns to Dos. When all files have been extracted, you
- will be prompted for another input file. You may enter another filename
- or press Return to exit to Dos.
-
- The situation may arise where the crunch file has been corrupted.
- This may occur due to errors during download, or failure of the disk on
- which the file resides. There are several error messages which are
- associated with bit errors.
-
- Msg: Not An AlfCrunch File!
- ---------------------------
- If this message is issued before any files were extracted, then
- either the first two bytes of the file are corrupt, or else the file was
- not created by AlfCrunch. If the message is issued after several files
- were extracted, then the file has been damaged somewhere in the last
- file extracted. You may also get the message which is described next.
-
- Msg: File Checksum In Error
- ---------------------------
- DZ has detected that the checksum calculated for the filename just
- extracted does not agree with the checksum in the header block. Either
- the header block has been damaged or more likely, the file itself has
- been corrupted. If the file is a text file, it may be partially correct.
- Object file types should be discarded, as it must be assumed they are
- corrupt.
-
- Msg: Stack Overrun
- ------------------
- This is an internal DZ error. The file being processed has been
- corrupted, and DZ has exhausted all free memory in attempting to extract
- the data. The output file produced is incomplete, corrupt, and should be
- discarded.
-
- Msg: Extra Bytes At Eof, Don't Add To File
- ------------------------------------------
- This means that the file has extra data at the end which is not valid.
-This may arise from downloading where the last block is padded. Do not add
-new files to it with LZ as you will not be able to get them back when you run
-DZ again. You will get the 'Not An AlfCrunch File!' message at that time.
-
- DZ and SpartaDos 3.2
- --------------------
- If you are using SpartaDos 3.2, you may invoke DZ.COM and specify
- the input file and output directory on the command line. The format is:
-
- [Dn:]DZ Dn:[path>]filename[.ext] [Dn:[path>][*.*]
-
- The square brackets denote optional parameters which may be omitted
- if you wish. The first filename is the file to be processed. The second
- filename is the directory in which the output files are to be placed.
- Remember, if any of the directories in the output path do not exist, an
- attempt will be made to create them. Remember, you can use a wildcard to
-limit the files or take the default, which is '*.*'.
-
- The program will automatically turn the screen off, and extract
- the files. If any errors occur, the appropriate error message will
- be printed and control will return to Dos.
-
- When DZ is finished with the current input file, it will again prompt
- you for another input file. You may continue uncrunching files, or
- simply press return to exit back to Dos.
-
- As part of a batch file, the form for DZ is almost identical to the
- LZ form. Accordingly, only brief examples will be shown:
-
- [Dn:]DZ Dn:[path>]filename[.ext] [Dn:[path>][*.*]
- Dn:[path>]filename[.ext] <- Second input file
- Dn:[path>][*.*] <- Second output path
- Dn:[path>]filename[.ext] <- Third input file
- Dn:[path>][*.*] <- Third output path
- (single return) <- Return to Dos
-
- The second format is:
-
- [Dn:]DZ Dn:[path>]filename[.ext] <- First input file
- Dn:[path>][*.*] <- First output path
- Dn:[path>]filename[.ext] <- Second input file
- Dn:[path>][*.*] <- Second output path
- Dn:[path>]filename[.ext] <- Third input file
- Dn:[path>][*.*] <- Third output path
- (single return) <- Return to Dos
-
- The third format is:
-
- [Dn:]DZ
- Dn:[path>]filename[.ext] <- First input file
- Dn:[path>][*.*] <- First output path
- Y <- Screen off
- Dn:[path>]filename[.ext] <- Second input file
- Dn:[path>][*.*] <- Second output path
- Dn:[path>]filename[.ext] <- Third input file
- Dn:[path>][*.*] <- Third output path
- (single return) <- Exit to Dos
diff --git a/doc/fileformat.txt b/doc/fileformat.txt
deleted file mode 100644
index 7d87000..0000000
--- a/doc/fileformat.txt
+++ /dev/null
@@ -1,80 +0,0 @@
-ALF Archive Structure
----------------------
-
-An ALF archive is laid out almost exactly like an ARC archive that
-only uses compression types 2 or greater: A 29-byte header for each
-file, followed by the compressed data, followed by either EOF or the
-next file's header.
-
-See the file Arcinfo for the original ARC file format. For ALF files,
-"Byte 2: Compression version" will always be $0F.
-
-Header structure:
-
-Offset | Length | Description
--------+--------+------------------------------------------------------
-0 | 2 | ALF signature bytes: $1A $0F
-2 | 13 | Filename (null-terminated)
-15 | 4 | 32-bit compressed size (little-endian)
-19 | 2 | File date in MS-DOS format (same as ARC)
-21 | 2 | File time in MS-DOS format (same as ARC)
-23 | 2 | Checksum (simple additive, *not* a CRC)
-25 | 4 | 32-bit original size (little-endian)
--------+--------+------------------------------------------------------
-
-The compressed data for the file starts at offset 29.
-
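The table above is enough to walk an ALF archive and list its members without decompressing anything. The following Python sketch does that; it is an illustration under the stated layout, not code from DZ or unalf, and the names `ALF_HEADER` and `list_alf_members` are hypothetical. It assumes every header starts with $1A $0F and that the archive simply ends after the last member's data.

```python
import struct

# Offsets 0..28 from the header table: signature, name, sizes, date, time,
# checksum. All multi-byte fields are little-endian.
ALF_HEADER = struct.Struct("<BB13sIHHHI")  # 29 bytes


def list_alf_members(archive):
    """Return a list of dicts, one per file header found in `archive` (bytes)."""
    members = []
    offset = 0
    while offset < len(archive):
        if len(archive) - offset < ALF_HEADER.size:
            raise ValueError("extra bytes at EOF (offset %d)" % offset)
        (marker, method, raw_name, packed_size, date, time,
         checksum, original_size) = ALF_HEADER.unpack_from(archive, offset)
        if (marker, method) != (0x1A, 0x0F):
            raise ValueError("not an AlfCrunch header at offset %d" % offset)
        data_offset = offset + ALF_HEADER.size
        members.append({
            # Anything after the null terminator is padding.
            "name": raw_name.split(b"\0", 1)[0].decode("ascii"),
            "packed_size": packed_size,
            "original_size": original_size,
            "checksum": checksum,
            "data_offset": data_offset,
        })
        # The next header, if any, follows the compressed data directly.
        offset = data_offset + packed_size
    return members
```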
-The differences are:
-
-- ALF files use $0F for the 'compression type' (offset 1), whereas
- ARC files use compression types 1 through 8.
-
-- ALF always uses the 29-byte header; ARC uses 29-byte headers for
- compression types >= 2, but only 25 bytes for type 1 (stored).
-
-- The actual compressed data is incompatible with any of the
- compression types supported by ARC. Although ALF uses an
-  implementation of Lempel-Ziv, it's not the same implementation
- as any of the ones that ARC uses.
-
-- For ARC, the last file's compressed data is followed by a 0 byte
- (in place of the $1A header), to signal "end of archive". For
- ALF, there's no data after the last byte of the last compressed
- file.
-
-- Because ALF doesn't use a 0 byte to signal end-of-archive, it's
- possible to append two ALF archives together; the result is also
- a valid ALF archive... unless there's "junk at EOF" on the first
- file.
-
-- ARC uses CRC-16 for its checksums; ALF just adds the bytes together
- and uses the low 16 bits of the result as the checksum.
-
-- Not really a file format difference, but the dates stored inside
- ALF files might be wrong or gibberish, if they were created on
- an Atari DOS other than SpartaDOS (or, on SpartaDOS, but without
- the R-Time 8 cartridge).
-
-- ARC and ALF are both limited to 12 character filenames, with a
- null terminator. With ALF, any remaining bytes in the field after
- the null will be set to $20 (ASCII spaces, *not* more nulls).
-
-- Atari filenames with no extensions (e.g. "FOO") are stored with
- a trailing period (e.g. "FOO.") in the ALF header. Upon extraction,
- Atari DOSes will remove the period, so the file will be called
- "FOO" again. I'm not sure whether the ARC for the Atari shares this
- behaviour, but ARC on MS-DOS or Linux doesn't do this.
-
-- ALF files are never embedded inside a self-extracting executable,
- so the first file's header always starts at the first byte of
- the file.
-
-- ARC and ALF both store the compressed and uncompressed file lengths
- as 32-bit unsigned integers... but the Atari can't deal with really
- large files. From examining the disassembled code of UNALF14.COM,
- it looks like the highest byte isn't even looked at, meaning the
- maximum size for a single file is 16MB. I have actually tested the
- Atari ALF and UNALF programs with an emulator (and emulated hard
- drive) with a file of 200KB in size, and it worked fine.
-
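Two of the differences listed above are simple enough to show in code: the additive checksum and the trailing-period filename convention. This is a small Python sketch with hypothetical helper names (`alf_checksum`, `extracted_name`). It assumes the checksum is taken over the uncompressed file contents, which the list above does not state explicitly.

```python
def alf_checksum(data):
    """Add all bytes together and keep the low 16 bits of the sum."""
    return sum(data) & 0xFFFF


def extracted_name(stored_name):
    """Drop the trailing period that is stored for extension-less names."""
    if stored_name.endswith("."):
        return stored_name[:-1]
    return stored_name
```

Unlike a CRC, this checksum does not depend on byte order, so two bytes swapped in transit would go undetected.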
-Author: B. Watson (urchlay@slackware.uk)
diff --git a/doc/interview.txt b/doc/interview.txt
deleted file mode 100644
index e7d375e..0000000
--- a/doc/interview.txt
+++ /dev/null
@@ -1,166 +0,0 @@
-An email interview with Alfred, author of AlfCrunch for the
-Atari 8-bit.
-
-Date: Thu, 20 Nov 2025 12:35:25 -0500
-From: Alfred
-To: B. Watson <urchlay@slackware.uk>
-Subject: Re: UnAlf
-
-On 2025-11-20 12:37 a.m., B. Watson wrote:
-
-> 1. Was AlfCrunch public domain, shareware, or...?
-
-1. AlfCrunch was public domain, although I never did distribute the
-source code, and as I recall nobody ever asked me for it. The programmer
-at ICD, Mike Gustafson did the same as you. He disassembled the DZ and
-added it to their SpartaDos X along with all the ARC routines so they
-could handle almost anything. Bob Puff at CSS did the same, so he could
-add it to his SuperUnArc. He phoned me one night to say his code was
-faster than mine at decompressing an AlfCrunch file. We had a good laugh
-about it.
-
-> 2. Do you have any old disks (or disk images), or paper
-> notes, anything that you used or referred to while developing
-> AlfCrunch? Source would be awesome (if you still have it and are
-> willing to share it). Even just the original distribution disk would
-> be good to have.
-
-2. I didn't distribute it on disk that I can recall, it was either the
-two files posted on my bbs, or perhaps they were Arc'd, I just don't
-recall now. Probably Arc'd because there was a doc file with it.
-
-I've attached the source code for LZ/DZ. This isn't the original which
-was done with Mac/65, it's broken up to use the Six Forks assembler
-which I had just started using for a bit around then.
-
-> 3. Why not ARC compatible compression? You said you ported a PC
-> program you had the source to... was it just not a matter of having
-> the source for ARC? Or did you decide to go for speed rather than
-> compatibility?
-
-3. I didn't have any source code for ARC and I didn't know what the
-various compression algorithms were. I vaguely knew about Huffman from
-work as one of the big software programs used it, but I had no idea how
-it was implemented. I read the LZW paper but I didn't really understand
-it then. Everyone hated Walden's ARC because it was so slow and it was
-bugged, but it was all there was. One day somewhere I ran across some
-guy's implementation of LZW for the pc, and I thought to try porting it
-because it had to be faster than ARC. It was in assembler, so I could
-kind of understand it. I'd seen some of the ARC source but the C code
-was just gibberish to me. It's why my version is so clunky because I was
-doing like you, just porting each x86 instruction to its sort of 6502
-variant. I couldn't make changes to the code because I didn't understand
-what it was doing back then.
-
- After I released the first version someone called me and said their
-Arcviewer didn't work on .alf files, so I quick fixed the header to be
-Arc compatible to the extent you could see what the files were, and
-that's the 1.4 version. So if you run across a 1.2, it's the same except
-for the header. I don't think hardly anyone saw 1.2 except for some
-local people because I released 1.4 so fast.
-
-> 4. Did you ever work on AlfCrunch after the 1.4 release? You mention a
-> couple of possibilities for the next version in your doc file. Did any
-> of that ever materialize (even if unreleased)?
-
-4. I did some work on a LZ 2.0 but I guess I quit on it, I only have
-some source for the LZ part. I must have thought 1.4 was good enough and
-moved on to something else.
-
-> 5. Are you OK with me redistributing the decompression code from UnAlf
-> under the WTFPL license?
->
-> 6. Are you OK with me including your AtariAge handle in my unalf
-> documentation (man page)?
-
-5 & 6. Sure you can distribute whatever you like and mention me in it.
-It's not like there's many people around who would remember me, heh.
-
-LZW is fairly straightforward (now, lol) but it can be a bit hard to get
-the idea from just reading code. The way it works is a single token is
-output that represents a string, hopefully a really long one like:
-
-token = $122= "went to the store and bought" string associated with that
-token. However I think tokens start as 9 bit values, so you actually
-output 9 bits, not just the 8.
-
-So on the compress side, you start with a table of, I think, 8 bit
-tokens, where the value is 0-$FF, which are every possible input value.
-If you were only doing say ASCII text, you could cut it down to maybe 100
-tokens, not sure how many non-printables there are like SOL etc.
-
-Anyway, you start reading in bytes of data. You check a byte against the
-table and if you don't find it, you add the next token value which would
-be $100 and save that byte to that token's entry. Now that can't happen
-with the first token, because it has to be one of the starting 256
-bytes. If you find the token, then you remember it and continue by
-reading the next character. Now you're checking the table to see if
-there's a token for 1st+2nd, which there isn't. So you create a new
-token, $100, and add the 2 byte string as its value, and you output the
-token for the first byte. Now the third byte probably doesn't match the
-first byte, so it'll be the same process. Since there's no string of
-3rd+4th, you'll output the token for the third byte, and add a new token
-that represents those two bytes. Now with a good matching data file,
-like text, you'll probably see 1st+2nd again. So when it sees that first
-byte value, it says ok, I have a token for that, so it keeps reading,
-and it sees the second byte and it goes, I have a token for 1+2 too, so
-then it reads the third byte and now it goes, ok, I don't have a token
-for 1+2+3, so it outputs the token for 1+2 and creates a new token and
-stores the string 1+2+3 as its value.
- So this process just goes on until you run out of data. With a good
-match you'll get longer and longer runs of bytes that match earlier
-strings, so you can get to the point where one token is the same as 40
-characters. That's why LZW is so good. However you run into trouble with
-something like a GIF or JPG because they're all different, you don't get
-runs of bytes. Especially not in JPG because it's already stripped out
-all the runs, which is why JPG files are so small.
-
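The compression loop described above can be sketched in Python as follows. This is textbook LZW as Alfred describes it, not the LZ.COM code: it returns a plain list of integer tokens, uses $100 as the first new token, and leaves out the Clear and End tokens, the table-size limit and the bit packing. The function name `lzw_compress` is illustrative.

```python
def lzw_compress(data):
    """Turn `data` (bytes) into a list of LZW tokens."""
    # Start with one token per possible byte value, 0-$FF.
    table = {bytes([i]): i for i in range(256)}
    next_token = 0x100
    tokens = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Known string: remember it and keep reading.
            current = candidate
        else:
            # Output the token for the longest known string, then add
            # that string plus the new byte as a new token.
            tokens.append(table[current])
            table[candidate] = next_token
            next_token += 1
            current = bytes([byte])
    if current:
        tokens.append(table[current])
    return tokens
```

For the seven-byte input "ABABABA" this produces four tokens: the two single bytes, then $100 ("AB"), then $102 ("ABA").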
-The decompress is similar, it just works backwards. You start with the
-same 256 byte table. You read the first token, and it matches, so you
-output the value (the token is the character initially). Since it
-matched, what you would normally do is take the string that you output
-just before this and concatenate the first letter of this string to the
-last output string and add it to the table as a new token value. Since
-there is no previous string when you read the first byte, you do
-nothing. So even if you didn't know what the starting table was, you
-could rebuild it here, because all the initial tokens will be <$100
-because they didn't match anything longer in the beginning of the
-compression, so eventually you will reconstruct the 256 entry table. You
-short-circuit that part by starting with the known table.
-
-So starting with the second token, you end up creating a bunch of second
-level entries, which are the initial table value+something. As long as
-the next token is in the table, you just keep going outputting strings
-and adding new tokens. Now what happens if you get a token that isn't in
-the table? This is where the math becomes magic, I don't really understand
-the theory, but it works. You know it had to have just been created by
-the compressor. So the string that this new token represents has to at
-least start with the last token to which some new character was added.
-So you take the last string output and concatenate its first character
-to itself. So if the last string was the value "ABCD" you create a new
-token in the table and add "ABCDA" as its value, and you output the
-string "ABCDA". And so on.
-
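The decompression side, including the "token that isn't in the table yet" case, can be sketched like this in Python. As with any generic LZW example, this follows the description above and is not the DZ.COM code: it takes a list of integer tokens, starts new tokens at $100, and ignores Clear/End tokens and table limits. The function name `lzw_decompress` is illustrative.

```python
def lzw_decompress(tokens):
    """Rebuild the original bytes from a list of LZW tokens."""
    table = {i: bytes([i]) for i in range(256)}
    next_token = 0x100
    out = bytearray()
    previous = None
    for token in tokens:
        if token in table:
            entry = table[token]
        elif token == next_token and previous is not None:
            # The compressor created this token one step ago, so its string
            # must be the last output plus that output's first character.
            entry = previous + previous[:1]
        else:
            raise ValueError("corrupt token stream: %d" % token)
        out += entry
        if previous is not None:
            # New table entry: last output + first byte of this output.
            table[next_token] = previous + entry[:1]
            next_token += 1
        previous = entry
    return bytes(out)
```

Feeding it the tokens 65, 66, $100, $102 exercises the special case on the last token and yields "ABABABA".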
-Now you start with 9 bit tokens I think. At some point on the compress
-side, and on the decompress, when you go to add a new token and it's
-going to take more than 9 bits, you up the bitsize to 10, which also
-changes the highest token value you can have, which I think is what the
-MAXCDE value represents. Because of the limited memory, I think I send
-the clear code at the end of filling a 12 bit table, and start over with
-9. Fortunately on the Atari you don't have giant files, so it doesn't
-reset too often.
-
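The growing token width can be illustrated with a small pack/unpack pair in Python. Everything about the layout here is an assumption made for the sake of a runnable sketch: tokens are packed least-significant-bit first, the width is derived from how many tokens have been written so far (the table gains one entry per token), and there is no Clear token, so the sketch simply stops at 12 bits. The real LZ/DZ bit order and width-change timing may differ, and the names `token_width`, `pack_tokens` and `unpack_tokens` are hypothetical.

```python
def token_width(index, start_bits=9, max_bits=12):
    """Bits needed for the index-th token (0-based) of a stream."""
    # Before token `index` is written the table holds tokens 0..255+index-1,
    # and the decoder may also see the one about to be created.
    largest = 255 + index
    width = max(start_bits, largest.bit_length())
    if width > max_bits:
        raise ValueError("table full; a real codec would send Clear here")
    return width


def pack_tokens(tokens):
    """Concatenate variable-width tokens into bytes, LSB first."""
    acc, nbits, out = 0, 0, bytearray()
    for index, token in enumerate(tokens):
        acc |= token << nbits
        nbits += token_width(index)
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial byte, zero padded
    return bytes(out)


def unpack_tokens(data, count):
    """Read `count` variable-width tokens back out of `data`."""
    acc, nbits, pos, tokens = 0, 0, 0, []
    for index in range(count):
        width = token_width(index)
        while nbits < width:
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        tokens.append(acc & ((1 << width) - 1))
        acc >>= width
        nbits -= width
    return tokens
```

In this scheme the first 257 tokens take 9 bits each, and the width steps up to 10 bits once a token value of $200 becomes possible.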
- There are a couple of special tokens, Clear which when you see it you
-clear the table, and the End token which tells you that it's the last
-token of the data.
-
-A lot of the code in LZ/DZ is the bit twiddling to concatenate the
-varying bitsize tokens in memory. I can't do that sort of math in my
-head, so it's a lot brute force shifting bits left and right to align a
-new token to the last one in memory. The other thing I didn't understand
-is I don't think the code actually stores every full string, maybe it
-does, but at the time I thought the guy was using some scheme whereby he
-was only storing one character with each new token and it somehow was
-chained to the previous string.
-
-That's about all I can tell you about it.
diff --git a/doc/review.txt b/doc/review.txt
deleted file mode 100644
index f56e4c3..0000000
--- a/doc/review.txt
+++ /dev/null
@@ -1,44 +0,0 @@
-The following review was published in the Atari H.A.C.K. magazine,
-in the August 1988 issue (Volume II, Issue IIX) [1]:
-
----------------------------------------------------------------------
-Those of us who are experienced telecommunicators are quite familiar
-with the ARC family of disk file compression programs. The most
-widely used of the 8-bit versions of the ARC program has been,
-and remains to be, ARC version 1.2 (the archiver) and ARCX version
-1.2 (the dearchiver). Two very excellent programs written in C by
-Ralph Walden of the Atari Computer Enthusiasts of Eugene, aka ACE.
-Almost every BBS worth its salt uses this program to compress its
-files not only to make them take up less space, but also to save time
-on file transfers. A smaller program simply takes less time to send
-or receive. Of course, since the file is compressed, or archived, it
-isn't runnable until it's dearced with the ARCX program.
-
-ARC and ARCX are great programs but they have their small problems.
-They are slow and sometimes show unexplainable CRC errors when
-dearcing. This frustrates and detracts from what is otherwise a great
-program. There was none better, that is, until now.
-
-ALFCRUNCH is here. Despite its cute name it has nothing to do with
-the furry wise guy from the planet Melmac. ALFCRUNCH consists of two
-programs, LZ.COM, the archiver, and DZ.COM, the dearchiver. Files are
-manipulated the same way as the ARC programs do it but they are not
-compatible. The LZ program compresses programs slightly more than does
-ARC.COM, or anywhere from a few percent to almost 70%, all depending
-on file type and save method used. The DZ program works as claimed-
-there isn't much to say except that it works. All of this sounds good
-but so what? Why change for a few percent?
-
-The reason to change is speed. ALF programs are at least 10 times
-faster than the ARC programs. Sometimes they are even quicker!
-Programs which may have taken several minutes to process are done
-in seconds with ALF. In fact the first time I tried ALF I thought it
-didn't work... but it does! Reason enough to change? Not yet? Well,
-ALF is free. Get it from your club PD library or download it from
-SLOWPOKE! [2]
----------------------------------------------------------------------
-
-[1] The full issue of HACK can be found here:
- https://archive.org/details/AtariHACKNewsAugust1988
-
-[2] SLOWPOKE was an Atari BBS in the Salem/Portland, Oregon area.