diff options
Diffstat (limited to 'doc/Arcinfo')
| -rw-r--r-- | doc/Arcinfo | 124 |
1 files changed, 0 insertions, 124 deletions
diff --git a/doc/Arcinfo b/doc/Arcinfo deleted file mode 100644 index 6c9d500..0000000 --- a/doc/Arcinfo +++ /dev/null @@ -1,124 +0,0 @@ - -ARC-FILE.INF, created by Keith Petersen, W8SDZ, 21-Sep-86, extracted -from UNARC.INF by Robert A. Freed. - -From: Robert A. Freed -Subject: Technical Information for ARC files -Date: June 24, 1986 - -Note: In the following discussion, UNARC refers to my CP/M-80 program -for extracting files from MSDOS ARCs. The definitions of the ARC file -format are based on MSDOS ARC512.EXE. - -ARCHIVE FILE FORMAT -------------------- - -Component files are stored sequentially within an archive. Each entry -is preceded by a 29-byte header, which contains the directory -information. There is no wasted space between entries. (This is in -contrast to the centralized directory used by Novosielski libraries. -Although random access to subfiles within an archive can be noticeably -slower than with libraries, archives do have the advantage of not -requiring pre-allocation of directory space.) - -Archive entries are normally maintained in sorted name order. The -format of the 29-byte archive header is as follows: - -Byte 1: 1A Hex. - This marks the start of an archive header. If this byte is not found - when expected, UNARC will scan forward in the file (up to 64K bytes) - in an attempt to find it (followed by a valid compression version). - If a valid header is found in this manner, a warning message is - issued and archive file processing continues. Otherwise, the file is - assumed to be an invalid archive and processing is aborted. (This is - compatible with MS-DOS ARC version 5.12). Note that a special - exception is made at the beginning of an archive file, to accomodate - "self-unpacking" archives (see below). - -Byte 2: Compression version, as follows: - - 0 = end of file marker (remaining bytes not present) - 1 = unpacked (obsolete) - 2 = unpacked - 3 = packed - 4 = squeezed (after packing) - 5 = crunched (obsolete) - 6 = crunched (after packing) (obsolete) - 7 = crunched (after packing, using faster hash algorithm) (obsolete) - 8 = crunched (after packing, using dynamic LZW variations) - -Bytes 3-15: ASCII file name, nul-terminated. - -(All of the following numeric values are stored low-byte first.) - -Bytes 16-19: Compressed file size in bytes. - -Bytes 20-21: File date, in 16-bit MS-DOS format: - Bits 15:9 = year - 1980 - Bits 8:5 = month of year - Bits 4:0 = day of month - (All zero means no date.) - -Bytes 22-23: File time, in 16-bit MS-DOS format: - Bits 15:11 = hour (24-hour clock) - Bits 10:5 = minute - Bits 4:0 = second/2 (not displayed by UNARC) - -Bytes 24-25: Cyclic redundancy check (CRC) value (see below). - -Bytes 26-29: Original (uncompressed) file length in bytes. - (This field is not present for version 1 entries, byte 2 = 1. - I.e., in this case the header is only 25 bytes long. Because - version 1 files are uncompressed, the value normally found in - this field may be obtained from bytes 16-19.) - - -SELF-UNPACKING ARCHIVES ------------------------ - -A "self-unpacking" archive is one which can be renamed to a .COM file -and executed as a program. An example of such a file is the MS-DOS -program ARC512.COM, which is a standard archive file preceded by a -three-byte jump instruction. The first entry in this file is a simple -"bootstrap" program in uncompressed form, which loads the subfile -ARC.EXE (also uncompressed) into memory and passes control to it. In -anticipation of a similar scheme for future distribution of UNARC, the -program permits up to three bytes to precede the first header in an -archive file (with no error message). - - -CRC COMPUTATION ---------------- - -Archive files use a 16-bit cyclic redundancy check (CRC) for error -control. The particular CRC polynomial used is x^16 + x^15 + x^2 + 1, -which is commonly known as "CRC-16" and is used in many data -transmission protocols (e.g. DEC DDCMP and IBM BSC), as well as by -most floppy disk controllers. Note that this differs from the CCITT -polynomial (x^16 + x^12 + x^5 + 1), which is used by the XMODEM-CRC -protocol and the public domain CHEK program (although these do not -adhere strictly to the CCITT standard). The MS-DOS ARC program does -perform a mathematically sound and accurate CRC calculation. (We -mention this because it contrasts with some unfortunately popular -public domain programs we have witnessed, which from time immemorial -have based their calculation on an obscure magazine article which -contained a typographical error!) - -Additional note (while we are on the subject of CRC's): The validity -of using a 16-bit CRC for checking an entire file is somewhat -questionable. Many people quote the statistics related to these -functions (e.g. "all two-bit errors, all single burst errors of 16 or -fewer bits, 99.997% of all single 17-bit burst errors, etc."), without -realizing that these claims are valid only if the total number of bits -checked is less than 32767 (which is why they are used in small-packet -data transmission protocols). I.e., for file sizes in excess of about -4K bytes, a 16-bit CRC is not really as good as what is often claimed. -This is not to say that it is bad, but there are more reliable methods -available (e.g. the 32-bit AUTODIN-II polynomial). (End of lecture!) - - Bob Freed - 62 Miller Road - Newton Centre, MA 02159 - Telephone (617) 332-3533 - - |
