author | B. Watson <urchlay@slackware.uk> | 2024-05-17 05:09:45 -0400 |
---|---|---|
committer | B. Watson <urchlay@slackware.uk> | 2024-05-17 05:09:45 -0400 |
commit | 96af9bc891987f6fcc560a6e403c5ada541d8699 (patch) | |
tree | 6bdd20a1fdd7f31316d14fb5e233718b48713522 /unprotbas.rst | |
parent | d4064b55a7ddbb002ef80dbc0db60cd0d95cb1cd (diff) | |
download | bw-atari8-tools-96af9bc891987f6fcc560a6e403c5ada541d8699.tar.gz |
unprotbas: added; blob2xex: tweak docs.
Diffstat (limited to 'unprotbas.rst')
-rw-r--r-- | unprotbas.rst | 156 |
1 files changed, 156 insertions, 0 deletions
diff --git a/unprotbas.rst b/unprotbas.rst
new file mode 100644
index 0000000..735681e
--- /dev/null
+++ b/unprotbas.rst
@@ -0,0 +1,156 @@
+=========
+unprotbas
+=========
+
+---------------------------------------------------
+Unprotect LIST-protected Atari 8-bit BASIC programs
+---------------------------------------------------
+
+.. include:: manhdr.rst
+
+SYNOPSIS
+========
+
+unprotbas [**-v**] [**-f**] [**-n**] [**-g**] **input-file** **output-file**
+
+DESCRIPTION
+===========
+
+**unprotbas** modifies LIST-protected Atari 8-bit BASIC programs,
+creating a new non-protected copy. See **DETAILS**, below, to
+understand how the protection and unprotection works.
+
+**input-file** must be a tokenized Atari BASIC program. Use *-* to
+read from standard input.
+
+**output-file** will be the unprotected tokenized BASIC program. If it
+already exists, it will be overwritten. Use *-* to write to standard
+output, but **[TODO]** **unprotbas** will refuse to write to standard
+output if it's a terminal (since tokenized BASIC is binary data and
+may confuse the terminal).
+
+OPTIONS
+=======
+
+**-v**
+  Verbose operation.
+
+**-f**
+  Force the variable name table to be rebuilt, even if it looks OK.
+
+**-n**
+  Don't rebuild the variable table (only fix the line pointers, if
+  needed).
+
+**-g**
+  Remove any "garbage" data from the end of the file. By default,
+  it's left as-is, in case it's actually data used by the program.
+
+EXIT STATUS
+===========
+
+Exit status is zero for success, non-zero for failure.
+
+DETAILS
+=======
+
+In the Atari BASIC world, it's possible to create a SAVEd (tokenized)
+program that can be RUN from disk (**RUN "D:FILE.BAS"**) but if
+it's LOADed, it will either crash the BASIC interpreter, or LIST
+as gibberish. This is known as LIST-protection. Such programs are
+generally released to the world in protected form; the author
+privately keeps an unprotected copy so he can modify it. In
+later days, collections such as the Holmes Archive contain many
+LIST-protected programs, for which the unprotected version was never
+released.
+
+One example of LIST-protection, taken from *Mapping the Atari* (the
+**STMCUR** entry in the memory map) looks like::
+
+  32000 FOR VARI=PEEK(130)+PEEK(131)*256 TO PEEK(132)+PEEK(133)*256:POKE VARI,155:NEXT VARI
+  32100 POKE PEEK(138)+PEEK(139)*256+2,0:SAVE "D:filename":NEW
+
+To use, add the 2 lines of code to your program, then execute them
+with **GOTO 32000** in immediate mode.
+
+This illustrates both types of protection, which can be (and usually
+are) applied to the same program:
+
+Variable name table scrambling
+  BASIC has specific rules on what are and aren't considered legal
+  variable names, which are enforced by the tokenization process,
+  at program entry time. However, it doesn't use the variable names
+  at runtime, when the tokenized file is interpreted.
+
+  Replacing the variable names with binary gibberish will render the
+  program LIST-proof, either replacing every variable name with the
+  same control character, or causing LIST to display a long string of
+  binary garbage for each variable name... but the program will still
+  RUN correctly. Note that the original variable names are *gone*,
+  and cannot be recovered.
+
+  Line 32000 in the example above does this job, replacing every
+  variable name with the EOL character (155).
+
+  **unprotbas** detects a scrambled variable name table, and builds
+  a new one that's valid. However, since there are no real variable
+  names in the program, the recovery process just invents new ones,
+  named A through Z, A1 through A9, B1 through B9, etc, etc. It'll
+  require human intelligence to figure out what each variable is for,
+  since the names are meaningless.
+
+  The **output-file** may be larger than the **input-file** was, since
+  some types of variable-name scrambling shrink the variable name
+  table to the minimum size (one byte per name); the rebuilt table
+  will be larger.
+
+Bad next-line pointer
+  Generally, this is done with line number 32768. Yes, this line
+  number is outside the range BASIC accepts... but BASIC uses it
+  internally for immediate-mode commands. And when SAVE or CSAVE are
+  executed, this line gets saved, too.
+
+  Every line of tokenized BASIC contains a line length byte, which
+  BASIC uses as a pointer to the next line of code. Before printing
+  the READY prompt, BASIC iterates over every line of code in the
+  program, using the next-line pointers, in order to delete any
+  existing line 32768 (the previous immediate mode command). If any
+  line's pointer is set to zero, that means it points to itself.
+
+  When BASIC tries to traverse a line of code that points to itself as
+  "next" line, it will get stuck in an infinite loop. This not only
+  prevents LIST, it actually prevents any immediate mode command:
+  after LOADing such a file, *nothing* will work (even pressing RESET
+  won't get you out of it). The only way to use such a program is to
+  use the RUN command with a filename, and if the program ever exits
+  (due to END, STOP, an error, or the Break key), BASIC will get stuck
+  again.
+
+  This doesn't *have* to be done with line 32768. Any line of code
+  that doesn't have to be traversed at runtime would work (in other
+  words, a regular line whose line number is higher than any code that
+  ever gets executed, usually the last line in the file).
+
+  Line 32100 in the example above does this job, taking advantage of
+  the STMCUR pointer used by BASIC, which holds the address of the
+  line of tokenized code currently being executed.
+
+  **unprotbas** fixes this simply by calculating what the pointer
+  should be (based on the tokens in the line) and changing it. No
+  information is lost by doing this.
+
+One more thing **unprotbas** can do is remove extra data from the end
+of the file. It's possible for BASIC files to contain extra data that
+occurs after the end of the program. Some programs use this as a way
+to load arbitrary binary data into memory along with the program; for
+other programs, the extra data is truly garbage (e.g. an EOF character
+if the file came from a CP/M system, or padding to a block size if a
+dumb implementation of XMODEM was used to transfer the file).
+
+Normally, such "garbage" doesn't hurt anything. BASIC ignores it. Or
+it normally does... if you suspect it's causing a problem, you can
+remove it with the **-g** option. If removing the "garbage" causes the
+program to fail to run, it wasn't garbage! **unprotbas** doesn't
+remove extra data by default, to be on the safe side.
+
+.. include:: manftr.rst