=========
unprotbas
=========

---------------------------------------------------
Unprotect LIST-protected Atari 8-bit BASIC programs
---------------------------------------------------

.. include:: manhdr.rst

SYNOPSIS
========

unprotbas [**-v**] [**-f**] [**-n**] [**-g**] [**-c**] **input-file** **output-file**

DESCRIPTION
===========

**unprotbas** modifies LIST-protected Atari 8-bit BASIC programs,
creating a new non-protected copy. See **DETAILS**, below, to
understand how the protection and unprotection work.

**input-file** must be a tokenized Atari BASIC program. Use *-* to
read from standard input.

**output-file** will be the unprotected tokenized BASIC program. If it
already exists, it will be overwritten. Use *-* to write to standard
output, but **unprotbas** will refuse to write to standard output if
it's a terminal (since tokenized BASIC is binary data and may confuse
the terminal).

OPTIONS
=======

Option bundling is not supported; use e.g. **-v -f**, not **-vf**.

To use filenames beginning with *-*, write them as *./-file*, or they
will be treated as options.

**-v**
  Verbose operation. TODO: it's always verbose right now...

**-f**
  Force the variable name table to be rebuilt, even if it looks OK.

**-n**
  Don't rebuild the variable table (only fix the line pointers, if needed).

**-g**
  Remove any "garbage" data from the end of the file. By default,
  it's left as-is, in case it's actually data used by the program.

**-c**
  Check only. Does a dry run. Loads the program, unprotects it in
  memory, but doesn't write the result anywhere. In this mode, there
  is no **output-file**.

EXIT STATUS
===========

0
  **input-file** was protected, unprotection was successful.

1
  I/O error, or **input-file** isn't a valid BASIC program.

2
  **input-file** is already an unprotected BASIC program.

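The exit codes listed under EXIT STATUS lend themselves to scripting.
The following Python sketch is not part of **unprotbas**; the names
``EXIT_MEANINGS`` and ``describe_exit_status`` are made up for
illustration. It maps each documented code to its meaning, as a
wrapper script might do after running the tool:

```python
# Meanings of the exit codes documented under EXIT STATUS.
# Any other code is treated as undocumented (an assumption of this sketch).
EXIT_MEANINGS = {
    0: "input was protected; unprotection succeeded",
    1: "I/O error, or input is not a valid BASIC program",
    2: "input is already an unprotected BASIC program",
}


def describe_exit_status(code):
    """Return a human-readable meaning for an unprotbas exit code."""
    return EXIT_MEANINGS.get(code, "undocumented exit status %d" % code)
```

A caller could pass in the ``returncode`` attribute of the result of
``subprocess.run()``, for instance to skip files that report status 2.
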
DETAILS
=======

In the Atari BASIC world, it's possible to create a SAVEd (tokenized)
program that can be RUN from disk (**RUN "D:FILE.BAS"**) but if it's
LOADed, it will either crash the BASIC interpreter, or LIST as
gibberish. This is known as LIST-protection.

Such programs are generally released to the world in protected form;
the author privately keeps an unprotected copy so he can modify it. In
later days, collections such as the Holmes Archive contain many
LIST-protected programs, for which the unprotected version was never
released.

One example of LIST-protection, taken from *Mapping the Atari* (the
**STMCUR** entry in the memory map) looks like::

  32000 FOR VARI=PEEK(130)+PEEK(131)*256 TO PEEK(132)+PEEK(133)*256:POKE VARI,155:NEXT VARI
  32100 POKE PEEK(138)+PEEK(139)*256+2,0:SAVE "D:filename":NEW

To use, add the 2 lines of code to your program, then execute them
with **GOTO 32000** in immediate mode.

This illustrates both types of protection, which can be (and usually
are) applied to the same program:

Variable name table scrambling
  BASIC has specific rules on what are and aren't considered legal
  variable names, which are enforced by the tokenization process, at
  program entry time. However, it doesn't use the variable names at
  runtime, when the tokenized file is interpreted.

  Replacing the variable names with binary gibberish will render the
  program LIST-proof, either replacing every variable name with the
  same control character, or causing LIST to display a long string of
  binary garbage for each variable name... but the program will still
  RUN correctly. Note that the original variable names are *gone*, and
  cannot be recovered.

  Line 32000 in the example above does this job, replacing every
  variable name with the EOL character (155).

  **unprotbas** detects a scrambled variable name table, and builds a
  new one that's valid. However, since there are no real variable
  names in the program, the recovery process just invents new ones,
  named A through Z, A1 through A9, B1 through B9, and so on. It'll
  require human intelligence to figure out what each variable is for,
  since the names are meaningless.

  The **output-file** may not be the exact size that the
  **input-file** was. Some types of variable-name scrambling shrink
  the variable name table to the minimum size (one byte per name), so
  the rebuilt table will be larger. Other types of scrambling leave
  the variable name table at its original size, but **unprotbas**
  generates only one- and two-character variable names, so the rebuilt
  table might be smaller.

Bad next-line pointer
  Generally, this is done with line number 32768. Yes, this line
  number is outside the range BASIC accepts... but BASIC uses it
  internally for immediate-mode commands. And when SAVE or CSAVE is
  executed, this line gets saved, too.

  Every line of tokenized BASIC contains a line length byte, which
  BASIC uses as a pointer to the next line of code. Before printing
  the READY prompt, BASIC iterates over every line of code in the
  program, using the next-line pointers, in order to delete any
  existing line 32768 (the previous immediate mode command).

  If any line's pointer is set to zero, that means it points to
  itself. When BASIC tries to traverse a line of code that points to
  itself as the "next" line, it will get stuck in an infinite loop.
  This not only prevents LIST, it actually prevents any immediate mode
  command: after LOADing such a file, *nothing* will work (even
  pressing RESET won't get you out of it). The only way to use such a
  program is to use the RUN command with a filename, and if the
  program ever exits (due to END, STOP, an error, or the Break key),
  BASIC will get stuck again.

  This doesn't *have* to be done with line 32768. Any line of code
  that doesn't have to be traversed at runtime would work (in other
  words, a regular line whose line number is higher than any code that
  ever gets executed, usually the last line in the file).

  Line 32100 in the example above does this job, taking advantage of
  the STMCUR pointer used by BASIC, which holds the address of the
  line of tokenized code currently being executed.

  **unprotbas** fixes this simply by calculating what the pointer
  should be (based on the tokens in the line) and changing it. No
  information is lost by doing this.

One more thing **unprotbas** can do is remove extra data from the end
of the file. It's possible for BASIC files to contain extra data that
occurs after the end of the program. Some programs use this as a way
to load arbitrary binary data into memory along with the program; for
other programs, the extra data is truly garbage (e.g. an EOF character
if the file came from a CP/M system, or padding to a block size if a
dumb implementation of XMODEM was used to transfer the file).

Normally, such "garbage" doesn't hurt anything. BASIC ignores it. Or
it normally does... if you suspect it's causing a problem, you can
remove it with the **-g** option. If removing the "garbage" causes the
program to fail to run, it wasn't garbage! **unprotbas** doesn't
remove extra data by default, to be on the safe side.

.. include:: manftr.rst
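
NOTES
=====

The two ideas described under **DETAILS** can be illustrated with a
short Python sketch. This is not the code **unprotbas** uses, and the
function names ``invented_names`` and ``find_stuck_line`` are
hypothetical. The first function produces replacement variable names
in the order given above (A through Z, then A1 through A9, B1 through
B9, ...); it stops at Z9 because the text doesn't say what follows.
The second walks a deliberately simplified program image, in which
each line is assumed to be a 2-byte little-endian line number followed
by a 1-byte length, and reports a line whose length byte is zero, i.e.
one that points to itself:

```python
import string


def invented_names(count):
    """Return `count` names in the order A..Z, A1..A9, B1..B9, ..., Z9."""
    letters = string.ascii_uppercase
    names = list(letters)
    names += [letter + str(digit) for letter in letters for digit in range(1, 10)]
    if count > len(names):
        # What comes after Z9 is not specified, so this sketch gives up.
        raise ValueError("this sketch only knows %d names" % len(names))
    return names[:count]


def find_stuck_line(program):
    """Return the offset of the first self-pointing line, or None.

    Assumed toy layout: bytes 0-1 of each line hold the line number,
    byte 2 holds the length, which is added to the current offset to
    reach the next line.
    """
    offset = 0
    while offset + 3 <= len(program):
        length = program[offset + 2]
        if length == 0:
            return offset  # the next-line pointer leads back to this line
        offset += length
    return None
```

For example, ``invented_names(28)`` ends with ``A1`` and ``A2``. The
repair itself, recomputing the correct length from the tokens in the
line, is not shown, since it depends on token-level details that this
page doesn't cover.
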