=========
unprotbas
=========

-------------------------------------------------------------
Unprotect or create LIST-protected Atari 8-bit BASIC programs
-------------------------------------------------------------

.. include:: manhdr.rst

SYNOPSIS
========

unprotbas [**-v**] [ [**-f**] [**-n**] [**-g**] [**-c**] [**-r** | **-w**] ] | [ [**-p** | **-pc** | **-pv**] ] **input-file** **output-file**

DESCRIPTION
===========

**unprotbas** modifies a LIST-protected Atari 8-bit BASIC program,
creating a new non-protected copy. See **DETAILS**, below, to
understand how the protection and unprotection work.

It's also capable of LIST-protecting an unprotected program.

**input-file** must be a tokenized (SAVEd) Atari BASIC program. Use
*-* to read from standard input.

**output-file** will be the unprotected tokenized BASIC program. If it
already exists, it will be overwritten. Use *-* to write to standard
output, but **unprotbas** will refuse to write to standard output if
it's a terminal (since tokenized BASIC is binary data and may confuse
the terminal).

OPTIONS
=======

Option bundling is not supported; use e.g. **-v -f**, not **-vf**. To
use filenames beginning with *-*, write them as *./-file*, or they
will be treated as options.

**-v**
  Verbose operation.

**-f**
  Force the variable name table to be rebuilt, even if it looks OK.
  This option cannot be combined with **-n**.

**-n**
  Don't rebuild the variable table (only fix the line pointers, if
  needed). This option cannot be combined with **-f**.

**-g**
  Remove any "garbage" data from the end of the file. By default, it's
  left as-is, in case it's actually data used by the program.

**-c**
  Check only. Does a dry run. Loads the program, unprotects it in
  memory, but doesn't write the result anywhere. In this mode, there
  is no **output-file**.

**-w**
  Write the variable names to **varnames.txt**, one per line. This can
  be edited, and later used with **-r** to set the variable names to
  something sensible rather than A, B, C, etc.
  For an unprotected program, you can use **-n** to write the existing
  names rather than generating new ones. See **VARIABLE NAMES**, below.

**-r**
  Read variable names from **varnames.txt**, and use them instead of
  generating the names. See **VARIABLE NAMES**, below.

**-p**, **-pc**, **-pv**
  LIST-protect the program, rather than unprotecting it. **-pc** sets
  an invalid (0) next-line pointer on the last line of code. **-pv**
  replaces the variable names with the Atari EOL character
  (**$9B**). **-p** does both. None of the other options except **-v**
  (verbose) can be used with these.

EXIT STATUS
===========

0
  **input-file** was protected, unprotection was successful.

1
  I/O error, or **input-file** isn't a valid BASIC program.

2
  **input-file** is already an unprotected BASIC program.

DETAILS
=======

In the Atari BASIC world, it's possible to create a SAVEd (tokenized)
program that can be RUN from disk (**RUN "D:FILE.BAS"**) but if it's
LOADed, it will either crash the BASIC interpreter, or LIST as
gibberish. This is known as LIST-protection.

Such programs are generally released to the world in protected form;
the author privately keeps an unprotected copy so he can modify it. In
later days, collections such as the Holmes Archive contain many
LIST-protected programs, for which the unprotected version was never
released.

One example of LIST-protection, taken from *Mapping the Atari* (the
**STMCUR** entry in the memory map) looks like::

  32000 FOR VARI=PEEK(130)+PEEK(131)*256 TO PEEK(132)+PEEK(133)*256:POKE VARI,155:NEXT VARI
  32100 POKE PEEK(138)+PEEK(139)*256+2,0:SAVE "D:filename":NEW

To use, add the 2 lines of code to your program, then execute them
with **GOTO 32000** in immediate mode.

This illustrates both types of protection, which can be (and usually
are) applied to the same program:

Variable name table scrambling
  BASIC has specific rules on what are and aren't considered legal
  variable names, which are enforced by the tokenization process, at
  program entry time.
  However, it doesn't use the variable names at runtime, when the
  tokenized file is interpreted. Replacing the variable names with
  binary gibberish will render the program LIST-proof, either
  replacing every variable name with the same control character, or
  causing LIST to display a long string of binary garbage for each
  variable name... but the program will still RUN correctly.

  Note that the original variable names are *gone*, and cannot be
  recovered.

  Line 32000 in the example above does this job, replacing every
  variable name with the EOL character (155).

  **unprotbas** detects a scrambled variable name table, and builds a
  new one that's valid. However, since there are no real variable
  names in the program, the recovery process just invents new ones,
  named A through Z, A1 through A9, B1 through B9, etc, etc. It'll
  require human intelligence to figure out what each variable is for,
  since the names are meaningless.

  The **output-file** may not be the exact size that the
  **input-file** was. Some types of variable-name scrambling shrink
  the variable name table to the minimum size (one byte per name), so
  the rebuilt table will be larger. Other types of scrambling leave
  the variable name table at its original size, but **unprotbas**
  generates only one- and two-character variable names, so the rebuilt
  table might be smaller.

  The program **PROTECT.BAS**, found on Disk 2 of the Holmes Archive,
  creates protected BASIC programs that only use variable name
  scrambling.

Bad next-line pointer
  Every line of tokenized BASIC contains a line length byte, which
  BASIC uses as a pointer to the next line of code. Before printing
  the READY prompt, BASIC iterates over every line of code in the
  program, using the next-line pointers, in order to delete any
  existing line 32768 (the previous immediate mode command).

  If any line's pointer is set to zero, that means it points to itself.
  When BASIC tries to traverse a line of code that points to itself as
  "next" line, it will get stuck in an infinite loop.

  This not only prevents LIST, it actually prevents any immediate mode
  command: after LOADing such a file, *nothing* will work (even
  pressing RESET won't get you out of it). The only way to use such a
  program is to use the RUN command with a filename, and if the
  program ever exits (due to END, STOP, an error, or the Break key),
  BASIC will get stuck again.

  This doesn't *have* to be done with the last line in the program.
  The "poisoned" line could be followed by more lines of code, though
  they could never actually execute.

  Line 32100 in the example above does this job, taking advantage of
  the STMCUR pointer used by BASIC, which holds the address of the
  line of tokenized code currently being executed.

  **unprotbas** fixes this simply by calculating what the pointer
  should be (based on the tokens in the line) and changing it. No
  information is lost by doing this.

  The program **UNPROTEC**, from the *Pirate's Treasure Chest*, can
  fix bad pointers in protected programs, though it doesn't do
  anything about variable name scrambling.

One more thing **unprotbas** can do is remove extra data from the end
of the file. It's possible for BASIC files to contain extra data that
occurs after the end of the program. Some programs use this as a way
to load arbitrary binary data into memory along with the program; for
other programs, the extra data is truly garbage (e.g. an EOF character
if the file came from a CP/M system, or padding to a block size if a
dumb implementation of XMODEM was used to transfer the file).

Normally, such "garbage" doesn't hurt anything. BASIC ignores it. Or
it normally does... if you suspect it's causing a problem, you can
remove it with the **-g** option. If removing the "garbage" causes the
program to fail to run, it wasn't garbage! **unprotbas** doesn't
remove extra data by default, to be on the safe side.

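
As a rough illustration of the bad next-line pointer described above,
the following Python sketch walks a block of tokenized lines and
reports the first one whose offset byte is zero. It is not taken from
the **unprotbas** source. It assumes that each line starts with a
two-byte little-endian line number followed by the offset byte (the
same *+2* position that line 32100 POKEs), and that the buffer holds
only the tokenized lines, not the rest of the SAVEd file. The name
``find_poisoned_line`` and the sample bytes are made up for the
example; the payload bytes are placeholders, not real tokens.

```python
def find_poisoned_line(code: bytes):
    """Return (offset, line_number) of the first line whose next-line
    offset byte is zero, or None if the walk reaches the end cleanly.

    Assumed layout per line (an assumption of this sketch):
      byte 0-1  line number, little-endian
      byte 2    offset from the start of this line to the next line
    """
    pos = 0
    while pos + 3 <= len(code):
        line_number = code[pos] | (code[pos + 1] << 8)
        next_offset = code[pos + 2]
        if next_offset == 0:
            # The line "points to itself": following it never advances.
            return pos, line_number
        pos += next_offset
    return None


# Two fake lines (10 and 20); 0xAA/0xBB/0xCC stand in for token bytes.
healthy = bytes([10, 0, 5, 0xAA, 0xBB, 20, 0, 4, 0xCC])
poisoned = bytes([10, 0, 5, 0xAA, 0xBB, 20, 0, 0, 0xCC])

print(find_poisoned_line(healthy))
print(find_poisoned_line(poisoned))
```

The sketch only detects the problem. Repairing it means working out
the correct offset from the tokens in the line, which is not
attempted here.
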

VARIABLE NAMES
==============

If variable name scrambling was used, the original variable names no
longer exist. **unprotbas** will generate them, according to these
rules:

The first 26 numeric variables will be called *A* through *Z*. Further
numeric variables will be *A1* through *A9*, *B1* through *B9*, etc.

The first 26 string variables will be *A$* to *Z$*, then *A1$* to
*A9$*, *B1$* to *B9$*, etc.

The first 26 array variables will be *A(* to *Z(*, then *A1(* to
*A9(*, *B1(* to *B9(*, etc.

To properly reverse-engineer the protected program, it's necessary to
assign meaningful variable names. **unprotbas** isn't smart enough to
do this for you, but it can semi-automate the process.

First, run **unprotbas** with the **-w** option. This will create a
file called **varnames.txt**, containing the generated variable
names. These are in order, one line per variable name, ending with
*$* for strings and *(* for arrays.

Load the unprotected program on the Atari and LIST it (or use
**chkbas** to get a listing), and edit **varnames.txt** in a text
editor. As you figure out what each variable's purpose is, change its
name in the text file. When editing the file:

- Don't add or delete any lines.

- Don't get rid of the *$* or *(* at the end of any line.

- You may enter the names in lowercase (**unprotbas** will convert
  them to uppercase).

- Remember to follow the rules for BASIC variable names: the first
  character must be a letter, the other characters must be letters or
  digits, and only the last character can be *$* or *(*.

- No duplicates of the same type are allowed (you can have *FOO* and
  *FOO$*, but not two numerics called *FOO*).

When you're finished, re-run **unprotbas**, this time with the **-r**
option. If all is well, the unprotected program will use your variable
names, rather than generating new ones. If you broke the rules, you
should get an informative error message explaining what and where the
problem is.

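
The naming scheme and the editing rules described in this section can
be sketched in Python. This is an illustration only, not how
**unprotbas** is implemented; the function names ``generated_name``
and ``check_names`` are invented for the example, and the error
messages are not the ones **unprotbas** prints. The checker covers
only the character and duplicate rules listed above. It does not
compare against the original file, so it cannot tell whether lines
were added or deleted, or whether a *$* or *(* was dropped.

```python
import re
import string

# Letter first, then letters or digits, then an optional "$" or "(".
NAME_RE = re.compile(r"[A-Z][A-Z0-9]*[$(]?")


def generated_name(index: int, suffix: str = "") -> str:
    """Name for the index-th (0-based) variable of one type.

    suffix is "" for numerics, "$" for strings, "(" for arrays.
    """
    if not 0 <= index < 26 + 26 * 9:
        raise ValueError("index outside the A..Z, A1..Z9 scheme")
    letters = string.ascii_uppercase
    if index < 26:
        return letters[index] + suffix
    letter, digit = divmod(index - 26, 9)
    return f"{letters[letter]}{digit + 1}{suffix}"


def check_names(lines):
    """Uppercase and validate names, one per input line.

    Raises ValueError naming the offending line number.
    """
    seen = set()
    names = []
    for lineno, raw in enumerate(lines, start=1):
        name = raw.strip().upper()
        if not NAME_RE.fullmatch(name):
            raise ValueError(f"line {lineno}: invalid name {name!r}")
        if name in seen:
            # FOO and FOO$ differ by suffix, so only same-type
            # duplicates collide here.
            raise ValueError(f"line {lineno}: duplicate name {name!r}")
        seen.add(name)
        names.append(name)
    return names


print(generated_name(26))        # A1
print(generated_name(35, "$"))   # B1$
print(check_names(["score", "score$", "board("]))
```
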

This process can also be used for regular unprotected programs. Use
**-n -w** the first time, to save the existing variable names to
**varnames.txt** rather than generating new ones.

NOTES
=====

Atari BASIC has a limit of 128 variables in a program. It's actually
possible for the variable name table to contain up to 256 variables,
though the 129th and further ones won't be usable in the program. The
variable value table can hold more than 256 values, though the
variable numbers wrap around once they pass 255. The attempt to add
variables past the 128th causes BASIC to respond with *ERROR- 4*, but
the variable does get added to the tables.

**unprotbas** will preserve these extra (useless) entries in the
tables, though it will complain "Warning: variable #XXX value is
corrupt" for value table entries 256 and up. This is a pathological
case, and shouldn't happen with programs that aren't deliberately
written to test this behaviour.

.. include:: manftr.rst