Atari Microsoft BASIC Notes
---------------------------

AMSB is actually a pretty cool BASIC for the Atari 8-bit. I never got
the chance to use it 'back in the day' because it was expensive,
required a floppy drive and at least 32K of RAM (my poor 400 had a
tape drive for the first few years), and then later on, there was
Turbo BASIC XL, which was just as cool as AMSB, but freeware.

This file is a collection of notes I made to myself while developing
listamsb. The information here might be useful (e.g. if you're trying
to repair a damaged AMSB file) and hopefully is interesting.

Enjoy!

-- B. Watson

Tokenized file format
---------------------

File begins with a 3-byte header:

offset | purpose
-------+-------------------------------------------------------
   0   | 0 for a normal program, 1 for LOCKed (encrypted).
   1   | LSB, program length, not counting the 3-byte header...
   2   | MSB,    "       "

The program length should always be the actual file size minus 3. If
it's not, the file has either been truncated or had junk added to
the end.

In a LOCKed program, the program length bytes are not encrypted.

After the header come the lines of code (encrypted, for LOCKed
programs). Each line has a 4-byte header:

offset | purpose
-------+-------------------------------------------------------
   0   | LSB, address of the last byte of this line...
   1   | MSB, address ...which is ignored on LOAD!
   2   | LSB of line number
   3   | MSB "   "    "

The rest of the line is the tokens, terminated by a $00 byte.

The next 2 bytes after the $00 are the last-byte offset of the next
line. The last "line" of the program has a $0000 offset, which
indicates the end of the program. Since the actual last line ends
with a $00, that means there will be three $00 bytes in a row as the
last 3 bytes of the file. And that's the *only* place 3 $00's in a
row will occur.

Tokenization is "lightweight": there are no tokenized numerics,
they're just stored as ASCII characters, as typed.
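
As a rough illustration of the layout described above, here is a
minimal Python sketch (not listamsb's actual code) that checks the
3-byte header and walks the lines of an unLOCKed file. The name
parse_amsb and the shape of its return value are made up for this
example, and the token bytes are returned raw, not detokenized.

```python
import struct

def parse_amsb(data: bytes):
    """Return (locked, length_ok, lines) for a SAVEd AMSB file image.

    lines is a list of (line_number, raw_token_bytes) and is only
    filled in for unLOCKed files.
    """
    if len(data) < 3:
        raise ValueError("too short to hold the 3-byte header")
    if data[0] not in (0, 1):
        raise ValueError("first byte is not 0 or 1; probably a LISTed file")
    locked = data[0] == 1
    # Bytes 1-2: little-endian program length, excluding the header itself.
    (proglen,) = struct.unpack_from("<H", data, 1)
    length_ok = proglen == len(data) - 3

    lines = []
    pos = 3
    while not locked:
        if pos + 2 > len(data):
            raise ValueError("ran off the end before the $0000 end marker")
        # End-of-line address: only checked against $0000 (end of program).
        (end_addr,) = struct.unpack_from("<H", data, pos)
        if end_addr == 0:
            break
        if pos + 4 > len(data):
            raise ValueError("truncated line header")
        (lineno,) = struct.unpack_from("<H", data, pos + 2)
        # Tokens run up to (not including) the $00 terminator.
        nul = data.find(b"\x00", pos + 4)
        if nul < 0:
            raise ValueError("line %d has no $00 terminator" % lineno)
        lines.append((lineno, data[pos + 4:nul]))
        pos = nul + 1
    return locked, length_ok, lines
```

A length_ok of False means the file was truncated or has junk
appended, as described above. The end-of-line address is only tested
for $0000 here, since its actual value depends on MEMLO at SAVE time.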
There's no "string constant follows" token like there is in Atari
BASIC (well, there is, it's just a double-quote, $22. There's no
length byte). Variable names are not tokenized, either, they're just
stored as-is (name in ASCII, including trailing $ for strings, etc).
Numeric constants are just stored as ASCII digits, just as you typed
them.

In fact the only things that are tokenized are BASIC keywords:
commands and functions... NOT including user functions defined with
DEF (those are stored as just the ASCII function name, like
variables).

There are 2 sets of tokens. One set is single-byte, $80 and up. These
are commands. The other set is functions, which are 2 bytes: $FF
followed by the token number. See amsbtok.h in the source for the
actual tokens.

AMSB saves the end-of-line pointers, but it totally ignores them on
LOAD. The SAVEd file format does *not* have a load address (as
e.g. Commodore BASIC does), so there's no way to know the address of
the start of the program (other than counting backwards from the next
line, since its address is known). It's not just a constant either:
it depends on what MEMLO was set to when the program was saved (which
varies depending on what version of AMSB you have, what DOS you boot,
whether or not you have the R: device driver loaded, etc etc).

LOADing Untokenized Files
-------------------------

If the first byte of the file is anything other than $00 or $01,
AMSB's LOAD command reads it in as a text file (LISTed rather than
SAVEd).

When LOAD is reading a text file, if the last byte of the file isn't
an ATASCII EOL ($9b), you'll get #136 ERROR. The program doesn't get
deleted, but the last line of the file didn't get loaded. This could
happen if a LISTed file somehow got truncated.

While on the subject... the manual doesn't mention it, but if you LOAD
a text file without line numbers, the code gets executed in direct
mode during the load (like Atari BASIC's ENTER command does).
This means you could write scripts (batch files) for AMSB.

Program Length Header Mismatch
------------------------------

When AMSB's LOAD command executes, it reads the 3-byte header, then
reads as many bytes as the header's program length says.

If the header length is longer than the rest of the file, you get a
#136 ERROR (aka Atari's EOF), and the partially loaded program is
erased (basically it does a NEW).

If the length is shorter than the program, it'll stop loading no
matter how much more data is in the file. This means it can stop in
the middle of a line. It also means, if there was already a program
in memory that was longer than the program length, you get a "hybrid"
mix of the new program followed by the remainder of the old one. This
is because the three $00 bytes at the end of the program weren't read
in.

If the program length is correct for the actual program (so the three
$00 bytes get read), but there's extra data appended to the file,
AMSB will never read the extra data at all.

String Limitations
------------------

String literals in AMSB cannot contain the | or ATASCII heart
characters.

AMSB uses | as a terminator for quoted strings, e.g. "STRING" will be
tokenized as:

    "STRING|

If you try to use a | in a quoted string, it gets turned into a
double quote:

    "FOO|BAR"

comes out as

    "FOO"BAR

which is a syntax error! String variables can store | but only with
e.g. CHR$(124) or reading from a file: it's string *literals* that
don't allow it.

The reason | is used for a terminating quote is to allow doubling up
the quotes to embed them in a string:

    A$ = "HAS ""QUOTES"""
    PRINT A$

will print:

    HAS "QUOTES"

At first I thought "no pipe characters in strings, WTF man?" but it's
probably no worse than Atari BASIC's "no quotes in string constants"
rule. It *would* be nice if the AMSB manual actually documented the
fact that | can't occur in a string constant. Not documenting it
makes it a bug...
and they have unused tokens in the $Fx range, I don't see why they
had to use a printing character for this.

You also can't put a heart (ATASCII character 0) in a string literal.
It will be treated as the end of the line, as though you pressed
Enter (and anything else on the line is ignored). This isn't
documented in the manual, either. Like the | character, you can use
CHR$(0) to store a heart in a string and it will work correctly.

Differences Between Versions
----------------------------

The language is the same in AMSB versions 1 and 2. Tokenized files
made by one version will LOAD and RUN in the other version.

Version 1, the disk version, always has the full set of commands
available.

Version 2, the cart, only has the full set if the extension disk is
booted. The missing ones still get tokenized, but you get SN ERROR at
runtime if you try to execute them. This doesn't affect the
detokenizer at all. The missing commands:

    AUTO
    DEF (the string version; numeric is still present)
    NOTE
    RENUM
    TRON
    TROFF
    DEL
    USING
    STRING$ (function, not a command)

RENUM only works in direct mode, not in a program. Executing it gives
a FUNCTION CALL ERROR.

AUTO is (oddly) allowed in a program. Executing it exits the program
and puts you back in the editor, in auto-numbering mode.

It would seem weird to have POINT available but not NOTE... except
that AMSB doesn't even *have* POINT. Instead, the disk addresses
returned by NOTE are used with AT() in a PRINT statement. Not sure if
AT() works without the extensions loaded, but it won't be useful
anyway without NOTE.

One other difference between versions 1 and 2: version 2 will LOAD
and RUN the file D:AUTORUN.AMB at startup, if it exists.

Colon Weirdness
---------------

AMSB allows comments to be started with the ! and ' characters (as
well as the traditional REM). For the ! and ' variety, if they come
at the end of a line after some code, you don't have to put a colon.
Example:

    10 GRAPHICS 2+16 ! NO TEXT WINDOW

However...
in the tokenized format, there *is* a tokenized colon just before the
tokenized ! or ' character. LIST doesn't display it.

If you did put a colon:

    10 CLOSE #1:! WE'RE DONE WITH THE FILE

...then there will be *two* colons in the tokenized file, and only
one will be LISTed.

The ELSE keyword works the same way. In this line:

    10 IF A THEN PRINT ELSE STOP

...there is actually a : character just before the token for ELSE.

Even weirder: you can put as many colons in a row as you like, and
AMSB will treat them like a single colon. This line of code is valid
and runs correctly:

    10 PRINT "FOO"::::::PRINT "BAR"

These colons are displayed normally in LIST output.

Memory Usage
------------

On a 48K/64K Atari, FRE(0) for AMSB 1 with DOS booted (since you
can't use it without) but no device drivers is 21020. MEMLO is
awfully high ($6a00).

For AMSB 2 with DOS booted, but without the extensions loaded, FRE(0)
is 24352. With extensions it's 20642 (even though the banner says
20644 BYTES FREE).

AMSB 2 without DOS gives you 29980, but how are you gonna load or
save programs without DOS? Nobody wants to use cassette, especially
not people who could afford to buy the AMSB II cartridge.

LOCKed Programs
---------------

If you save a program with SAVE "filename" LOCK, it gets saved in an
"encrypted" form. Loading a LOCKed program disables LISTing or
editing it (you get LK ERROR if you try).

The "encryption" is no better than ROT13. To encrypt, subtract each
byte from 0x54 (in an 8-bit register, using two's complement). To
decrypt, do the same.

You can tell a LOCKed program because its first byte will be 1
instead of 0. The next 2 bytes (the program length) are unencrypted.
The rest of the file is encrypted with the lame scheme described
above.

When AMSB has a LOCKed program loaded into memory, it's *not* stored
encrypted in RAM. It would be perfectly possible to write BASIC code
using direct mode to write the tokenized program out to disk.
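
For a LOCKed file sitting on disk (as opposed to a program in RAM),
the transform described above is small enough to sketch in a few
lines of Python. This is only an illustration; the function names
amsb_lock_transform and unlock_file are hypothetical, not part of
AMSB or listamsb.

```python
def amsb_lock_transform(body: bytes) -> bytes:
    """Subtract each byte from 0x54, wrapping to 8 bits.

    The same call both encrypts and decrypts, because
    0x54 - (0x54 - b) == b (mod 256).
    """
    return bytes((0x54 - b) & 0xFF for b in body)

def unlock_file(data: bytes) -> bytes:
    """Return an unLOCKed copy of a SAVEd file image.

    Files whose first byte isn't 1 are returned unchanged.
    """
    if len(data) < 3 or data[0] != 1:
        return data
    # Byte 0 becomes 0 (normal program); the 2 length bytes were
    # never encrypted, so they're copied as-is.
    return b"\x00" + data[1:3] + amsb_lock_transform(data[3:])
```

Since the transform is its own inverse, applying amsb_lock_transform
twice gives back the original bytes.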
The program starts at MEMLO and extends up to the first occurrence of
three $00 bytes. The hardest part of this would be generating the
header using only direct-mode BASIC statements (but it could be
done).

However... there's no need to do that. AMSB has a flag that tells it
whether or not the currently-loaded program is LOCKed. You can just
clear the flag:

    POKE 168,0

Now AMSB won't consider the program LOCKed, and you can SAVE a
regular copy of it (and LIST, edit, etc).

Line Length Limit
-----------------

In the editor, after a POKE 82,0 (to set the left margin to 0), you
can enter 120 characters (3 screen lines) on a logical line. If you
enter a program line that way *without* a space after the line
number, then LIST it, it will be 121 characters long, because AMSB
will display a space after the line number.

If you use a text editor (or write a program) to create an
untokenized BASIC program, you can have a line of code that's 125
characters long. AMSB will accept it just fine, with LOAD. If a line
is 126 characters or longer, AMSB will silently ignore that line when
LOADing.

If you create a 125-character line (with a text editor) consisting
only of a comment that begins with ! or ', without a space after the
line number, LOAD it, then SAVE it, that line will be 129 bytes long
in tokenized form. AMSB will LOAD it with no problems.

If you hex-edit a SAVEd file to create a longer line, AMSB will
accept that, too... up to 255 bytes. At 256 bytes, AMSB will lock up
after LOAD.
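
If you generate LISTed files with a text editor or a script, a quick
check for the 125-character LOAD limit might look like the Python
sketch below. It assumes the limit counts every byte of the line
(line number included) but not the ATASCII EOL itself; that's my
reading of the numbers above, not something verified byte-for-byte.
The names find_overlong_lines and missing_final_eol are made up for
this example.

```python
ATASCII_EOL = b"\x9b"
MAX_LOAD_LINE = 125  # longest text line LOAD accepts, per the notes above

def find_overlong_lines(data: bytes, limit: int = MAX_LOAD_LINE):
    """Return (line_index, length) for each line LOAD would skip.

    line_index counts lines in the file starting at 1; it is not
    the BASIC line number.
    """
    lines = data.split(ATASCII_EOL)
    # A file that ends with EOL leaves one empty piece after the split.
    if lines and lines[-1] == b"":
        lines.pop()
    return [(i, len(line))
            for i, line in enumerate(lines, start=1)
            if len(line) > limit]

def missing_final_eol(data: bytes) -> bool:
    """True if a non-empty file doesn't end in ATASCII EOL ($9b)."""
    return len(data) > 0 and not data.endswith(ATASCII_EOL)
```

The second function covers the truncated-file case from "LOADing
Untokenized Files": without a final $9b, LOAD gives #136 ERROR and
drops the last line.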