From 028b2a57446789fe7bafb8ab678c1f0f43b34645 Mon Sep 17 00:00:00 2001
From: "B. Watson"
Date: Fri, 7 Jun 2024 04:47:04 -0400
Subject: dumpbas: added.

---
 dumpbas.rst | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)
 create mode 100644 dumpbas.rst

(limited to 'dumpbas.rst')

diff --git a/dumpbas.rst b/dumpbas.rst
new file mode 100644
index 0000000..8b116c3
--- /dev/null
+++ b/dumpbas.rst
@@ -0,0 +1,163 @@
+=======
+dumpbas
+=======
+
+-------------------------------------------------------
+Formatted hexdump for tokenized Atari 8-bit BASIC files
+-------------------------------------------------------
+
+.. include:: manhdr.rst
+
+SYNOPSIS
+========
+dumpbas [**-v**] [**-l** *lineno*] [**-s** *start-lineno*] [**-e** *end-lineno*] *input-file*
+
+DESCRIPTION
+===========
+**dumpbas** reads a tokenized Atari 8-bit BASIC program and prints a
+formatted hexdump on standard output. The formatting groups the hex bytes
+by line and statement, and includes special characters to mark different
+types of token (see **FORMATTING**, below).
+
+**dumpbas** does not detokenize BASIC programs or dump information
+about variable names/values. Use **chkbas**\(1) for that. This tool is
+intended to help the user learn about the tokenized BASIC format, or
+as an aid for developing/debugging other tools that process tokenized
+files. It's an alternative to looking at raw hex dumps.
+
+It's assumed the user has at least some knowledge of BASIC's tokenized
+SAVE format. The **Atari BASIC Sourcebook** is a good starting point
+for learning the tokenized format.
+
+OPTIONS
+=======
+
+General Options
+---------------
+**--help**
+  Print usage message and exit.
+
+**--version**
+  Print version number and exit.
+
+**-v**
+  Verbose operation. When displaying a number in verbose mode, it will
+  be prefixed with *$* if it's in hex, or no prefix for decimal.
+
+Dump Options
+------------
+**-s** *start-lineno*
+  Don't dump lines before **start-lineno**. Default: *0*.
+
+**-e** *end-lineno*
+  Don't dump lines after **end-lineno**. Default: *32768*.
+
+**-l** *lineno*
+  Only dump one line. This is exactly equivalent to "**-s** *lineno* **-e** *lineno*".
+
+FORMATTING
+==========
+Every byte in the file is displayed in hex. However, they are grouped by line
+and statement, and certain tokens get marker characters to help keep track
+of what they're for. Strings are displayed in both hex and ASCII. Floating
+point constants are displayed as 6 hex bytes with square brackets around them.
+
+If **dumpbas** is run on the following program::
+
+  10 ? "HOW MANY TIMES";:INPUT N
+  20 FOR I=1 TO N
+  30 ? "HELLO ";:? I;"/";N:NEXT I
+  40 REM WAIT FOR KEY
+  50 POKE 764,255
+  60 ? "PRESS ANY KEY"
+  70 IF PEEK(764)=255 THEN 70
+  80 POKE 764,255:GOTO 10
+
+**Note:** The "PRESS ANY KEY" was entered in inverse video.
+
+...it produces the following output::
+
+  10@0021 (0a 00): ^1b
+  >17 !28 $0f =0e "H/48 O/4f W/57 /20 M/4d A/41 N/4e Y/59 /20 T/54 I/49 M/4d E/45 S/53" 15 14:
+  >1b !02 80 16
+  20@003c (14 00): ^11
+  >11 !08 81 2d #0e [40 01 00 00 00 00] 19 80 16
+  30@004d (1e 00): ^1d
+  >0f !28 $0f =06 "H/48 E/45 L/4c L/4c O/4f /20" 15 14:
+  >19 !28 81 15 $0f =01 "//2f" 15 80 14:
+  >1d !09 81 16
+  40@006a (28 00): ^12
+  >12 !00 57 41 49 54 20 46 4f 52 20 4b 45 59 9b
+  50@007c (32 00): ^15
+  >15 !1f #0e [41 07 64 00 00 00] 12 #0e [41 02 55 00 00 00] 16
+  60@0091 (3c 00): ^15
+  >15 !28 $0f =0d "|P/d0 |R/d2 |E/c5 |S/d3 |S/d3 | /a0 |A/c1 |N/ce |Y/d9 | /a0 |K/cb |E/c5 |Y/d9" 16
+  70@00a6 (46 00): ^20
+  >20 !07 46 3a #0e [41 07 64 00 00 00] 2c 22 #0e [41 02 55 00 00 00] 1b #0e [40 70 00 00 00 00] 16
+  80@00c6 (50 00): ^1f
+  >15 !1f #0e [41 07 64 00 00 00] 12 #0e [41 02 55 00 00 00] 14:
+  >1f !0a #0e [40 10 00 00 00 00] 16
+  32768@00e5 (00 80): ^0f
+  >0f !19 $0f =07 "H/48 :/3a B/42 ./2e B/42 A/41 S/53" 16
+
+Line header
+-----------
+Each line begins with the line number (decimal) and offset from
+the start of the file (hex), followed by the 2 hex bytes for the line
+number in parentheses, followed by the line length (hex, preceded by
+^). From the example::
+
+  10@0021 (0a 00): ^1b
+
+The line number is *10*, the file offset is *0021*. The bytes *0a 00* are 10 again, in
+hex, LSB first. The *^1b* is the line length.
+
+Statements
+----------
+Each statement within the line is displayed separately. Line 10's first statement::
+
+  >17 !28 $0f =0e "H/48 O/4f W/57 /20 M/4d A/41 N/4e Y/59 /20 T/54 I/49 M/4d E/45 S/53" 15 14:
+
+This looks cryptic, but it includes a lot of information.
+
+- *>* is the marker for the statement offset (*17*).
+
+- *!* marks a command token (unmarked tokens are operator
+  tokens). *28* is the token for **?** (short form of PRINT, which has a
+  separate token).
+
+- *$* marks the string-constant token (*0f*).
+
+- *=* marks the string length byte (*0e*).
+
+- The string itself is printed inside double quotes, with each character in
+  both ASCII and hex (e.g. *H/48*).
+
+- The *15* is unmarked. It's the semicolon after the string.
+
+- There's a *:* at the end of the line (after the *14*, which is the end-of-statement
+  token).
+
+Line 10's second statement::
+
+  >1b !02 80 16
+
+The *80* is a token for a variable (variable tokens always have bit 7 set, so they're
+always >= 80 hex). The *16* is the end-of-line token.
+
+Line 20's first statement has an example of a floating point constant::
+
+  #0e [40 01 00 00 00 00]
+
+- *#* marks the token for a FP constant.
+
+- The actual 6-byte constant is surrounded with *[* and *]*.
+
+- The statement's last token is *16*, which is BASIC's end-of-line token.
+
+EXIT STATUS
+===========
+
+0 for success, 1 for failure.
+
+.. include:: manftr.rst
-- 
cgit v1.2.3