diff --git a/dumpbas.rst b/dumpbas.rst
new file mode 100644
index 0000000..8b116c3
--- /dev/null
+++ b/dumpbas.rst
@@ -0,0 +1,163 @@
+=======
+dumpbas
+=======
+
+-------------------------------------------------------
+Formatted hexdump for tokenized Atari 8-bit BASIC files
+-------------------------------------------------------
+
+.. include:: manhdr.rst
+
+SYNOPSIS
+========
+dumpbas [**-v**] [**-l** *lineno*] [**-s** *start-lineno*] [**-e** *end-lineno*] *input-file*
+
+DESCRIPTION
+===========
+**dumpbas** reads a tokenized Atari 8-bit BASIC program and prints a
+formatted hexdump on standard output. The formatting groups the hex bytes
+by line and statement, and includes special characters to mark different
+types of token (see **FORMATTING**, below).
+
+**dumpbas** does not detokenize BASIC programs or dump information
+about variable names/values. Use **chkbas**\(1) for that. This tool is
+intended to help the user learn about the tokenized BASIC format, or
+as an aid for developing/debugging other tools that process tokenized
+files. It's an alternative to looking at raw hex dumps.
+
+It's assumed the user has at least some knowledge of BASIC's tokenized
+SAVE format. The **Atari BASIC Sourcebook** is a good starting point
+for learning the tokenized format.
+
+OPTIONS
+=======
+
+General Options
+---------------
+**--help**
+ Print usage message and exit.
+
+**--version**
+ Print version number and exit.
+
+**-v**
+ Verbose operation. Numbers displayed in verbose mode are prefixed with
+ *$* if they're in hex; decimal numbers have no prefix.
+
+Dump Options
+------------
+**-s** *start-lineno*
+ Don't dump lines before **start-lineno**. Default: *0*.
+
+**-e** *end-lineno*
+ Don't dump lines after **end-lineno**. Default: *32768*.
+
+**-l** *lineno*
+ Only dump one line. This is exactly equivalent to "**-s** *lineno* **-e** *lineno*".
+
+FORMATTING
+==========
+Every byte in the file is displayed in hex. The bytes are grouped by line
+and statement, and certain tokens get marker characters to show what
+they're for. Strings are displayed in both hex and ASCII. Floating
+point constants are displayed as 6 hex bytes with square brackets around them.
+
+If **dumpbas** is run on the following program::
+
+ 10 ? "HOW MANY TIMES";:INPUT N
+ 20 FOR I=1 TO N
+ 30 ? "HELLO ";:? I;"/";N:NEXT I
+ 40 REM WAIT FOR KEY
+ 50 POKE 764,255
+ 60 ? "PRESS ANY KEY"
+ 70 IF PEEK(764)=255 THEN 70
+ 80 POKE 764,255:GOTO 10
+
+**Note:** The "PRESS ANY KEY" was entered in inverse video.
+
+...it produces the following output::
+
+ 10@0021 (0a 00): ^1b
+ >17 !28 $0f =0e "H/48 O/4f W/57 /20 M/4d A/41 N/4e Y/59 /20 T/54 I/49 M/4d E/45 S/53" 15 14:
+ >1b !02 80 16
+ 20@003c (14 00): ^11
+ >11 !08 81 2d #0e [40 01 00 00 00 00] 19 80 16
+ 30@004d (1e 00): ^1d
+ >0f !28 $0f =06 "H/48 E/45 L/4c L/4c O/4f /20" 15 14:
+ >19 !28 81 15 $0f =01 "//2f" 15 80 14:
+ >1d !09 81 16
+ 40@006a (28 00): ^12
+ >12 !00 57 41 49 54 20 46 4f 52 20 4b 45 59 9b
+ 50@007c (32 00): ^15
+ >15 !1f #0e [41 07 64 00 00 00] 12 #0e [41 02 55 00 00 00] 16
+ 60@0091 (3c 00): ^15
+ >15 !28 $0f =0d "|P/d0 |R/d2 |E/c5 |S/d3 |S/d3 | /a0 |A/c1 |N/ce |Y/d9 | /a0 |K/cb |E/c5 |Y/d9" 16
+ 70@00a6 (46 00): ^20
+ >20 !07 46 3a #0e [41 07 64 00 00 00] 2c 22 #0e [41 02 55 00 00 00] 1b #0e [40 70 00 00 00 00] 16
+ 80@00c6 (50 00): ^1f
+ >15 !1f #0e [41 07 64 00 00 00] 12 #0e [41 02 55 00 00 00] 14:
+ >1f !0a #0e [40 10 00 00 00 00] 16
+ 32768@00e5 (00 80): ^0f
+ >0f !19 $0f =07 "H/48 :/3a B/42 ./2e B/42 A/41 S/53" 16
+
+Line header
+-----------
+Each line begins with the line number (decimal) and its offset from
+the start of the file (hex), followed by the 2 hex bytes for the line
+number in parentheses, followed by the line length (hex, preceded by
+^). From the example::
+
+ 10@0021 (0a 00): ^1b
+
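+The line number is *10* and the file offset is *0021*. The *0a 00* is the
+line number again, as the 2 bytes stored in the file, LSB first. The *^1b*
+is the line length in bytes; adding it to the offset (0021 + 1b) gives
+*003c*, the offset of line 20 in the example.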
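+
+As a rough illustration (this is not **dumpbas**'s own code), the header
+fields could be decoded and printed like this in Python. The function names
+are made up for this sketch, and *data* is assumed to hold the whole file:
+
```python
def parse_line_header(data: bytes, offset: int) -> tuple[int, int]:
    """Return (line_number, line_length) for the line starting at offset."""
    if offset + 3 > len(data):
        raise ValueError("truncated line header")
    # The line number is stored as 2 bytes, least significant byte first.
    lineno = data[offset] | (data[offset + 1] << 8)
    # The third byte is the line length.
    length = data[offset + 2]
    return lineno, length


def format_line_header(data: bytes, offset: int) -> str:
    """Format a header the way the example output shows it."""
    lineno, length = parse_line_header(data, offset)
    lsb, msb = data[offset], data[offset + 1]
    return f"{lineno}@{offset:04x} ({lsb:02x} {msb:02x}): ^{length:02x}"
```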
+
+Statements
+----------
+Each statement within the line is displayed separately. Line 10's first statement::
+
+ >17 !28 $0f =0e "H/48 O/4f W/57 /20 M/4d A/41 N/4e Y/59 /20 T/54 I/49 M/4d E/45 S/53" 15 14:
+
+This looks cryptic, but it includes a lot of information.
+
+- *>* is the marker for the statement offset (*17*).
+
+- *!* marks a command token (unmarked tokens are operator
+ tokens). *28* is the token for **?** (short form of PRINT, which has a
+ separate token).
+
+- *$* marks the string-constant token (*0f*).
+
+- *=* marks the string length byte (*0e*).
+
+- The string itself is printed inside double quotes, with each character in
+ both ASCII and hex (e.g. *H/48*).
+
+- The *15* is unmarked. It's the semicolon after the string.
+
+- The *14* is the end-of-statement token (BASIC's *:* statement separator),
+ so it's printed with a trailing *:*.
+
+Line 10's second statement::
+
+ >1b !02 80 16
+
+The *80* is a token for a variable (variable tokens always have bit 7 set, so they're
+always >= 80 hex). The *16* is the end-of-line token.
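+
+The statement offsets can be used to split a line into statements. The
+sketch below is an illustration only, not **dumpbas**'s implementation. It
+assumes, as the arithmetic in the example suggests, that each statement
+offset counts from the start of the line and points at the next statement
+(or at the end of the line). The name *split_statements* is hypothetical:
+
```python
def split_statements(line: bytes):
    """Yield the raw bytes of each statement in one tokenized line.

    `line` starts at the 2-byte line number; byte 2 is the line length.
    """
    length = line[2]
    if length > len(line):
        raise ValueError("line is shorter than its length byte claims")
    pos = 3  # the first statement follows the 3-byte line header
    while pos < length:
        end = line[pos]  # assumed: offset of the next statement
        if end <= pos or end > length:
            raise ValueError(f"bad statement offset ${end:02x} at ${pos:02x}")
        yield line[pos:end]
        pos = end
```
+
+Fed the bytes of line 10 from the example, this yields two statements; the
+first begins with *17 28* and the second is *1b 02 80 16*.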
+
+Line 20's first statement has an example of a floating point constant::
+
+ #0e [40 01 00 00 00 00]
+
+- *#* marks the token for a FP constant.
+
+- The actual 6-byte constant is surrounded with *[* and *]*.
+
+- In the full statement (see the example output), the last token is *16*,
+ BASIC's end-of-line token.
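+
+**dumpbas** shows the 6 bytes as-is. To read them as numbers, something like
+the following Python sketch could be used. It is not part of **dumpbas**, and
+it assumes the commonly documented Atari floating point layout: bit 7 of the
+first byte is the sign, the low 7 bits are an excess-64 exponent in powers
+of 100, and the other 5 bytes are packed BCD digits. The name *decode_fp* is
+made up for this example:
+
```python
from decimal import Decimal


def decode_fp(raw: bytes) -> Decimal:
    """Decode a 6-byte constant, assuming the usual Atari BCD layout."""
    if len(raw) != 6:
        raise ValueError("a floating point constant is exactly 6 bytes")
    negative = bool(raw[0] & 0x80)   # assumed: bit 7 is the sign
    exponent = (raw[0] & 0x7F) - 64  # assumed: excess-64 power of 100
    mantissa = 0
    for byte in raw[1:]:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"not a BCD byte: ${byte:02x}")
        mantissa = mantissa * 100 + high * 10 + low
    # The mantissa holds 5 base-100 digits with the point after the first.
    value = Decimal(mantissa) * Decimal(100) ** (exponent - 4)
    return -value if negative else value
```
+
+Under those assumptions, *[40 01 00 00 00 00]* decodes to 1 (the FOR loop's
+start value) and *[41 07 64 00 00 00]* decodes to 764 (the POKE address).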
+
+EXIT STATUS
+===========
+
+0 for success, 1 for failure.
+
+.. include:: manftr.rst