da65 Users Guide

Ullrich von Bassewitz,
Greg King

2014-11-23

  ----------------------------------------------------------------------------

da65 is a 6502/65C02 disassembler that is able to read user-supplied information
about its input data, for better results. The output is ready for feeding into
ca65, the macro assembler supplied with the cc65 C compiler.

  ----------------------------------------------------------------------------

1. Overview

2. Usage

  * 2.1 Command line option overview
  * 2.2 Command line options in detail

3. Detailed workings

  * 3.1 Supported CPUs
  * 3.2 Attribute map
  * 3.3 Labels
  * 3.4 Info File

4. Info File Format

  * 4.1 Comments
  * 4.2 Specifying global options
  * 4.3 Specifying Ranges
  * 4.4 Specifying Labels
  * 4.5 Specifying Segments
  * 4.6 Specifying Assembler Includes
  * 4.7 An Info File Example

5. Copyright

  ----------------------------------------------------------------------------

1. Overview

da65 is a disassembler for 6502/65C02 code. It is supplied as a utility with the
cc65 C compiler and generates output that is suitable for the ca65 macro
assembler.

Besides generating output for ca65, one of the design goals was that the user is
able to feed additional information about the code into the disassembler, for
improved results. This information may include the location and size of tables,
and their format.

One nice advantage of this concept is that disassembly of copyrighted binaries
may be handled without problems: one can distribute just the info file for
disassembling the binary, so that everyone with a legal copy of the binary can
generate a nicely formatted disassembly with readable labels and other
information.

2. Usage

2.1 Command line option overview

The disassembler accepts the following options:

 ---------------------------------------------------------------------------
 Usage: da65 [options] [inputfile]
 Short options:
   -g                    Add debug info to object file
   -h                    Help (this text)
   -i name               Specify an info file
   -o name               Name the output file
   -v                    Increase verbosity
   -F                    Add formfeeds to the output
   -S addr               Set the start/load address
   -V                    Print the disassembler version

 Long options:
   --argument-column n   Specify argument start column
   --comment-column n    Specify comment start column
   --comments n          Set the comment level for the output
   --cpu type            Set cpu type
   --debug-info          Add debug info to object file
   --formfeeds           Add formfeeds to the output
   --help                Help (this text)
   --hexoffs             Use hexadecimal label offsets
   --info name           Specify an info file
   --label-break n       Add newline if label exceeds length n
   --mnemonic-column n   Specify mnemonic start column
   --pagelength n        Set the page length for the listing
   --start-addr addr     Set the start/load address
   --text-column n       Specify text start column
   --verbose             Increase verbosity
   --version             Print the disassembler version
 ---------------------------------------------------------------------------

2.2 Command line options in detail

Here is a description of all the command line options:

--argument-column n

        Specifies the column where the argument for a mnemonic or pseudo
        instruction starts.

--comment-column n

        Specifies the column where the comment for an instruction starts.

--comments n

        Set the comment level for the output. Valid arguments are 0..4. Greater
        values will increase the level of additional information written to the
        output file in the form of comments.

--cpu type

        Set the CPU type. The option takes a parameter, which may be one of
           * 6502
           * 6502x
           * 65sc02
           * 65c02
           * huc6280

        6502x is for the NMOS 6502 with unofficial opcodes. huc6280 is the CPU
        of the PC Engine. Support for the 65816 is currently not available.

-F, --formfeeds

        Add formfeeds to the generated output. This feature is useful together
        with the --pagelength option. If --formfeeds is given, a formfeed is
        added to the output after each page.

-g, --debug-info

        This option adds the .DEBUGINFO command to the output file, so the
        assembler will generate debug information when re-assembling the
        generated output.

-h, --help

        Print the short option summary shown above.

--hexoffs

        Output label offsets in hexadecimal instead of decimal notation.

-i name, --info name

        Specify an info file. The info file contains global options that may
        override or replace command line options, plus information about the
        code that has to be disassembled. See the separate section Info File
        Format.

-o name

        Specify a name for an output file. The default is to use stdout, so
        without this switch or the corresponding global option OUTPUTNAME, the
        output will go to the terminal.

--label-break n

        Adds a newline if the length of a label exceeds the given length. Note:
        If the label would run into the code in the mid column, a linefeed is
        always inserted regardless of this setting.

        This option overrides the global option LABELBREAK.

--mnemonic-column n

        Specifies the column where a mnemonic or pseudo instruction is output.

--pagelength n

        Sets the length of a listing page in lines. After this number of lines,
        a new page header is generated. If the --formfeeds option is also
        given, a formfeed is inserted before generating the page header.

        A value of zero for the page length will disable paging of the output.

-S addr, --start-addr addr

        Specify the start/load address of the binary code that is going to be
        disassembled. The given address is interpreted as an octal value if
        preceded by a '0' digit, as a hexadecimal value if preceded by '0x',
        '0X', or '$', and as a decimal value in all other cases. If no start
        address is specified, $10000 minus the size of the input file is used.

--text-column n

        Specifies the column where additional text is output. This additional
        text consists of the bytes encoded in this line in text representation.

-v, --verbose

        Increase the disassembler verbosity. Usually only needed for debugging
        purposes. You may use this option more than one time for even more
        verbose output.

-V, --version

        Print the version number of the disassembler. If you send any
        suggestions or bugfixes, please include the version number.
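
The address rules given for -S/--start-addr above can be sketched in a few
lines of Python. This is only an illustration of the rules as stated in this
guide, not the code da65 uses; the function names parse_addr and
default_start_addr are made up for the example.

```python
def parse_addr(text: str) -> int:
    """Interpret an address string following the rules described for -S."""
    if text.startswith("$"):
        return int(text[1:], 16)          # '$' prefix: hexadecimal
    if text[:2] in ("0x", "0X"):
        return int(text[2:], 16)          # '0x' / '0X' prefix: hexadecimal
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)               # leading '0' digit: octal
    return int(text, 10)                  # everything else: decimal


def default_start_addr(input_size: int) -> int:
    """Default start address: $10000 minus the size of the input file."""
    return 0x10000 - input_size


print(hex(parse_addr("$E000")))           # hexadecimal input
print(hex(default_start_addr(8192)))      # an 8 KB file ends at $FFFF
```

With this default, an 8192 byte input file is placed so that its last byte
lands at $FFFF, which matches the ROM assumption mentioned for STARTADDR.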

3. Detailed workings

3.1 Supported CPUs

The default (no CPU given on the command line or in the GLOBAL section of the
info file) is the 6502 CPU. The disassembler knows all "official" opcodes for
this CPU. Invalid opcodes are translated into .byte commands.

With the command line option --cpu, the disassembler may be told to recognize
either the 65SC02 or 65C02 CPUs. The latter understands the same opcodes as the
former, plus 16 additional bit manipulation and bit test-and-branch commands.

While there is some code for the 65816 in the sources, it is currently
unsupported.

3.2 Attribute map

The disassembler works by creating an attribute map for the whole address space
($0000 - $FFFF). Initially, all attributes are cleared. Then, an external info
file (if given) is read. Disassembly is done in several passes. In all passes,
with the exception of the last one, information about the disassembled code is
gathered and added to the symbol and attribute maps. The last pass generates
output using the information from the maps.
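
As a rough mental model of the attribute map, the following Python sketch
keeps one attribute per address of the 64 KB address space and fills it from a
list of ranges. It is a toy model under stated assumptions: the Attr names and
the build_attr_map function are hypothetical, and the real disassembler tracks
more information than a single value per address.

```python
from enum import Enum


class Attr(Enum):
    # Hypothetical attribute values, chosen for this illustration only.
    DEFAULT = 0
    CODE = 1
    BYTETABLE = 2


def build_attr_map(ranges):
    """Return a list with one attribute for each address $0000-$FFFF.

    `ranges` is an iterable of (start, end, attr) tuples; the end address
    is inclusive, as it is for ranges in the info file.
    """
    attr_map = [Attr.DEFAULT] * 0x10000   # initially, all attributes cleared
    for start, end, attr in ranges:
        for addr in range(start, end + 1):
            attr_map[addr] = attr
    return attr_map


attr_map = build_attr_map([(0xE612, 0xE631, Attr.CODE),
                           (0xE632, 0xE640, Attr.BYTETABLE)])
print(attr_map[0xE631], attr_map[0xE632])
```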

3.3 Labels

Some instructions may generate labels in the first pass, while most other
instructions do not generate labels, but use them if they are available. Among
others, the branch and jump instructions will generate labels for the target of
the branch in the first pass. External labels (taken from the info file) have
precedence over internally generated ones. They must be valid identifiers as
specified for the ca65 assembler. Internal labels (generated by the
disassembler) have the form Labcd, where abcd is the hexadecimal address of the
label in upper case letters. You should probably avoid using such label names
for external labels.
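
The naming scheme for internal labels can be written down as a short Python
sketch. It assumes the address is always printed with four hexadecimal digits,
as the form Labcd suggests; the helper names internal_label and
looks_internal are invented for this example. The second helper may be handy
for checking the external label names in an info file against the advice
above.

```python
import re

# 'L' followed by exactly four upper case hexadecimal digits.
INTERNAL_LABEL = re.compile(r"L[0-9A-F]{4}")


def internal_label(addr: int) -> str:
    """Build a label of the form Labcd for a 16 bit address."""
    return f"L{addr:04X}"


def looks_internal(name: str) -> bool:
    """True if `name` has the same shape as a generated label."""
    return INTERNAL_LABEL.fullmatch(name) is not None


print(internal_label(0xE612))
print(looks_internal("LFFD2"), looks_internal("kplot"))
```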

3.4 Info File

The info file is used to pass additional information about the input code to the
disassembler. This includes label names, data areas or tables, and global
options like input and output file names. See the next section for more
information.

4. Info File Format

The info file contains lists of specifications grouped together. Each group
directive has an identifying token and an attribute list enclosed in curly
braces. Attributes have a name followed by a value. The syntax of the value
depends on the type of the attribute. String attributes are placed in double
quotes; numeric attributes may be specified as decimal numbers, or as
hexadecimal numbers with a leading dollar sign. There are also attributes where
the attribute value is a keyword; in this case, the keyword is given as-is
(without quotes or anything). Each attribute is terminated by a semicolon.

         group-name { attribute1 attribute-value; attribute2 attribute-value; }

4.1 Comments

Comments start with a hash mark (#) and extend from the position of the mark to
the end of the current line. Hash marks inside strings do not start a comment.
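
To make the syntax rules above concrete, here is a minimal Python sketch that
reads groups, attributes and comments as described so far. It is not the
parser used by da65. Several points are assumptions of the sketch: group and
attribute names are folded to upper case, a semicolon after the closing brace
is accepted because the example at the end of this guide writes "};", and
character constants are not handled. The names tokenize and parse_info are
invented for the example.

```python
import re

TOKEN = re.compile(r'''
    \s+                                   # whitespace
  | \#[^\n]*                              # comment up to the end of the line
  | "(?P<str>[^"\n]*)"                    # string; a hash inside is kept
  | \$(?P<hex>[0-9A-Fa-f]+)               # hexadecimal number
  | (?P<dec>[0-9]+)                       # decimal number
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)      # group, attribute or keyword
  | (?P<punct>[{};])                      # braces and semicolon
''', re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) tokens, skipping whitespace and comments."""
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "str":
            yield ("str", m.group("str"))
        elif kind == "hex":
            yield ("num", int(m.group("hex"), 16))
        elif kind == "dec":
            yield ("num", int(m.group("dec")))
        elif kind in ("word", "punct"):
            yield (kind, m.group(kind))


def parse_info(text):
    """Return a list of (group name, {attribute: value}) pairs."""
    tokens = list(tokenize(text))
    groups, i = [], 0
    try:
        while i < len(tokens):
            if tokens[i][0] != "word" or tokens[i + 1] != ("punct", "{"):
                raise ValueError("expected: group-name {")
            name, attrs = tokens[i][1].upper(), {}
            i += 2
            while tokens[i] != ("punct", "}"):
                key, value = tokens[i], tokens[i + 1]
                if key[0] != "word" or tokens[i + 2] != ("punct", ";"):
                    raise ValueError("expected: attribute value ;")
                attrs[key[1].upper()] = value[1]
                i += 3
            i += 1                        # skip the closing brace
            if i < len(tokens) and tokens[i] == ("punct", ";"):
                i += 1                    # optional ';' after the group
            groups.append((name, attrs))
    except IndexError:
        raise ValueError("unexpected end of input") from None
    return groups


print(parse_info('RANGE { START $E612; END $E631; TYPE Code; };  # code'))
```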

4.2 Specifying global options

Global options may be specified in a group with the name GLOBAL. The following
attributes are recognized:

ARGUMENTCOLUMN

        This attribute specifies the column in the output where the argument
        for an opcode or pseudo instruction starts. The corresponding command
        line option is --argument-column.

COMMENTCOLUMN

        This attribute specifies the column in the output where the comment
        starts in a line. It is only used for in-line comments. The
        corresponding command line option is --comment-column.

COMMENTS

        This attribute may be used instead of the --comments option on the
        command line. It takes a numerical parameter between 0 and 4. Higher
        values increase the amount of information written to the output file in
        the form of comments.

CPU

        This attribute may be used instead of the --cpu option on the command
        line. For possible values see there. The value is a string and must be
        enclosed in quotes.

HEXOFFS

        The attribute is followed by a boolean value. If true, offsets to labels
        are output in hex, otherwise they're output in decimal notation. The
        default is false. The attribute may be changed on the command line using
        the --hexoffs option.

INPUTNAME

        The attribute is followed by a string value, which gives the name of the
        input file to read. If it is present, the disassembler does not accept
        an input file name on the command line.

INPUTOFFS

        The attribute is followed by a numerical value that gives an offset into
        the input file which is skipped before reading data. The attribute may
        be used to skip headers or unwanted code sections in the input file.

INPUTSIZE

        INPUTSIZE is followed by a numerical value that gives the amount of data
        to read from the input file. Data beyond INPUTOFFS + INPUTSIZE is
        ignored.

LABELBREAK

        LABELBREAK is followed by a numerical value that specifies the label
        length that will force a newline. To have all labels on their own lines,
        you may set this value to zero.

        See also the --label-break command line option. A LABELBREAK statement
        in the info file will override any value given on the command line.

MNEMONICCOLUMN

        This attribute specifies the column in the output where the mnemonic or
        pseudo instruction is placed. The corresponding command line option is
        --mnemonic-column.

NEWLINEAFTERJMP

        This attribute is followed by a boolean value. When true, a newline is
        inserted after each JMP instruction. The default is false.

NEWLINEAFTERRTS

        This attribute is followed by a boolean value. When true, a newline is
        inserted after each RTS instruction. The default is false.

OUTPUTNAME

        The attribute is followed by a string value, which gives the name of
        the output file to write. If it is present, specification of an output
        file on the command line using the -o option is not allowed.

        The default is to use stdout for output, so without this attribute or
        the corresponding command line option -o the output will go to the
        terminal.

PAGELENGTH

        This attribute may be used instead of the --pagelength option on the
        command line. It takes a numerical parameter. Using zero as page length
        (which is the default) means that no pages are generated.

STARTADDR

        This attribute may be used instead of the --start-addr option on the
        command line. It takes a numerical parameter. The default for the start
        address is $10000 minus the size of the input file (this assumes that
        the input file is a ROM that contains the reset and irq vectors).

TEXTCOLUMN

        This attribute specifies the column where the data bytes are output
        translated into ASCII text. It is only used if COMMENTS is set to at
        least 4. The corresponding command line option is --text-column.
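
The effect of INPUTOFFS and INPUTSIZE described above amounts to taking a
slice of the input file. The following Python sketch shows that reading; the
function name select_input is made up for the example, and the behaviour when
only one of the two attributes is given is an assumption (skip nothing, or
read to the end of the file).

```python
from typing import Optional


def select_input(data: bytes, input_offs: int = 0,
                 input_size: Optional[int] = None) -> bytes:
    """Return the part of `data` that would be disassembled.

    The first `input_offs` bytes are skipped; anything beyond
    input_offs + input_size is ignored.
    """
    end = None if input_size is None else input_offs + input_size
    return data[input_offs:end]


image = bytes(range(16))
# Skip a 2 byte header and read the following 4 bytes.
print(select_input(image, input_offs=2, input_size=4).hex())
```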

4.3 Specifying Ranges

The RANGE directive is used to give information about address ranges. The
following attributes are recognized:

COMMENT

        This attribute is only allowed if a label is also given. It takes a
        string as argument. See the description of the LABEL directive for an
        explanation.

END

        This gives the end address of the range. The end address is inclusive;
        that is, it is part of the range. It must not be smaller than the start
        address.

NAME

        This is a convenience attribute. It takes a string argument and will
        cause the disassembler to define a label for the start of the range with
        the given name. So a separate LABEL directive is not needed.

START

        This gives the start address of the range.

TYPE

        This attribute specifies the type of data within the range. The
        attribute value is one of the following keywords:

             ADDRTABLE

                     The range consists of data and is disassembled as a table
                     of words (16 bit values). The difference to the WORDTABLE
                     type is that a label is defined for each entry in the
                     table.

             BYTETABLE

                     The range consists of data and is disassembled as a byte
                     table.

             CODE

                     The range consists of code.

             DBYTETABLE

                     The range consists of data and is disassembled as a table
                     of dbytes (double byte values, 16 bit values stored with
                     the most significant byte first, that is, at the lower
                     address).

             DWORDTABLE

                     The range consists of data and is disassembled as a table
                     of double words (32 bit values).

             RTSTABLE

                     The range consists of data and is disassembled as a table
                     of words (16 bit values). The values are interpreted as
                     addresses that are pushed onto the stack and then jumped
                     to via RTS. This means that they contain address-1 of a
                     function, for which a label will get defined by the
                     disassembler.

             SKIP

                     The range is simply ignored when generating the output
                     file. Please note that this means that reassembling the
                     output file will not generate the original file, not only
                     because of the missing piece in between, but also because
                     the following code will be located at the wrong addresses.
                     Output generated with SKIP ranges will need manual rework.

             TEXTTABLE

                     The range consists of readable text.

             WORDTABLE

                     The range consists of data and is disassembled as a table
                     of words (16 bit values).
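
The difference between the 16 and 32 bit table types above comes down to byte
order and, for RTSTABLE, an offset of one. The Python sketch below decodes raw
bytes accordingly. It assumes the usual 6502 conventions (words and double
words are stored low byte first, dbytes high byte first); the function names
are invented for the example.

```python
def words(data: bytes):
    """WORDTABLE / ADDRTABLE entries: 16 bit values, low byte first."""
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data) - 1, 2)]


def dbytes(data: bytes):
    """DBYTETABLE entries: 16 bit values, high byte first."""
    return [int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data) - 1, 2)]


def dwords(data: bytes):
    """DWORDTABLE entries: 32 bit values, low byte first."""
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data) - 3, 4)]


def rts_targets(data: bytes):
    """RTSTABLE entries hold address-1, so the target is the value plus 1."""
    return [(w + 1) & 0xFFFF for w in words(data)]


raw = bytes([0xFF, 0xE5])
print([hex(w) for w in words(raw)])        # the stored word
print([hex(t) for t in rts_targets(raw)])  # where RTS would continue
```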

4.4 Specifying Labels

The LABEL directive is used to give names for labels in the disassembled code.
The following attributes are recognized:

ADDR

        Followed by a numerical value. Specifies the value of the label.

COMMENT

        The attribute argument is a string. The comment will show up on a
        separate line before the label if the label is within a code or data
        range, or after the label if it is outside.

        Example output:

         foo     := $0001        ; Comment for label named "foo"

         ; Comment for label named "bar"
         bar:

NAME

        The attribute is followed by a string value which gives the name of the
        label. Empty names are allowed; in this case the disassembler will
        create an unnamed label (see the assembler docs for more information
        about unnamed labels).

SIZE

        This attribute is optional and may be used to specify the size of the
        data that follows. If a size greater than 1 is specified, the
        disassembler will create labels in the form label+offs for all bytes
        within the given range, where label is the label name given with the
        NAME attribute, and offs is the offset within the data.
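
The label+offs names produced for a SIZE greater than 1 can be listed with a
small Python sketch. It assumes decimal offsets, which is the default notation
according to the HEXOFFS description; the function name sized_label_names is
made up for the example.

```python
def sized_label_names(name: str, addr: int, size: int = 1):
    """Map every address covered by a label to the name used for it."""
    names = {addr: name}                  # the first byte gets the plain name
    for offs in range(1, size):
        names[addr + offs] = f"{name}+{offs}"
    return names


# LABEL { NAME "fnadr"; ADDR $90; SIZE 3; };
for address, label in sized_label_names("fnadr", 0x90, 3).items():
    print(f"${address:02X}  {label}")
```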

4.5 Specifying Segments

The SEGMENT directive is used to specify a segment within the disassembled code.
The following attributes are recognized:

START

        Followed by a numerical value. Specifies the start address of the
        segment.

END

        Followed by a numerical value. Specifies the end address of the segment.
        The end address is the last address that is a part of the segment.

NAME

        The attribute is followed by a string value which gives the name of the
        segment.

All attributes are mandatory. Segments must not overlap. The disassembler will
change back to the (default) .code segment after the end of each defined
segment. That might not be what you want. As a rule of thumb, if you're using
segments, you should define segments for all disassembled code.
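
Since segments must not overlap and their end addresses are inclusive, it can
be useful to check a list of segments before writing it into an info file. The
following Python sketch does that; it is a helper for the person writing the
info file, not a description of how da65 validates segments, and the name
find_overlaps is invented for the example.

```python
def find_overlaps(segments):
    """Report segments that overlap an earlier one.

    `segments` is a list of (name, start, end) tuples with inclusive end
    addresses. Returns a list of (earlier name, later name) pairs.
    """
    clashes = []
    widest = None                         # segment reaching furthest so far
    for seg in sorted(segments, key=lambda s: s[1]):
        name, start, end = seg
        if widest is not None and start <= widest[2]:
            clashes.append((widest[0], name))
        if widest is None or end > widest[2]:
            widest = seg
    return clashes


print(find_overlaps([("basic", 0xA000, 0xBFFF),
                     ("kernal", 0xE000, 0xFFFF)]))
```

Note that two segments sharing a single address, such as one ending at $F000
and another starting at $F000, already count as overlapping.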

4.6 Specifying Assembler Includes

The ASMINC directive is used to give the names of input files containing symbol
assignments in assembler syntax:

         Name = value
         Name := value

The usual conventions apply for symbol names. Values may be specified as hex
(leading $), binary (leading %) or decimal. The values may optionally be signed.

NOTE: The include file parser is very simple. Expressions are not allowed, and
anything but symbol assignments is flagged as an error (but see the
IGNOREUNKNOWN attribute below).

The following attributes are recognized:

FILE

        Followed by a string value. Specifies the name of the file to read.

COMMENTSTART

        The optional attribute is followed by a character constant. It specifies
        the character that starts a comment. The default value is a semicolon.
        This value is ignored if IGNOREUNKNOWN is true.

IGNOREUNKNOWN

        This attribute is optional and is followed by a boolean value. When
        true, input lines that don't have a valid syntax are ignored. This makes
        it possible to read assembler include files that contain more than just
        symbol assignments. Note: When this attribute is used, the disassembler
        will ignore any errors in the given include file. This may have
        undesired side effects.
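
The kind of line an ASMINC file may contain can be illustrated with a small
Python sketch. It is a simplified reading of the rules above and not the
parser used by da65: the identifier pattern is an assumption, and the sketch
strips comments in both modes, which may differ in detail from how the
disassembler treats COMMENTSTART when IGNOREUNKNOWN is true. The name
parse_asm_include is made up for the example.

```python
import re

# name, '=' or ':=', optional sign, then a hex, binary or decimal number
ASSIGN = re.compile(
    r"([A-Za-z_][A-Za-z0-9_]*)\s*:?=\s*([+-]?)"
    r"(\$[0-9A-Fa-f]+|%[01]+|[0-9]+)"
)


def parse_asm_include(lines, comment_start=";", ignore_unknown=False):
    """Collect symbol assignments from the lines of an include file."""
    symbols = {}
    for lineno, line in enumerate(lines, 1):
        code = line.split(comment_start, 1)[0].strip()
        if not code:
            continue                      # empty line or comment only
        m = ASSIGN.fullmatch(code)
        if m is None:
            if ignore_unknown:
                continue                  # silently skip anything else
            raise ValueError(f"line {lineno}: not a symbol assignment")
        name, sign, digits = m.groups()
        if digits[0] == "$":
            value = int(digits[1:], 16)   # leading $: hexadecimal
        elif digits[0] == "%":
            value = int(digits[1:], 2)    # leading %: binary
        else:
            value = int(digits)           # decimal
        symbols[name] = -value if sign == "-" else value
    return symbols


print(parse_asm_include(["VIC := $D000   ; video chip",
                         "mask = %1010",
                         "delta = -2"]))
```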

4.7 An Info File Example

The following is a short example of an info file that contains most of the
directives explained above:

         # This is a comment. It extends to the end of the line
         GLOBAL {
             OUTPUTNAME      "kernal.s";
             INPUTNAME       "kernal.bin";
             STARTADDR       $E000;
             PAGELENGTH      0;                  # No paging
             CPU             "6502";
         };

         # One segment for the whole stuff
         SEGMENT { START $E000;  END   $FFFF; NAME "kernal"; };

         RANGE { START $E612;    END   $E631; TYPE Code;      };
         RANGE { START $E632;    END   $E640; TYPE ByteTable; };
         RANGE { START $EA51;    END   $EA84; TYPE RtsTable;  };
         RANGE { START $EC6C;    END   $ECAB; TYPE RtsTable;  };
         RANGE { START $ED08;    END   $ED11; TYPE AddrTable; };

         # Zero-page variables
         LABEL { NAME "fnadr";   ADDR  $90;   SIZE 3;    };
         LABEL { NAME "sal";     ADDR  $93;   };
         LABEL { NAME "sah";     ADDR  $94;   };
         LABEL { NAME "sas";     ADDR  $95;   };

         # Stack
         LABEL { NAME "stack";   ADDR  $100;  SIZE 255;  };

         # Indirect vectors
         LABEL { NAME "cinv";    ADDR  $300;  SIZE 2;    };      # IRQ
         LABEL { NAME "cbinv";   ADDR  $302;  SIZE 2;    };      # BRK
         LABEL { NAME "nminv";   ADDR  $304;  SIZE 2;    };      # NMI

         # Jump table at end of kernal ROM
         LABEL { NAME "kscrorg"; ADDR  $FFED; };
         LABEL { NAME "kplot";   ADDR  $FFF0; };
         LABEL { NAME "kiobase"; ADDR  $FFF3; };
         LABEL { NAME "kgbye";   ADDR  $FFF6; };

         # Hardware vectors
         LABEL { NAME "hanmi";   ADDR  $FFFA; };
         LABEL { NAME "hares";   ADDR  $FFFC; };
         LABEL { NAME "hairq";   ADDR  $FFFE; };

5. Copyright

da65 (and all cc65 binutils) is (C) Copyright 1998-2011, Ullrich von Bassewitz.
For usage of the binaries and/or sources, the following conditions do apply:

This software is provided 'as-is', without any expressed or implied warranty. In
no event will the authors be held liable for any damages arising from the use of
this software.

Permission is granted to anyone to use this software for any purpose, including
commercial applications, and to alter it and redistribute it freely, subject to
the following restrictions:

 1. The origin of this software must not be misrepresented; you must not claim
    that you wrote the original software. If you use this software in a product,
    an acknowledgment in the product documentation would be appreciated but is
    not required.
 2. Altered source versions must be plainly marked as such, and must not be
    misrepresented as being the original software.
 3. This notice may not be removed or altered from any source distribution.