From 26b42926816662ce878e814938a1ebc0aa1847c2 Mon Sep 17 00:00:00 2001 From: "B. Watson" Date: Mon, 29 Aug 2016 15:00:13 -0400 Subject: finally made a git repo for this --- da65.txt | 628 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 628 insertions(+) create mode 100644 da65.txt (limited to 'da65.txt') diff --git a/da65.txt b/da65.txt new file mode 100644 index 0000000..19115f7 --- /dev/null +++ b/da65.txt @@ -0,0 +1,628 @@ + da65 Users Guide + +Ullrich von Bassewitz, +Greg King + +2014-11-23 + + ---------------------------------------------------------------------------- + +da65 is a 6502/65C02 disassembler that is able to read user-supplied information +about its input data, for better results. The output is ready for feeding into +ca65, the macro assembler supplied with the cc65 C compiler. + + ---------------------------------------------------------------------------- + +1. Overview + +2. Usage + + * 2.1 Command line option overview + * 2.2 Command line options in detail + +3. Detailed workings + + * 3.1 Supported CPUs + * 3.2 Attribute map + * 3.3 Labels + * 3.4 Info File + +4. Info File Format + + * 4.1 Comments + * 4.2 Specifying global options + * 4.3 Specifying Ranges + * 4.4 Specifying Labels + * 4.5 Specifying Segments + * 4.6 Specifying Assembler Includes + * 4.7 An Info File Example + +5. Copyright + + ---------------------------------------------------------------------------- + +1. Overview + +da65 is a disassembler for 6502/65C02 code. It is supplied as a utility with the +cc65 C compiler and generates output that is suitable for the ca65 macro +assembler. + +Besides generating output for ca65, one of the design goals was that the user is +able to feed additional information about the code into the disassembler, for +improved results. This information may include the location and size of tables, +and their format. 
+ +One nice advantage of this concept is that disassembly of copyrighted binaries +may be handled without problems: One can just pass the information file for +disassembling the binary, so everyone with a legal copy of the binary can +generate a nicely formatted disassembly with readable labels and other +information. + +2. Usage + +2.1 Command line option overview + +The disassembler accepts the following options: + + --------------------------------------------------------------------------- + Usage: da65 [options] [inputfile] + Short options: + -g Add debug info to object file + -h Help (this text) + -i name Specify an info file + -o name Name the output file + -v Increase verbosity + -F Add formfeeds to the output + -S addr Set the start/load address + -V Print the disassembler version + + Long options: + --argument-column n Specify argument start column + --comment-column n Specify comment start column + --comments n Set the comment level for the output + --cpu type Set cpu type + --debug-info Add debug info to object file + --formfeeds Add formfeeds to the output + --help Help (this text) + --hexoffs Use hexadecimal label offsets + --info name Specify an info file + --label-break n Add newline if label exceeds length n + --mnemonic-column n Specify mnemonic start column + --pagelength n Set the page length for the listing + --start-addr addr Set the start/load address + --text-column n Specify text start column + --verbose Increase verbosity + --version Print the disassembler version + --------------------------------------------------------------------------- + +2.2 Command line options in detail + +Here is a description of all the command line options: + +--argument-column n + + Specifies the column where the argument for a mnemonic or pseudo + instruction starts. + +--comment-column n + + Specifies the column where the comment for an instruction starts. + +--comments n + + Set the comment level for the output. Valid arguments are 0..4. 
Greater + values will increase the level of additional information written to the + output file in the form of comments. + +--cpu type + + Set the CPU type. The option takes a parameter, which may be one of: + * 6502 + * 6502x + * 65sc02 + * 65c02 + * huc6280 + + 6502x is for the NMOS 6502 with unofficial opcodes. huc6280 is the CPU + of the PC Engine. Support for the 65816 is currently not available. + +-F, --formfeeds + + Add formfeeds to the generated output. This feature is useful together + with the --pagelength option. If --formfeeds is given, a formfeed is + added to the output after each page. + +-g, --debug-info + + This option adds the .DEBUGINFO command to the output file, so the + assembler will generate debug information when re-assembling the + generated output. + +-h, --help + + Print the short option summary shown above. + +--hexoffs + + Output label offsets in hexadecimal instead of decimal notation. + +-i name, --info name + + Specify an info file. The info file contains global options that may + override or replace command line options, plus information about the + code that has to be disassembled. See the separate section Info File + Format. + +-o name + + Specify a name for an output file. The default is to use stdout, so + without this switch or the corresponding global option OUTPUTNAME, the + output will go to the terminal. + +--label-break n + + Adds a newline if the length of a label exceeds the given length. Note: + If the label would run into the code in the mid column, a linefeed is + always inserted regardless of this setting. + + This option overrides the global option LABELBREAK. + +--mnemonic-column n + + Specifies the column where a mnemonic or pseudo instruction is output. + +--pagelength n + + Sets the length of a listing page in lines. After this number of lines, + a new page header is generated. If the --formfeeds option is also given, a + formfeed is inserted before generating the page header. 
+ + A value of zero for the page length will disable paging of the output. + +-S addr, --start-addr addr + + Specify the start/load address of the binary code that is going to be + disassembled. The given address is interpreted as an octal value if + preceded with a '0' digit, as a hexadecimal value if preceded with '0x', + '0X', or '$', and as a decimal value in all other cases. If no start + address is specified, $10000 minus the size of the input file is used. + +--text-column n + + Specifies the column where additional text is output. This additional + text consists of the bytes encoded in this line in text representation. + +-v, --verbose + + Increase the disassembler verbosity. Usually only needed for debugging + purposes. You may use this option more than once for even more + verbose output. + +-V, --version + + Print the version number of the disassembler. If you send any suggestions + or bugfixes, please include the version number. + +3. Detailed workings + +3.1 Supported CPUs + +The default (no CPU given on the command line or in the GLOBAL section of the +info file) is the 6502 CPU. The disassembler knows all "official" opcodes for +this CPU. Invalid opcodes are translated into .byte commands. + +With the command line option --cpu, the disassembler may be told to recognize +either the 65SC02 or 65C02 CPUs. The latter understands the same opcodes as the +former, plus 16 additional bit manipulation and bit test-and-branch commands. + +While there is some code for the 65816 in the sources, it is currently +unsupported. + +3.2 Attribute map + +The disassembler works by creating an attribute map for the whole address space +($0000 - $FFFF). Initially, all attributes are cleared. Then, an external info +file (if given) is read. Disassembly is done in several passes. In all passes, +with the exception of the last one, information about the disassembled code is +gathered and added to the symbol and attribute maps. 
The last pass generates +output using the information from the maps. + +3.3 Labels + +Some instructions may generate labels in the first pass, while most other +instructions do not generate labels, but use them if they are available. Among +others, the branch and jump instructions will generate labels for the target of +the branch in the first pass. External labels (taken from the info file) have +precedence over internally generated ones. They must be valid identifiers as +specified for the ca65 assembler. Internal labels (generated by the +disassembler) have the form Labcd, where abcd is the hexadecimal address of the +label in upper case letters. You should probably avoid using such label names +for external labels. + +3.4 Info File + +The info file is used to pass additional information about the input code to the +disassembler. This includes label names, data areas or tables, and global +options like input and output file names. See the next section for more +information. + +4. Info File Format + +The info file contains lists of specifications grouped together. Each group +directive has an identifying token and an attribute list enclosed in curly +braces. Attributes have a name followed by a value. The syntax of the value +depends on the type of the attribute. String attributes are placed in double +quotes; numeric attributes may be specified as decimal numbers or as hexadecimal +numbers with a leading dollar sign. There are also attributes where the attribute value +is a keyword; in this case, the keyword is given as-is (without quotes or +anything). Each attribute is terminated by a semicolon. + + group-name { attribute1 attribute-value; attribute2 attribute-value; } + +4.1 Comments + +Comments start with a hash mark (#) and extend from the position of the mark +to the end of the current line. Hash marks inside of strings will not start a +comment, of course. + +4.2 Specifying global options + +Global options may be specified in a group with the name GLOBAL. 
The following +attributes are recognized: + +ARGUMENTCOLUMN + + This attribute specifies the column in the output where the argument + for an opcode or pseudo instruction starts. The corresponding command + line option is --argument-column. + +COMMENTCOLUMN + + This attribute specifies the column in the output where the comment + starts in a line. It is only used for in-line comments. The + corresponding command line option is --comment-column. + +COMMENTS + + This attribute may be used instead of the --comments option on the + command line. It takes a numerical parameter between 0 and 4. Higher + values increase the amount of information written to the output file in + the form of comments. + +CPU + + This attribute may be used instead of the --cpu option on the command + line. For possible values, see the description of that option. The value + is a string and must be enclosed in quotes. + +HEXOFFS + + The attribute is followed by a boolean value. If true, offsets to labels + are output in hex, otherwise they're output in decimal notation. The + default is false. The attribute may be changed on the command line using + the --hexoffs option. + +INPUTNAME + + The attribute is followed by a string value, which gives the name of the + input file to read. If it is present, the disassembler does not accept + an input file name on the command line. + +INPUTOFFS + + The attribute is followed by a numerical value that gives an offset into + the input file which is skipped before reading data. The attribute may + be used to skip headers or unwanted code sections in the input file. + +INPUTSIZE + + INPUTSIZE is followed by a numerical value that gives the amount of data + to read from the input file. Data beyond INPUTOFFS + INPUTSIZE is + ignored. + +LABELBREAK + + LABELBREAK is followed by a numerical value that specifies the label + length that will force a newline. To have all labels on their own lines, + you may set this value to zero. + + See also the --label-break command line option. 
A LABELBREAK statement + in the info file will override any value given on the command line. + +MNEMONICCOLUMN + + This attribute specifies the column in the output where the mnemonic or + pseudo instruction is placed. The corresponding command line option is + --mnemonic-column. + +NEWLINEAFTERJMP + + This attribute is followed by a boolean value. When true, a newline is + inserted after each JMP instruction. The default is false. + +NEWLINEAFTERRTS + + This attribute is followed by a boolean value. When true, a newline is + inserted after each RTS instruction. The default is false. + +OUTPUTNAME + + The attribute is followed by a string value, which gives the name of the + output file to write. If it is present, specification of an output file + on the command line using the -o option is not allowed. + + The default is to use stdout for output, so without this attribute or + the corresponding command line option -o the output will go to the + terminal. + +PAGELENGTH + + This attribute may be used instead of the --pagelength option on the + command line. It takes a numerical parameter. Using zero as page length + (which is the default) means that no pages are generated. + +STARTADDR + + This attribute may be used instead of the --start-addr option on the + command line. It takes a numerical parameter. The default for the start + address is $10000 minus the size of the input file (this assumes that + the input file is a ROM that contains the reset and irq vectors). + +TEXTCOLUMN + + This attribute specifies the column where the data bytes are output + translated into ASCII text. It is only used if COMMENTS is set to at + least 4. The corresponding command line option is --text-column. + +4.3 Specifying Ranges + +The RANGE directive is used to give information about address ranges. The +following attributes are recognized: + +COMMENT + + This attribute is only allowed if a label is also given. It takes a + string as argument. 
See the description of the LABEL directive for an + explanation. + +END + + This gives the end address of the range. The end address is inclusive; + that is, it is part of the range. Of course, it may not be smaller + than the start address. + +NAME + + This is a convenience attribute. It takes a string argument and will + cause the disassembler to define a label for the start of the range with + the given name. So a separate LABEL directive is not needed. + +START + + This gives the start address of the range. + +TYPE + + This attribute specifies the type of data within the range. The + attribute value is one of the following keywords: + + ADDRTABLE + + The range consists of data and is disassembled as a table + of words (16 bit values). The difference from the WORDTABLE + type is that a label is defined for each entry in the + table. + + BYTETABLE + + The range consists of data and is disassembled as a byte + table. + + CODE + + The range consists of code. + + DBYTETABLE + + The range consists of data and is disassembled as a table + of dbytes (double byte values, 16 bit values with the low + byte containing the most significant byte of the 16 bit + value). + + DWORDTABLE + + The range consists of data and is disassembled as a table + of double words (32 bit values). + + RTSTABLE + + The range consists of data and is disassembled as a table + of words (16 bit values). The values are interpreted as + words that are pushed onto the stack and then jumped to via + RTS. This means that they contain address-1 of a function, + for which a label will get defined by the disassembler. + + SKIP + + The range is simply ignored when generating the output + file. Please note that this means that reassembling the + output file will not generate the original file, not only + because of the missing piece in between, but also because the + following code will be located at the wrong addresses. Output + generated with SKIP ranges will need manual rework. 
+ + TEXTTABLE + + The range consists of readable text. + + WORDTABLE + + The range consists of data and is disassembled as a table + of words (16 bit values). + +4.4 Specifying Labels + +The LABEL directive is used to give names for labels in the disassembled code. +The following attributes are recognized: + +ADDR + + Followed by a numerical value. Specifies the value of the label. + +COMMENT + + Attribute argument is a string. The comment will show up in a separate + line before the label, if the label is within a code or data range, or + after the label if it is outside. + + Example output: + + foo := $0001 ; Comment for label named "foo" + + ; Comment for label named "bar" + bar: + +NAME + + The attribute is followed by a string value which gives the name of the + label. Empty names are allowed; in this case the disassembler will + create an unnamed label (see the assembler docs for more information + about unnamed labels). + +SIZE + + This attribute is optional and may be used to specify the size of the + data that follows. If a size greater than 1 is specified, the + disassembler will create labels in the form label+offs for all bytes + within the given range, where label is the label name given with the + NAME attribute, and offs is the offset within the data. + +4.5 Specifying Segments + +The SEGMENT directive is used to specify a segment within the disassembled code. +The following attributes are recognized: + +START + + Followed by a numerical value. Specifies the start address of the + segment. + +END + + Followed by a numerical value. Specifies the end address of the segment. + The end address is the last address that is a part of the segment. + +NAME + + The attribute is followed by a string value which gives the name of the + segment. + +All attributes are mandatory. Segments must not overlap. The disassembler will +change back to the (default) .code segment after the end of each defined +segment. That might not be what you want. 
As a rule of thumb, if you're using +segments, you should define segments for all disassembled code. + +4.6 Specifying Assembler Includes + +The ASMINC directive is used to give the names of input files containing symbol +assignments in assembler syntax: + + Name = value + Name := value + +The usual conventions apply for symbol names. Values may be specified as hex +(leading $), binary (leading %) or decimal. The values may optionally be signed. + +NOTE: The include file parser is very simple. Expressions are not allowed, and +anything but symbol assignments is flagged as an error (but see the +IGNOREUNKNOWN attribute below). + +The following attributes are recognized: + +FILE + + Followed by a string value. Specifies the name of the file to read. + +COMMENTSTART + + The optional attribute is followed by a character constant. It specifies + the character that starts a comment. The default value is a semicolon. + This value is ignored if IGNOREUNKNOWN is true. + +IGNOREUNKNOWN + + This attribute is optional and is followed by a boolean value. It allows + the disassembler to ignore input lines that don't have a valid syntax. This + makes it possible to read in assembler include files that contain more than + just symbol assignments. Note: When this attribute is used, the disassembler will + ignore any errors in the given include file. This may have undesired + side effects. + +4.7 An Info File Example + +The following is a short example of an info file that contains most of the +directives explained above: + + # This is a comment. 
It extends to the end of the line + GLOBAL { + OUTPUTNAME "kernal.s"; + INPUTNAME "kernal.bin"; + STARTADDR $E000; + PAGELENGTH 0; # No paging + CPU "6502"; + }; + + # One segment for the whole stuff + SEGMENT { START $E000; END $FFFF; NAME "kernal"; }; + + RANGE { START $E612; END $E631; TYPE Code; }; + RANGE { START $E632; END $E640; TYPE ByteTable; }; + RANGE { START $EA51; END $EA84; TYPE RtsTable; }; + RANGE { START $EC6C; END $ECAB; TYPE RtsTable; }; + RANGE { START $ED08; END $ED11; TYPE AddrTable; }; + + # Zero-page variables + LABEL { NAME "fnadr"; ADDR $90; SIZE 3; }; + LABEL { NAME "sal"; ADDR $93; }; + LABEL { NAME "sah"; ADDR $94; }; + LABEL { NAME "sas"; ADDR $95; }; + + # Stack + LABEL { NAME "stack"; ADDR $100; SIZE 255; }; + + # Indirect vectors + LABEL { NAME "cinv"; ADDR $300; SIZE 2; }; # IRQ + LABEL { NAME "cbinv"; ADDR $302; SIZE 2; }; # BRK + LABEL { NAME "nminv"; ADDR $304; SIZE 2; }; # NMI + + # Jump table at end of kernal ROM + LABEL { NAME "kscrorg"; ADDR $FFED; }; + LABEL { NAME "kplot"; ADDR $FFF0; }; + LABEL { NAME "kiobase"; ADDR $FFF3; }; + LABEL { NAME "kgbye"; ADDR $FFF6; }; + + # Hardware vectors + LABEL { NAME "hanmi"; ADDR $FFFA; }; + LABEL { NAME "hares"; ADDR $FFFC; }; + LABEL { NAME "hairq"; ADDR $FFFE; }; + +5. Copyright + +da65 (and all cc65 binutils) is (C) Copyright 1998-2011, Ullrich von Bassewitz. +For usage of the binaries and/or sources, the following conditions do apply: + +This software is provided 'as-is', without any expressed or implied warranty. In +no event will the authors be held liable for any damages arising from the use of +this software. + +Permission is granted to anyone to use this software for any purpose, including +commercial applications, and to alter it and redistribute it freely, subject to +the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not claim + that you wrote the original software. 
If you use this software in a product, + an acknowledgment in the product documentation would be appreciated but is + not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. -- cgit v1.2.3