da65 Users Guide
Ullrich von Bassewitz,
Greg King
2014-11-23
----------------------------------------------------------------------------
da65 is a 6502/65C02 disassembler that is able to read user-supplied information
about its input data, for better results. The output is ready for feeding into
ca65, the macro assembler supplied with the cc65 C compiler.
----------------------------------------------------------------------------
1. Overview
2. Usage
* 2.1 Command line option overview
* 2.2 Command line options in detail
3. Detailed workings
* 3.1 Supported CPUs
* 3.2 Attribute map
* 3.3 Labels
* 3.4 Info File
4. Info File Format
* 4.1 Comments
* 4.2 Specifying global options
* 4.3 Specifying Ranges
* 4.4 Specifying Labels
* 4.5 Specifying Segments
* 4.6 Specifying Assembler Includes
* 4.7 An Info File Example
5. Copyright
----------------------------------------------------------------------------
1. Overview
da65 is a disassembler for 6502/65C02 code. It is supplied as a utility with the
cc65 C compiler and generates output that is suitable for the ca65 macro
assembler.
Besides generating output for ca65, one of the design goals was that the user is
able to feed additional information about the code into the disassembler, for
improved results. This information may include the location and size of tables,
and their format.
One nice advantage of this concept is that disassembly of copyrighted binaries
may be handled without problems: one can distribute just the info file for
disassembling the binary, so everyone with a legal copy of the binary can
generate a nicely formatted disassembly with readable labels and other
information.
2. Usage
2.1 Command line option overview
The disassembler accepts the following options:
---------------------------------------------------------------------------
Usage: da65 [options] [inputfile]
Short options:
-g Add debug info to object file
-h Help (this text)
-i name Specify an info file
-o name Name the output file
-v Increase verbosity
-F Add formfeeds to the output
-S addr Set the start/load address
-V Print the disassembler version
Long options:
--argument-column n Specify argument start column
--comment-column n Specify comment start column
--comments n Set the comment level for the output
--cpu type Set cpu type
--debug-info Add debug info to object file
--formfeeds Add formfeeds to the output
--help Help (this text)
--hexoffs Use hexadecimal label offsets
--info name Specify an info file
--label-break n Add newline if label exceeds length n
--mnemonic-column n Specify mnemonic start column
--pagelength n Set the page length for the listing
--start-addr addr Set the start/load address
--text-column n Specify text start column
--verbose Increase verbosity
--version Print the disassembler version
---------------------------------------------------------------------------
2.2 Command line options in detail
Here is a description of all the command line options:
--argument-column n
Specifies the column where the argument for a mnemonic or pseudo
instruction starts.
--comment-column n
Specifies the column where the comment for an instruction starts.
--comments n
Set the comment level for the output. Valid arguments are 0..4. Greater
values increase the amount of additional information written to the
output file in the form of comments.
--cpu type
Set the CPU type. The option takes a parameter, which may be one of
* 6502
* 6502x
* 65sc02
* 65c02
* huc6280
6502x is for the NMOS 6502 with unofficial opcodes. huc6280 is the CPU
of the PC Engine. Support for the 65816 is currently not available.
-F, --formfeeds
Add formfeeds to the generated output. This feature is useful together
with the --pagelength option. If --formfeeds is given, a formfeed is
added to the output after each page.
-g, --debug-info
This option adds the .DEBUGINFO command to the output file, so the
assembler will generate debug information when re-assembling the
generated output.
-h, --help
Print the short option summary shown above.
--hexoffs
Output label offsets in hexadecimal instead of decimal notation.
-i name, --info name
Specify an info file. The info file contains global options that may
override or replace command line options, plus information about the
code that has to be disassembled. See the separate section Info File
Format.
-o name
Specify a name for an output file. The default is to use stdout, so
without this switch or the corresponding global option OUTPUTNAME, the
output will go to the terminal.
--label-break n
Adds a newline if the length of a label exceeds the given length. Note:
If the label would run into the code in the mid column, a linefeed is
always inserted regardless of this setting.
This option overrides the global option LABELBREAK.
--mnemonic-column n
Specifies the column where a mnemonic or pseudo instruction is output.
--pagelength n
Sets the length of a listing page in lines. After this number of lines,
a new page header is generated. If --formfeeds is also given, a
formfeed is inserted before generating the page header.
A value of zero for the page length will disable paging of the output.
-S addr, --start-addr addr
Specify the start/load address of the binary code that is going to be
disassembled. The given address is interpreted as an octal value if
preceded with a '0' digit, as a hexadecimal value if preceded with '0x',
'0X', or '$', and as a decimal value in all other cases. If no start
address is specified, $10000 minus the size of the input file is used.
--text-column n
Specifies the column where additional text is output. This additional
text consists of the bytes encoded in this line in text representation.
-v, --verbose
Increase the disassembler verbosity. Usually only needed for debugging
purposes. You may use this option more than one time for even more
verbose output.
-V, --version
Print the version number of the disassembler. If you send any suggestions
or bugfixes, please include the version number.
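The address notation described for -S/--start-addr, and the default start
address, can be illustrated with a short Python sketch. The function names
parse_addr and default_start are hypothetical and are not part of da65; the
code only mirrors the rules stated in the option description.

```python
def parse_addr(text: str) -> int:
    """Interpret an address string using the rules described for -S."""
    if text.startswith("$"):
        return int(text[1:], 16)          # '$' prefix: hexadecimal
    if text[:2] in ("0x", "0X"):
        return int(text[2:], 16)          # '0x'/'0X' prefix: hexadecimal
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)               # leading '0' digit: octal
    return int(text, 10)                  # everything else: decimal


def default_start(input_size: int) -> int:
    """Start address used when none is given: $10000 minus the input size."""
    return 0x10000 - input_size
```

Under these rules, an 8192 byte ROM image without an explicit start address
would be disassembled starting at $E000.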
3. Detailed workings
3.1 Supported CPUs
The default (no CPU given on the command line or in the GLOBAL section of the
info file) is the 6502 CPU. The disassembler knows all "official" opcodes for
this CPU. Invalid opcodes are translated into .byte commands.
With the command line option --cpu, the disassembler may be told to recognize
either the 65SC02 or 65C02 CPUs. The latter understands the same opcodes as the
former, plus 16 additional bit manipulation and bit test-and-branch commands.
While there is some code for the 65816 in the sources, it is currently
unsupported.
3.2 Attribute map
The disassembler works by creating an attribute map for the whole address space
($0000 - $FFFF). Initially, all attributes are cleared. Then, an external info
file (if given) is read. Disassembly is done in several passes. In all passes,
with the exception of the last one, information about the disassembled code is
gathered and added to the symbol and attribute maps. The last pass generates
output using the information from the maps.
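As a rough illustration of the first step, the following Python sketch builds
an attribute map that starts out cleared and is then filled from address
ranges. It is not da65's actual implementation: the name build_attribute_map,
the "none" marker and the use of strings as attributes are assumptions made
for this example only.

```python
ADDRESS_SPACE = 0x10000                   # $0000 - $FFFF
CLEARED = "none"                          # hypothetical "no attribute" marker


def build_attribute_map(ranges: list[tuple[int, int, str]]) -> list[str]:
    """Start with all attributes cleared, then apply (start, end, type) ranges."""
    attrs = [CLEARED] * ADDRESS_SPACE
    for start, end, kind in ranges:
        for addr in range(start, end + 1):    # end address is inclusive
            attrs[addr] = kind
    return attrs


# Two ranges in the style of an info file (values chosen for illustration).
attrs = build_attribute_map(
    [(0xE612, 0xE631, "Code"), (0xE632, 0xE640, "ByteTable")]
)
```

The later passes would then consult and extend such a map while gathering
information, and the final pass would read it to produce the output.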
3.3 Labels
Some instructions may generate labels in the first pass, while most other
instructions do not generate labels, but use them if they are available. Among
others, the branch and jump instructions will generate labels for the target of
the branch in the first pass. External labels (taken from the info file) have
precedence over internally generated ones. They must be valid identifiers as
specified for the ca65 assembler. Internal labels (generated by the
disassembler) have the form Labcd, where abcd is the hexadecimal address of the
label in upper case letters. You should probably avoid using such label names
for external labels.
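The naming and precedence rules can be sketched in a few lines of Python. The
function label_for and the dictionary of external labels are hypothetical and
serve only to illustrate the rules described in this section.

```python
def label_for(addr: int, external: dict[int, str]) -> str:
    """Return the external label for addr if one exists, else an internal one."""
    if addr in external:
        return external[addr]             # info-file labels take precedence
    return f"L{addr:04X}"                 # internal form: 'L' + 4 hex digits


# A single external label, as it might be given in an info file.
external = {0xFFFC: "hares"}
```

With this sketch, address $FFFC resolves to the external name, while an
address without an external label, such as $E612, gets an internal name of
the form Labcd.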
3.4 Info File
The info file is used to pass additional information about the input code to the
disassembler. This includes label names, data areas or tables, and global
options like input and output file names. See the next section for more
information.
4. Info File Format
The info file contains lists of specifications grouped together. Each group
directive has an identifying token and an attribute list enclosed in curly
braces. Attributes have a name followed by a value. The syntax of the value
depends on the type of the attribute. String attributes are placed in double
quotes; numeric attributes may be specified as decimal numbers or as hexadecimal
numbers with a leading dollar sign. There are also attributes where the attribute value
is a keyword; in this case, the keyword is given as-is (without quotes or
anything). Each attribute is terminated by a semicolon.
group-name { attribute1 attribute-value; attribute2 attribute-value; }
4.1 Comments
Comments start with a hash mark (#) and extend from the position of the mark
to the end of the current line. Hash marks inside strings do not start a
comment.
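The comment rule can be illustrated with a small Python sketch. The function
strip_comment is hypothetical, and the sketch assumes that strings contain no
escaped double quotes, since escape sequences are not described here.

```python
def strip_comment(line: str) -> str:
    """Remove a '#' comment, ignoring hash marks inside double-quoted strings."""
    in_string = False
    for pos, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string     # toggle on every double quote
        elif ch == "#" and not in_string:
            return line[:pos].rstrip()    # comment runs to the end of the line
    return line.rstrip()
```

A line such as `PAGELENGTH 0; # No paging` is reduced to `PAGELENGTH 0;`,
whereas a hash mark inside a quoted name is left untouched.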
4.2 Specifying global options
Global options may be specified in a group with the name GLOBAL. The following
attributes are recognized:
ARGUMENTCOLUMN
This attribute specifies the output column where the argument
for an opcode or pseudo instruction starts. The corresponding command
line option is --argument-column.
COMMENTCOLUMN
This attribute specifies the output column where the comment
starts in a line. It is only used for in-line comments. The
corresponding command line option is --comment-column.
COMMENTS
This attribute may be used instead of the --comments option on the
command line. It takes a numerical parameter between 0 and 4. Higher
values increase the amount of information written to the output file in
the form of comments.
CPU
This attribute may be used instead of the --cpu option on the command
line. For possible values see there. The value is a string and must be
enclosed in quotes.
HEXOFFS
The attribute is followed by a boolean value. If true, offsets to labels
are output in hex, otherwise they're output in decimal notation. The
default is false. The attribute may be changed on the command line using
the --hexoffs option.
INPUTNAME
The attribute is followed by a string value, which gives the name of the
input file to read. If it is present, the disassembler does not accept
an input file name on the command line.
INPUTOFFS
The attribute is followed by a numerical value that gives an offset into
the input file which is skipped before reading data. The attribute may
be used to skip headers or unwanted code sections in the input file.
INPUTSIZE
INPUTSIZE is followed by a numerical value that gives the amount of data
to read from the input file. Data beyond INPUTOFFS + INPUTSIZE is
ignored.
LABELBREAK
LABELBREAK is followed by a numerical value that specifies the label
length that will force a newline. To have all labels on their own lines,
you may set this value to zero.
See also the --label-break command line option. A LABELBREAK statement
in the info file will override any value given on the command line.
MNEMONICCOLUMN
This attribute specifies the output column where the mnemonic or
pseudo instruction is placed. The corresponding command line option is
--mnemonic-column.
NEWLINEAFTERJMP
This attribute is followed by a boolean value. When true, a newline is
inserted after each JMP instruction. The default is false.
NEWLINEAFTERRTS
This attribute is followed by a boolean value. When true, a newline is
inserted after each RTS instruction. The default is false.
OUTPUTNAME
The attribute is followed by a string value, which gives the name of the
output file to write. If it is present, specification of an output file
on the command line using the -o option is not allowed.
The default is to use stdout for output, so without this attribute or
the corresponding command line option -o the output will go to the
terminal.
PAGELENGTH
This attribute may be used instead of the --pagelength option on the
command line. It takes a numerical parameter. Using zero as page length
(which is the default) means that no pages are generated.
STARTADDR
This attribute may be used instead of the --start-addr option on the
command line. It takes a numerical parameter. The default for the start
address is $10000 minus the size of the input file (this assumes that
the input file is a ROM that contains the reset and irq vectors).
TEXTCOLUMN
This attribute specifies the column where the data bytes are output
translated into ASCII text. It is only used if COMMENTS is set to at
least 4. The corresponding command line option is --text-column.
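The combined effect of INPUTOFFS and INPUTSIZE can be pictured as a slice of
the input file. The following Python sketch uses a hypothetical helper,
select_input, and a made-up 16 byte input; it is an illustration of the
description above, not da65's code.

```python
from typing import Optional


def select_input(data: bytes, offs: int = 0, size: Optional[int] = None) -> bytes:
    """Skip offs bytes, then keep at most size bytes (all remaining if None)."""
    if size is None:
        return data[offs:]
    return data[offs:offs + size]         # data beyond offs + size is ignored


image = bytes(range(16))                  # made-up 16-byte input file
body = select_input(image, offs=2, size=8)
```

Here a 2 byte header is skipped and the following 8 bytes are kept; everything
after them is ignored.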
4.3 Specifying Ranges
The RANGE directive is used to give information about address ranges. The
following attributes are recognized:
COMMENT
This attribute is only allowed if a label is also given. It takes a
string as argument. See the description of the LABEL directive for an
explanation.
END
This gives the end address of the range. The end address is inclusive,
that is, it is part of the range. It must not be smaller
than the start address.
NAME
This is a convenience attribute. It takes a string argument and will
cause the disassembler to define a label for the start of the range with
the given name. So a separate LABEL directive is not needed.
START
This gives the start address of the range.
TYPE
This attribute specifies the type of data within the range. The
attribute value is one of the following keywords:
ADDRTABLE
The range consists of data and is disassembled as a table
of words (16 bit values). The difference from the WORDTABLE
type is that a label is defined for each entry in the
table.
BYTETABLE
The range consists of data and is disassembled as a byte
table.
CODE
The range consists of code.
DBYTETABLE
The range consists of data and is disassembled as a table
of dbytes (double byte values, that is, 16 bit values stored
with the most significant byte first, at the lower
address).
DWORDTABLE
The range consists of data and is disassembled as a table
of double words (32 bit values).
RTSTABLE
The range consists of data and is disassembled as a table
of words (16 bit values). The values are interpreted as
addresses that are pushed onto the stack and then jumped to
via RTS. This means that each entry contains address-1 of a
function, for which the disassembler will define a label.
SKIP
The range is simply ignored when generating the output
file. Please note that this means that reassembling the
output file will not generate the original file, not only
because of the missing piece in between, but also because the
following code will be located at the wrong addresses. Output
generated with SKIP ranges will need manual rework.
TEXTTABLE
The range consists of readable text.
WORDTABLE
The range consists of data and is disassembled as a table
of words (16 bit values).
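The difference between the 16 bit table types can be shown with a short Python
sketch. The helper names and the sample bytes are made up for this
illustration. Words are decoded low byte first, as usual on the 6502; wrapping
the RTS target to 16 bits is an assumption of the sketch.

```python
def words(data: bytes) -> list[int]:
    """WORDTABLE/ADDRTABLE entries: 16-bit values, low byte first."""
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


def dbytes(data: bytes) -> list[int]:
    """DBYTETABLE entries: 16-bit values, most significant byte first."""
    return [(data[i] << 8) | data[i + 1] for i in range(0, len(data) - 1, 2)]


def rts_targets(data: bytes) -> list[int]:
    """RTSTABLE entries hold address-1, so the labelled target is entry + 1."""
    # Wrapping to 16 bits is an assumption of this sketch.
    return [(w + 1) & 0xFFFF for w in words(data)]


raw = bytes([0x11, 0xE6, 0x50, 0xEA])     # made-up table contents
```

For these four bytes, the word interpretation yields $E611 and $EA50, the
dbyte interpretation yields $11E6 and $50EA, and the RTS table interpretation
yields the function addresses $E612 and $EA51.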
4.4 Specifying Labels
The LABEL directive is used to give names for labels in the disassembled code.
The following attributes are recognized:
ADDR
Followed by a numerical value. Specifies the value of the label.
COMMENT
Attribute argument is a string. The comment will show up on a separate
line before the label if the label is within a code or data range, or
after the label if it is outside.
Example output:
foo := $0001 ; Comment for label named "foo"
; Comment for label named "bar"
bar:
NAME
The attribute is followed by a string value which gives the name of the
label. Empty names are allowed; in this case the disassembler will
create an unnamed label (see the assembler docs for more information
about unnamed labels).
SIZE
This attribute is optional and may be used to specify the size of the
data that follows. If a size greater than 1 is specified, the
disassembler will create labels in the form label+offs for all bytes
within the given range, where label is the label name given with the
NAME attribute, and offs is the offset within the data.
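The effect of the SIZE attribute can be sketched in Python as follows. The
function sized_label_names is hypothetical; it only shows which name each
address in the range would be referred to by, using decimal offsets.

```python
def sized_label_names(name: str, addr: int, size: int = 1) -> dict[int, str]:
    """Map each address covered by a sized label to 'name' or 'name+offs'."""
    names = {addr: name}
    for offs in range(1, size):
        names[addr + offs] = f"{name}+{offs}"   # decimal offset
    return names


# A three-byte label at $90, as in: NAME "fnadr"; ADDR $90; SIZE 3;
fnadr = sized_label_names("fnadr", 0x90, 3)
```

For a label named fnadr at $90 with a size of 3, the addresses $91 and $92 are
referred to as fnadr+1 and fnadr+2.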
4.5 Specifying Segments
The SEGMENT directive is used to specify a segment within the disassembled code.
The following attributes are recognized:
START
Followed by a numerical value. Specifies the start address of the
segment.
END
Followed by a numerical value. Specifies the end address of the segment.
The end address is the last address that is a part of the segment.
NAME
The attribute is followed by a string value which gives the name of the
segment.
All attributes are mandatory. Segments must not overlap. The disassembler will
change back to the (default) .code segment after the end of each defined
segment. That might not be what you want. As a rule of thumb, if you're using
segments, you should define segments for all disassembled code.
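Since segment end addresses are inclusive, two segments overlap as soon as
they share a single address. A minimal Python sketch of such a check follows;
the function names are hypothetical and the check is not taken from da65
itself.

```python
def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two (start, end) segments share an address; end is inclusive."""
    return a[0] <= b[1] and b[0] <= a[1]


def check_segments(segments: dict[str, tuple[int, int]]) -> list[tuple[str, str]]:
    """Return every pair of segment names whose address ranges overlap."""
    names = sorted(segments)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if overlaps(segments[x], segments[y])
    ]
```

A segment ending at $1FFF and another one starting at $1FFF are reported as
overlapping, because $1FFF would belong to both.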
4.6 Specifying Assembler Includes
The ASMINC directive is used to give the names of input files containing symbol
assignments in assembler syntax:
Name = value
Name := value
The usual conventions apply for symbol names. Values may be specified as hex
(leading $), binary (leading %) or decimal. The values may optionally be signed.
NOTE: The include file parser is very simple. Expressions are not allowed, and
anything but symbol assignments is flagged as an error (but see the
IGNOREUNKNOWN attribute below).
The following attributes are recognized:
FILE
Followed by a string value. Specifies the name of the file to read.
COMMENTSTART
The optional attribute is followed by a character constant. It specifies
the character that starts a comment. The default value is a semicolon.
This value is ignored if IGNOREUNKNOWN is true.
IGNOREUNKNOWN
This attribute is optional and is followed by a boolean value. When true,
input lines that don't have a valid syntax are ignored. This makes it
possible to read assembler include files that contain more than just symbol
assignments. Note: When this attribute is used, the disassembler will
ignore any errors in the given include file. This may have undesired
side effects.
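The kind of line accepted in an include file can be approximated with a short
Python sketch. The function parse_assignment and the regular expression are
hypothetical; in particular, the identifier pattern is a simplification of
the symbol name conventions, and the sketch is not da65's parser.

```python
import re
from typing import Optional, Tuple

# Name, '=' or ':=', optional sign, then hex ($), binary (%) or decimal digits.
ASSIGN = re.compile(
    r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:?=\s*"
    r"([+-]?)(\$[0-9A-Fa-f]+|%[01]+|[0-9]+)\s*$"
)


def parse_assignment(line: str, comment_start: str = ";") -> Optional[Tuple[str, int]]:
    """Return (name, value) for a symbol assignment, or None if it is not one."""
    line = line.split(comment_start, 1)[0]      # drop a trailing comment
    match = ASSIGN.match(line)
    if match is None:
        return None
    name, sign, digits = match.groups()
    if digits.startswith("$"):
        value = int(digits[1:], 16)
    elif digits.startswith("%"):
        value = int(digits[1:], 2)
    else:
        value = int(digits, 10)
    return name, -value if sign == "-" else value
```

Lines for which the function returns None correspond to the input that is
either flagged as an error or, with IGNOREUNKNOWN set to true, skipped.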
4.7 An Info File Example
The following is a short example of an info file that contains most of the
directives explained above:
# This is a comment. It extends to the end of the line
GLOBAL {
OUTPUTNAME "kernal.s";
INPUTNAME "kernal.bin";
STARTADDR $E000;
PAGELENGTH 0; # No paging
CPU "6502";
};
# One segment for the whole stuff
SEGMENT { START $E000; END $FFFF; NAME "kernal"; };
RANGE { START $E612; END $E631; TYPE Code; };
RANGE { START $E632; END $E640; TYPE ByteTable; };
RANGE { START $EA51; END $EA84; TYPE RtsTable; };
RANGE { START $EC6C; END $ECAB; TYPE RtsTable; };
RANGE { START $ED08; END $ED11; TYPE AddrTable; };
# Zero-page variables
LABEL { NAME "fnadr"; ADDR $90; SIZE 3; };
LABEL { NAME "sal"; ADDR $93; };
LABEL { NAME "sah"; ADDR $94; };
LABEL { NAME "sas"; ADDR $95; };
# Stack
LABEL { NAME "stack"; ADDR $100; SIZE 255; };
# Indirect vectors
LABEL { NAME "cinv"; ADDR $300; SIZE 2; }; # IRQ
LABEL { NAME "cbinv"; ADDR $302; SIZE 2; }; # BRK
LABEL { NAME "nminv"; ADDR $304; SIZE 2; }; # NMI
# Jump table at end of kernal ROM
LABEL { NAME "kscrorg"; ADDR $FFED; };
LABEL { NAME "kplot"; ADDR $FFF0; };
LABEL { NAME "kiobase"; ADDR $FFF3; };
LABEL { NAME "kgbye"; ADDR $FFF6; };
# Hardware vectors
LABEL { NAME "hanmi"; ADDR $FFFA; };
LABEL { NAME "hares"; ADDR $FFFC; };
LABEL { NAME "hairq"; ADDR $FFFE; };
5. Copyright
da65 (and all cc65 binutils) is (C) Copyright 1998-2011, Ullrich von Bassewitz.
For usage of the binaries and/or sources, the following conditions do apply:
This software is provided 'as-is', without any expressed or implied warranty. In
no event will the authors be held liable for any damages arising from the use of
this software.
Permission is granted to anyone to use this software for any purpose, including
commercial applications, and to alter it and redistribute it freely, subject to
the following restrictions:
1. The origin of this software must not be misrepresented; you must not claim
that you wrote the original software. If you use this software in a product,
an acknowledgment in the product documentation would be appreciated but is
not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.