vasm is a portable and retargetable assembler able to create linkable objects in different formats as well as absolute code. Different CPU-, syntax and output-modules are supported.
Most common directives/pseudo-opcodes are supported (depending on the syntax module) as well as CPU-specific extensions.
The assembler supports optimizations and relaxations (e.g. choosing the shortest possible branch instruction or addressing mode as well as converting a branch to an absolute jump if necessary).
The concept is that you get a special vasm binary for any combination of
CPU- and syntax-module. All output modules, which make sense for the
selected CPU, are included in the vasm binary and you have to make sure to
choose the output file format you need (refer to the next chapter and look for
the -F option). The default is a test output format, only useful for
debugging or analyzing the output.
vasm is copyright in 2002-2023 by Volker Barthelmann.
This archive may be redistributed without modifications and used for non-commercial purposes.
An exception for commercial usage is granted, provided that the target CPU is M68k and the target OS is AmigaOS. Resulting binaries may be distributed commercially without further licensing.
In all other cases you need my written consent.
Certain modules may fall under additional copyrights.
Responsible for the current version of vasm and contact address in case of bug reports or support requests:
The vasm binaries do not need additional files, so no further installation is necessary. To use vasm with vbcc, copy the binary to vbcc/bin after following the installation instructions for vbcc.
The vasm binaries are named vasm<cpu>_<syntax> with
<cpu> representing the CPU-module and <syntax>
the syntax-module, e.g. vasm for PPC with the standard syntax
module is called vasmppc_std.
Sometimes the syntax-modifier may be omitted, e.g. vasmppc.
Detailed instructions on how to build vasm can be found in the last chapter.
This chapter describes the module-independent part of the assembler. It documents the options and extensions which are not specific to a certain target, syntax or output driver. Be sure to also read the chapters on the cpu-backend, syntax- and output-module you are using. They will likely contain important additional information like data-representation or additional options.
vasm is run from the command line using the following syntax:
vasm<target>_<syntax> [options] [sourcefile]
When the source file name is missing, the assembler reads the
source text from stdin until EOF (end-of-file,
which is CTRL-D in Unix shells, or CTRL-\ in AmigaOS
shells). Note that most debugging formats (DWARF, etc.) do not
work with such temporary source texts.
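The naming scheme and calling syntax above can be sketched as a small Python helper that builds the argument list. This is an illustration only: the function name vasm_command and the file names are hypothetical, the -F<fmt> and -o <ofile> spellings are taken from the option descriptions in this chapter, and whether a given output format is available depends on the CPU module of your binary.

```python
def vasm_command(cpu, syntax, source=None, fmt=None, outname=None):
    """Return an argv list for a vasm<cpu>_<syntax> call (illustrative only)."""
    argv = [f"vasm{cpu}_{syntax}"]       # e.g. vasmppc_std
    if fmt is not None:
        argv.append(f"-F{fmt}")          # output module, written without a blank
    if outname is not None:
        argv += ["-o", outname]          # -o needs a separating blank
    if source is not None:
        argv.append(source)              # without a source file, vasm reads stdin
    return argv


if __name__ == "__main__":
    # Hypothetical file names; "aout" is one of the format names mentioned in this chapter.
    print(" ".join(vasm_command("m68k", "mot", "hello.s", fmt="aout", outname="hello.o")))
```

The list form can be passed directly to a process-spawning function, which avoids quoting problems with file names that contain blanks.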
The following options are supported by the machine independent part
of vasm:
Issues a warning when a label matches a mnemonic or directive name in either upper or lower case.
Defines a symbol with the name <name> and assigns the value of the expression when given. The assigned value defaults to 1 otherwise.
Print all dependencies while assembling the source with the given
options. No output is generated. <type> may be the word list
for printing one file name in each new line, or make for
printing a sequence of file names on a single line, suitable for
Makefiles.
When the output file name is given by -o outname then
vasm will also print outname: in front of it.
Note that in contrast to option -dependall only relative
include file dependencies will be listed (which is the common case).
Prints dependencies in the same way as -depend, but also prints all include files with absolute paths.
Used together with -depend or -dependall and instructs vasm to output all dependencies to a new file, instead of stdout. Additionally, code will be generated in parallel to the dependencies output.
Automatically generate DWARF debugging sections, suitable for
source level debugging. When the version specification is missing,
DWARF V3 will be emitted. The only difference to V2 is that it
creates a .debug_ranges section, with address ranges for all
sections, instead of a bad workaround specifying
DW_AT_low_pc=0 and DW_AT_high_pc=~0.
Note, that when you build vasm from source, you may have to specify
your host operating system with -Dname in the Makefile to
include the appropriate code which can determine the current
work directory. Otherwise the default would be to set the current
work directory to an empty string. Currently supported are:
AMIGA, ATARI, MSDOS, UNIX,
_WIN32.
Enable escape character sequences. This will make vasm treat the escape character \ in string constants similarly to the C language.
Use module <fmt> as output driver. See the chapter on output drivers for available formats and options.
Define another include path. Include paths are searched in the order of occurrence on the command line, and always before any include paths defined in the source.
When the same file is included multiple times, using the same path, this is silently ignored, causing the file to be processed only once. Note, that you can still include the same file twice when using different paths to access it.
Enables generation of a listing file and directs the output into the file <listfile>.
List all symbols, including unused equates. Default is to list all labels and all used expressions only.
Set the maximum number of bytes per line in a listing file to <n>.
Defaults to 8 (fmt=wide).
Set the listing file format to <fmt>. Defaults to wide.
Available are: wide, old.
Show only program labels in the sorted symbol listing. Default is to list all symbols, including absolute expressions.
Do not show included source files in the listing file (fmt=wide).
Do not include symbols in the listing file (fmt=wide).
Sets the maximum number of errors to display before assembly is aborted. When <n> is 0 then there is no limit. Defaults to 5.
Defines the maximum number of recursion levels within a macro. Defaults to 1000.
Adjusts the maximum number of passes while resolving a section. Defaults to 1500.
Disables case-sensitivity for everything - identifiers, directives and instructions. Note that directives and instructions may already be case-insensitive by default in some modules.
Do not search for include files relative to the compile directory (where the main input source is located).
No escape character sequences. This will make vasm treat the escape character \ as any other character. Might be useful for compatibility.
Perform no automatic alignment for instructions. Note that unaligned instructions make your code crash when executed! Only set when you know what you are doing!
Disable the informational message <n>. <n> has to be the number of a valid informational message, like an optimization message.
Strips all local symbols from the output file and doesn’t include any other symbols than those which are required for external linkage.
Disable warning message <n>. <n> has to be the number of a valid warning message, otherwise an error is generated.
Write the generated assembler output to <ofile> rather than a.out.
The given padding value can be one or multiple bytes (up to the
cpu-backend’s address size). It is used for alignment purposes
and to fill gaps between absolute ORG sections in the
binary output module. Defaults to a zero-byte.
Try to generate position independent code. Every relocation entry is flagged by an error message.
Do not print the copyright notice and the final statistics.
Sections are no longer distinguished by their name, but only by their attributes. As a result, defining a second section with a different name but the same attributes as a first one switches to the first, instead of starting a new section. This option is set automatically when using an output module which doesn’t support section names, for example aout, tos or xfile.
The shift-right operator (>>) treats the value to shift as
unsigned, which has the effect that only 0-bits are inserted on the
left side. The number of bits in a value depends on the target
address type (refer to the appropriate cpu module documentation).
Uninitialized memory regions, declared by "space" directives
(.space in std-syntax, ds in mot-syntax, etc.)
are filled with the given value. Defaults to zero.
Hide all warning messages.
The return code of vasm will no longer be 0 (success), when there was a warning. Errors always make the return code non-zero (failure).
Print version and copyright messages from the assembler and all its modules, then exit.
Show an error message, when referencing an undefined symbol. The default behaviour is to declare this symbol as externally defined.
Note, that while most options allow an argument without any separating blank, some others require it (e.g. -o and -L).
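The difference between a signed and an unsigned shift-right, as described for the unsigned-shift option above, can be illustrated with a short Python model. It is a sketch under assumptions: the function name shift_right is hypothetical, the default behaviour is assumed to be an arithmetic (sign-preserving) shift, and a 32-bit width stands in for the target address type, which really depends on the cpu module.

```python
def shift_right(value, count, bits=32, unsigned=False):
    """Shift a two's-complement value of the given width to the right."""
    mask = (1 << bits) - 1
    value &= mask                          # interpret as a bits-wide pattern
    sign_set = (value >> (bits - 1)) & 1
    if unsigned or not sign_set:
        return value >> count              # only 0-bits enter from the left
    signed = value - (1 << bits)           # negative value: keep the sign
    return (signed >> count) & mask        # 1-bits enter from the left


if __name__ == "__main__":
    print(hex(shift_right(0xFFFFFFF0, 2)))                  # signed shift
    print(hex(shift_right(0xFFFFFFF0, 2, unsigned=True)))   # unsigned shift
```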
Standard expressions are usually evaluated by the main part of vasm rather than by one of the modules (unless this is necessary).
All expressions evaluated by the frontend are calculated in terms of target address values, i.e. the range depends on the backend. Constants which exceed the target address range may be supported by some backends up to 128 bits.
Backends also have the option to support floating point constants directly and convert them to a backend-specific format which is described in the backend’s documentation.
Warning: Be aware that the quality and precision of the backend’s floating point output depends on the combination of host- and backend-format! If you need absolute precision, encode the floating point constants yourself in binary.
The available operators include all those which are common in assembler as well as in C expressions.
C like operators:
+ - ! ~
+ - * / % << >>
& | ^
&& ||
< > <= >= == !=
Assembler like operators:
+ - ~
+ - * / // << >>
& ! ~
< > <= >= = <>
Up to version 1.4b the operators had the same precedence and associativity as in the C language. Newer versions have changed the operator priorities to comply with common assembler behaviour. The expression evaluation priorities, from highest to lowest, are:
+ - ! ~ (unary +/- sign, not, complement)
<< >> (shift left, shift right)
& (bitwise and)
^ ~ (bitwise exclusive-or)
| ! (bitwise inclusive-or)
* / % // (multiply, divide, modulo)
+ - (plus, minus)
< > <= >= (less, greater, less or equal, greater or equal)
== = != <> (equality, inequality)
&& (logical and)
|| (logical or)
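To make the effect of these priorities concrete, here is a small Python evaluator that follows the table above for a subset of the operators (shifts, bitwise and/xor/or, multiply, plus, minus, unary minus and complement). It is not vasm's implementation: the names are hypothetical, left-associativity is an assumption, and values are plain Python integers rather than target address values.

```python
import re

# Binary operator priorities taken from the list above (higher binds tighter).
PRIORITY = {"<<": 6, ">>": 6, "&": 5, "^": 4, "|": 3, "*": 2, "+": 1, "-": 1}

APPLY = {
    "<<": lambda a, b: a << b,
    ">>": lambda a, b: a >> b,
    "&": lambda a, b: a & b,
    "^": lambda a, b: a ^ b,
    "|": lambda a, b: a | b,
    "*": lambda a, b: a * b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
}

TOKEN = re.compile(r"\s*(\d+|<<|>>|[-+*&|^~()])")


def tokenize(text):
    """Split an expression into decimal numbers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate text with the operator priorities listed above."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = binary(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if tok == "-":
            return -primary()              # unary operators bind tightest
        if tok == "+":
            return primary()
        if tok == "~":
            return ~primary()
        return int(tok)

    def binary(min_priority):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and PRIORITY.get(tokens[pos], 0) >= min_priority:
            op = tokens[pos]
            pos += 1
            right = binary(PRIORITY[op] + 1)   # left-associative (assumption)
            left = APPLY[op](left, right)
        return left

    return binary(1)


if __name__ == "__main__":
    print(evaluate("1 + 2 << 3"), 1 + 2 << 3)   # assembler-style vs. C-style result
    print(evaluate("2 | 1 * 4"), 2 | 1 * 4)
```

With these priorities 1 + 2 << 3 groups as 1 + (2 << 3), whereas C-style precedence groups it as (1 + 2) << 3. Parentheses make the intended grouping explicit in sources that must assemble identically under both schemes.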
Operands are integral values of the target address type. They can either be
specified as integer constants of different bases (see the documentation
on the syntax module to see how the base is specified) or character
constants. Character constants are introduced by ' or "
and have to be terminated by the same character that started them.
Multiple characters are allowed and a constant is built according to the endianness of the target.
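As a rough illustration of how a multi-character constant could map to an integer, the following Python sketch packs the characters in big-endian or little-endian order. The function name char_constant is hypothetical and the exact byte layout is an assumption; consult the cpu module documentation for the authoritative behaviour.

```python
def char_constant(text, big_endian=True):
    """Pack the characters of a constant into one integer (assumed layout)."""
    data = text.encode("latin-1")          # one byte per character
    return int.from_bytes(data, "big" if big_endian else "little")


if __name__ == "__main__":
    print(hex(char_constant("AB", big_endian=True)))
    print(hex(char_constant("AB", big_endian=False)))
```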
When the -esc option was specified, or automatically enabled by a syntax module, vasm interprets escape character sequences as in the C language:
\\    Produces a single \.
\b    The bell character.
\f    Form feed.
\n    Line feed.
\r    Carriage return.
\t    Tabulator.
\"    Produces a single ".
\'    Produces a single '.
\e    Escape character (27).
\<octal-digits>    One character with the code specified by the digits as octal value.
\x<hexadecimal-digits>    One character with the code specified by the digits as hexadecimal value.
\X<hexadecimal-digits>    Same as \x.
Note, that the default behaviour of vasm has changed since V1.7! Escape
sequence handling has been the default in older versions. This was
changed to improve compatibility with other assemblers. Use -esc
to assemble sources with escape character sequences. It is still the
default in the std syntax module, though.
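The single-character escape sequences listed above can be modelled with a lookup table. The Python sketch below is illustrative only: the name decode_escapes is hypothetical, the numeric forms (octal and \x) are deliberately left out, and \b is mapped to the bell character (code 7) as this chapter states, which differs from C where \b is backspace.

```python
# Character following the backslash -> resulting character, per the list above.
SIMPLE_ESCAPES = {
    "\\": "\\",
    "b": "\x07",    # bell, as documented in this chapter
    "f": "\f",
    "n": "\n",
    "r": "\r",
    "t": "\t",
    '"': '"',
    "'": "'",
    "e": "\x1b",    # escape character (27)
}


def decode_escapes(text):
    """Replace the simple escape sequences in text; leave anything else untouched."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(decode_escapes(r"line1\nline2\t\e")))
```

With escape handling disabled, the same input would simply be kept character by character, backslashes included.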
You can define as many symbols as your available memory permits. A symbol may have any length and can be of global or local scope. Internally, there are three types of symbols:
Expression: These symbols are usually not visible outside the source, unless they are explicitly exported.
Label: Labels are always addresses within a program section. By default they have local scope for the linker.
Imported: These symbols are externally defined and must be resolved by the linker.
Beginning with vasm V1.5c at least one expression symbol is always defined
to allow conditional assembly depending on the assembler being used:
__VASM. Its value depends on the selected cpu module.
Since V1.8i there may be a second internal symbol which reflects the format of the paths in the host file system. Currently there may be one of:
__UNIXFS: Host file system uses Unix-style paths.
__MSDOSFS: Host file system uses MS-DOS-, Windows-, Atari-style paths.
__AMIGAFS: Host file system uses AmigaDOS-style paths.
Note that such a path-style symbol only depends on a -D option
given while compiling vasm from source. Refer to the section about
building vasm (Interface chapter) for a listing of all supported host
OS options.
There may be other internal symbols, which are defined by the syntax- or by the cpu module.
Vasm supports include files and defining include paths. Whether this functionality is available depends on the syntax module, which has to provide the appropriate directives.
On startup vasm defines one or two default include paths: the current work directory and, when the main source is not located there, the compile directory.
Include paths are searched in the following order:
Additionally, all the relative paths, defined by -I or directives, are first appended to the current work directory name, then to the compile directory name, while searching for an include file.
Searching for include files in paths based on the compile directory can be completely disabled by -nocompdir.
Macros are supported by vasm, but the directives for defining them have
to be implemented in the syntax module. The assembler core supports 9
macro arguments by default to be passed in the operand field,
which can be extended to any number by the syntax module.
They can be referenced inside the macro either by name (\name) or by
number (\1 to \9), or both, depending on the syntax module.
Recursions and early exits are supported.
Refer to the selected syntax module for more details.
Vasm supports structures, but the directives for defining them have to be implemented in the syntax module.
Has to be provided completely by the syntax module.
Some known module-independent problems of vasm at the moment:
All those who wrote parts of the vasm distribution, made suggestions,
answered my questions, tested vasm, reported errors or were otherwise
involved in the development of vasm (in descending alphabetical order,
under work, not complete):
The frontend has the following error messages:
This chapter describes the standard syntax module which is available
with the extension std.
This module is written in 2002-2023 by Volker Barthelmann and is covered by the vasm copyright without modifications.
This syntax module provides the following additional options:
Immediately allocate common symbols in the .bss/.sbss
section and define them as externally visible.
Enforces the backend’s natural alignment for all data directives
(.word, .long, .float, etc.).
Enable GNU-as compatibility mode. Currently this will only prevent labels prefixed by a dot to be recognized as local labels.
Recognize assembly directives without a leading dot (.).
Put data up to a maximum size of n bytes into the small-data sections. Default is n=0, which means the function is disabled.
Labels always have to be terminated by a colon (:), therefore
they don’t necessarily have to start at the first column of a line.
Local labels may either be preceded by a ’.’ (unless option
-gas was given) or terminated by ’$’,
and consist of digits only.
These labels exist and keep their value between two global label definitions.
A special form of reusable "local" labels, independent of global labels,
may be defined by using a single digit from 0 to 9.
You can reference the nearest previous digit-label with Nb and the
nearest following digit-label with Nf, where N is such a digit.
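The lookup rule for Nb and Nf references can be sketched in Python as follows. The function name, the representation of definitions as (line number, digit) pairs, and the choice to treat a label on the referencing line itself as "previous" are all assumptions made for illustration.

```python
def resolve_digit_label(definitions, ref, ref_line):
    """Find the line of the digit-label that a reference like '1b' or '1f' means.

    definitions: list of (line_number, digit) pairs for all digit-label definitions.
    Returns the matching line number, or None when no such label exists.
    """
    digit, direction = int(ref[:-1]), ref[-1]
    lines = [line for line, d in definitions if d == digit]
    if direction == "b":
        # nearest previous definition (same line counts as previous: assumption)
        candidates = [line for line in lines if line <= ref_line]
        return max(candidates) if candidates else None
    if direction == "f":
        # nearest following definition
        candidates = [line for line in lines if line > ref_line]
        return min(candidates) if candidates else None
    raise ValueError("reference must end in 'b' or 'f'")


if __name__ == "__main__":
    defs = [(1, 1), (5, 1), (9, 2)]        # label 1 on lines 1 and 5, label 2 on line 9
    print(resolve_digit_label(defs, "1b", 3), resolve_digit_label(defs, "1f", 3))
```

Because the same digit may be defined many times, such labels are convenient for short loops and forward skips where inventing unique names adds nothing.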
Make sure that you don’t define a label on the same line as a directive for conditional assembly (if, else, endif)! This is not supported.
The operands are separated from the mnemonic by whitespace.
Multiple operands are separated by comma (,).
Comments are introduced by the comment character #. The rest
of the line will be ignored. For the c16x, m68k, 650x, ARM, Z80, 6800,
6809, QNICE and Jaguar-RISC backends, the comment character is ;
instead of #, although # is still allowed when being the
first non-blank character on a line.
Example:
mylabel: inst.q1.q2 op1,op2,op3 # comment
In expressions, numbers starting with 0x or 0X are
hexadecimal (e.g. 0xfb2c). 0b or 0B introduces
binary numbers (e.g. 0b1100101). Other numbers starting with
0 are assumed to be octal numbers, e.g. 0237. All
numbers starting with a non-zero digit are decimal, e.g. 1239.
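The prefix rules for numbers can be summarised in a few lines of Python. This is only a model of the rules stated above, with a hypothetical function name; it does not handle signs, suffixes or any syntax beyond the four cases described.

```python
def parse_number(text):
    """Interpret a numeric literal according to its prefix."""
    lower = text.lower()
    if lower.startswith("0x"):
        return int(lower[2:], 16)          # hexadecimal
    if lower.startswith("0b"):
        return int(lower[2:], 2)           # binary
    if lower.startswith("0") and len(lower) > 1:
        return int(lower[1:], 8)           # other leading zero: octal
    return int(lower, 10)                  # decimal


if __name__ == "__main__":
    for literal in ("0xfb2c", "0b1100101", "0237", "1239"):
        print(literal, "=", parse_number(literal))
```

Note the octal rule: a decimal number written with a leading zero for padding is read as octal, so 0237 is not the same value as 237.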
C-like escape characters in string constants are allowed by default, unless disabled by -noesc.
All directives are case-insensitive. The following directives are supported by this syntax module (if the CPU- and output-module allow it):
.2byte <exp1>[,<exp2>...]
    See .uahalf.
.4byte <exp1>[,<exp2>...]
    See .ualong.
.8byte <exp1>[,<exp2>...]
    See .uaquad.
.ascii <exp1>[,<exp2>,"<string1>"...]
    See .byte.
.abort <message>
    Print an error and stop assembly immediately.
.asciiz "<string1>"[,"<string2>"...]
    See .string.
.align <bitorbyte_count>[,<fill>][,<maxpad>]
    Depending on the current CPU backend, .align behaves either like .balign (x86) or like .p2align (PPC).
.balign <byte_count>[,<fill>][,<maxpad>]
    Insert as many fill bytes as required to reach an address which is divisible by <byte_count>. For example, .balign 2 aligns to the next 16-bit boundary. The padding bytes are initialized with <fill>, when given. The optional third argument defines the maximum number of padding bytes to use. When more are needed, the alignment is not done at all.
.balignl <byte_count>[,<fill>][,<maxpad>]
    Works like .balign, with the only difference that the optional fill value can be specified as a 32-bit word. Padding locations which are not already 32-bit aligned cause a warning and are padded with zero-bytes.
.balignw <byte_count>[,<fill>][,<maxpad>]
    Works like .balign, with the only difference that the optional fill value can be specified as a 16-bit word. Padding locations which are not already 16-bit aligned cause a warning and are padded with zero-bytes.
.byte <exp1>[,<exp2>,"<string1>"...]
    Assign the integer or string constant operands into successive bytes of memory in the current section. Any combination of integer and character string constant operands is permitted.
.comm <symbol>,<size>[,<align>]
    Defines a common symbol which has a size of <size> bytes. The final size and alignment are assigned by the linker, which will use the highest size and alignment values of all common symbols with the same name found. A common symbol is usually allocated in the .bss section of the final executable. When the optional <align> argument is not given, .comm-areas of less than 8 bytes in size are aligned to word boundaries, otherwise to doubleword boundaries.
.double <exp1>[,<exp2>...]
    Parse one or more IEEE double precision floating point expressions and write them into successive blocks of 8 bytes in memory using the backend’s endianness.
.else
    Assemble the following block only if the previous .if condition was false.
.elseif <exp>
    Same as .else followed by .if, but without the need for an .endif. Avoids nesting.
.endif
    Ends a block of conditional assembly.
.endm
    Ends a macro definition.
.endr
    Ends a repetition block.
.equ <symbol>,<expression>
    See .set.
.equiv <symbol>,<expression>
    Assign the <expression> to <symbol> similar to .equ and .set, but signal an error when <symbol> has already been defined.
.err <message>
    Print a user error message. Do not create an output file.
.extern <symbol>[,<symbol>...]
    See .global.
.fail <expression>
    Cause a warning when <expression> is greater than or equal to 500. Otherwise cause an error.
.file "string"
    Set the filename of the input source. This may be used by some output modules. By default, the input filename passed on the command line is used.
.float <exp1>[,<exp2>...]
    Parse one or more IEEE single precision floating point expressions and write them into successive blocks of 4 bytes in memory using the backend’s endianness.
.global <symbol>[,<symbol>...]
    Flag <symbol> as an external symbol, which means that <symbol> is visible to all modules in the linking process. It may be either defined or undefined.
.globl <symbol>[,<symbol>...]
    See .global.
.half <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.
.if <expression>
    Conditionally assemble the following lines if <expression> is non-zero.
.ifeq <expression>
    Conditionally assemble the following lines if <expression> is zero.
.ifne <expression>
    Conditionally assemble the following lines if <expression> is non-zero.
.ifgt <expression>
    Conditionally assemble the following lines if <expression> is greater than zero.
.ifge <expression>
    Conditionally assemble the following lines if <expression> is greater than or equal to zero.
.iflt <expression>
    Conditionally assemble the following lines if <expression> is less than zero.
.ifle <expression>
    Conditionally assemble the following lines if <expression> is less than or equal to zero.
.ifb <operand>
    Conditionally assemble the following lines when <operand> is completely blank, except for an optional comment.
.ifnb <operand>
    Conditionally assemble the following lines when <operand> is non-blank.
.ifdef <symbol>
    Conditionally assemble the following lines if <symbol> is defined.
.ifndef <symbol>
    Conditionally assemble the following lines if <symbol> is undefined.
.incbin <file>
    Inserts the binary contents of <file> into the object code at this position.
.incdir <path>
    Add another path to search for include files to the list of known paths. Paths defined with -I on the command line are searched first.
.include <file>
    Include source text of <file> at this position.
.int <exp1>[,<exp2>...]
    Assign the values of the operands into successive words of memory in the current section using the target’s endianness and address size.
.irp <symbol>[,<val>...]
    Iterates the block between .irp and .endr for each <val>. The current <val>, which may be embedded in quotes, is assigned to \symbol. If no value is given, then the block is assembled once, with \symbol set to an empty string.
.irpc <symbol>[,<val>...]
    Iterates the block between .irpc and .endr for each character in each <val>, and assigns it to \symbol. If no value is given, then the block is assembled once, with \symbol set to an empty string.
.lcomm <symbol>,<size>[,<alignment>]
    Allocate <size> bytes of space in the .bss section and assign the address of that location to <symbol>. If <alignment> is given, then the space will be aligned to an address having <alignment> low zero bits or 2, whichever is greater. <symbol> may be made globally visible by the .globl directive.
.list
    The following lines will appear in the listing file, when enabled.
.local <symbol>[,<symbol>...]
    Flag <symbol> as a local symbol, which means that <symbol> is local for the current file and invisible to other modules in the linking process.
.long <exp1>[,<exp2>...]
    Assign the values of the operands into successive 32-bit words of memory in the current section using the backend’s endianness.
.macro <name> [<argname1>[=<default>][,<argname2>...]]
    Defines a macro, which can be referenced by <name>. The macro definition is closed by an .endm directive. The argument names, which may be passed to this macro, must be declared directly following the macro name, separated by white-space. You can define an optional default value in the case an argument is left out. Note that macro names are case-insensitive while the argument names are case-sensitive. Within the macro context arguments are referenced by \argname. The special argument \@ inserts a unique id, useful for defining labels. \() may be used as a separator between the name of a macro argument and the subsequent text.
.nolist
    The following lines will not be visible in a listing file.
.org <exp>[,<fill>]
    When used before any other section directive, <exp> defines the absolute start address of the program. Within a section, <exp> defines the offset from the start of this section for the subsequent code. The optional <fill> value is only valid within a section and is used to fill the space to the new program counter (defaults to zero). When <exp> starts with a current-pc symbol followed by a plus (+) operator, then the directive just reserves space (filled with zero).
.p2align <bit_count>[,<fill>][,<maxpad>]
    Insert as many fill bytes as required to reach an address where <bit_count> low order bits are zero. For example, .p2align 2 aligns to the next 32-bit boundary. The padding bytes are initialized with <fill>, when given. The optional third argument defines the maximum number of padding bytes to use. When more are needed, the alignment is not done at all.
.p2alignl <bit_count>[,<fill>][,<maxpad>]
    Works like .p2align, with the only difference that the optional fill value can be specified as a 32-bit word. Padding locations which are not already 32-bit aligned cause a warning and are padded with zero-bytes.
.p2alignw <bit_count>[,<fill>][,<maxpad>]
    Works like .p2align, with the only difference that the optional fill value can be specified as a 16-bit word. Padding locations which are not already 16-bit aligned cause a warning and are padded with zero-bytes.
.popsection
    Restore the top section from the internal section-stack. Also refer to .pushsection.
.pushsection <name>[,"<attributes>"][[,@<type>]|[,%<type>]|[,<mem_flags>]]
    Works exactly like .section, but additionally pushes the previously active section onto an internal stack, from which it may be restored by the .popsection directive.
.quad <exp1>[,<exp2>...]
    Assign the values of the operands into successive quadwords (64-bit) of memory in the current section using the backend’s endianness.
.rept <expression>
    Repeats the assembly of the block between .rept and .endr <expression> number of times. <expression> has to be positive.
.section <name>[,"<attributes>"][[,@<type>]|[,%<type>]|[,<mem_flags>]]
    Starts a new section named <name> or reactivates an old one. If attributes are given for an already existing section, they must match exactly. The section’s name will also be defined as a new symbol, which represents the section’s start address.
    The "<attributes>" string may consist of the following characters:
    Section Contents:
        c    section has code
        d    section has initialized data
        u    section has uninitialized data
        i    section has directives (info or offsets section)
        n    section can be discarded
        R    remove section at link time
        a    section is allocated in memory
    Section Protection:
        r    section is readable
        w    section is writable
        x    section is executable
        s    section is shareable
    Section Alignment: A digit, which is ignored. The assembler will automatically align the section to the highest alignment restriction used within.
    Memory flags (Amiga hunk format only):
        C    load section to Chip RAM
        F    load section to Fast RAM
    The optional <type> argument is mainly used for ELF output and may be introduced either by a ’%’ or a ’@’ character. Allowed are:
        progbits    This is the default value, which means the section data occupies space in the file and may have initialized data.
        nobits      These sections do not occupy any space in the file and will be allocated filled with zero bytes by the OS loader.
    When the optional, non-standard <mem_flags> argument is given, it defines a 32-bit memory attribute, which defines where to load the section (platform specific). The memory attributes are currently only used in the hunk-format output module.
.set <symbol>,<expression>
    Create a new program symbol with the name <symbol> and assign to it the value of <expression>. If <symbol> is already assigned, it will contain a new value from now on.
.size <symbol>,<size>
    Set the size in bytes of an object defined at <symbol>.
.short <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.
.single <exp1>[,<exp2>...]
    Parse one or more IEEE single precision floating point expressions and write them into successive blocks of 4 bytes in memory using the backend’s endianness.
.skip <exp>[,<fill>]
    Insert <exp> zero or <fill> bytes into the current section.
.space <exp>[,<fill>]
    Insert <exp> zero or <fill> bytes into the current section.
.stabs "<name>",<type>,<other>,<desc>,<exp>
    Add a stab-entry for debugging, including a symbol-string and an expression.
.stabn <type>,<other>,<desc>,<exp>
    Add a stab-entry for debugging, without a symbol-string.
.stabd <type>,<other>,<desc>
    Add a stab-entry for debugging, without symbol-string and value.
.string "<string1>"[,"<string2>"...]
    Like .byte, but adds a terminating zero-byte.
.swbeg <op>
    Just for compatibility. Does nothing.
.type <symbol>,<type>
    Set type of symbol named <symbol> to <type>, which must be one of:
        1: Object
        2: Function
        3: Section
        4: File
    The predefined symbols @object and @function are available for this purpose.
.uahalf <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.
.ualong <exp1>[,<exp2>...]
    Assign the values of the operands into successive 32-bit areas of memory in the current section regardless of current alignment.
.uaquad <exp1>[,<exp2>...]
    Assign the values of the operands into successive 64-bit areas of memory in the current section regardless of current alignment.
.uashort <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.
.uaword <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.
.weak <symbol>[,<symbol>...]
    Flag <symbol> as a weak symbol, which means that <symbol> is visible to all modules in the linking process and may be replaced by any global symbol with the same name. When a weak symbol remains undefined its value defaults to 0.
.word <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.
.zero <exp>[,<fill>]
    Insert <exp> zero or <fill> bytes into the current section.
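The arithmetic behind the .balign and .p2align families described in the directive list can be sketched in Python. The function names are hypothetical and the model only computes how many padding bytes would be inserted at a given address, including the rule that no alignment happens when more than <maxpad> bytes would be needed; fill values and warnings are not modelled.

```python
def balign_padding(address, byte_count, maxpad=None):
    """Padding bytes needed so that address becomes divisible by byte_count."""
    padding = -address % byte_count        # distance to the next multiple
    if maxpad is not None and padding > maxpad:
        return 0                           # too much padding: alignment is skipped
    return padding


def p2align_padding(address, bit_count, maxpad=None):
    """Padding bytes needed so that the bit_count low order bits become zero."""
    return balign_padding(address, 1 << bit_count, maxpad)


if __name__ == "__main__":
    addr = 0x1001
    print(balign_padding(addr, 2))             # align to a 16-bit boundary
    print(p2align_padding(addr, 2))            # align to a 32-bit boundary
    print(p2align_padding(addr, 2, maxpad=2))  # would need 3 bytes, so nothing is padded
```

In this model .p2align n is the same as .balign with a byte count of 2 to the power of n, which is why .align can map to either directive depending on the backend.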
Predefined section directives:
.bss        .section ".bss","aurw"
.data       .section ".data","adrw"
.rodata     .section ".rodata","adr"
.sbss       .section ".sbss","aurw"
.sdata      .section ".sdata","adrw"
.sdata2     .section ".sdata2","adr"
.stab       .section ".stab","dr"
.stabstr    .section ".stabstr","dr"
.text       .section ".text","acrx"
.tocd       .section ".tocd","adrw"
.dpage      .section ".dpage","adrw"
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the Motorola syntax module, mostly used for the
M68k and ColdFire families of CPUs, which is available with the extension
mot.
This module is written in 2002-2023 by Frank Wille and is covered by the vasm copyright without modifications.
This syntax module provides the following additional options:
Enables natural alignment for data (e.g. dc, ds) and
offset directives (rs, so, fo).
Makes all 35 macro arguments available. The default is 9 arguments (\1
to \9). More arguments can be accessed through \a to
\z, which may conflict with escape characters or named arguments;
therefore they are not enabled by default.
Sets a two-byte code used for alignment padding with cnop in
code sections. Defaults to 0x4e71 on M68k.
-devpac
Devpac-compatibility mode. Only directives known to Devpac are recognized.
Predefines the offset symbols __RS, __SO and __FO as
0, which otherwise are undefined until first referenced.
Does not use NOP instructions when aligning code
(see -cnop=).
-ldots
Allow dots (.) within all identifiers.
-localu
Local symbols are prefixed by '_' instead of '.'. For
Devpac compatibility, which offers a similar option.
-nolocpfx
Disables local symbols to be recognized by their prefix (usually
'.'). This allows global symbols to be defined with it.
The '$' suffix for local symbols still works.
-phxass
PhxAss-compatibility mode. Only directives known to PhxAss are recognized.
Defines the symbol _PHXASS_ with value 2 (to differentiate
from the real PhxAss with value 1).
-spaces
Allow whitespace characters in the operand field. Otherwise a whitespace would start the comment field there.
-warncomm
Warn about all lines which have comments in the operand field, introduced
by a whitespace character. For example in: dc.w 1 + 2.
Labels must either start at the first column of a line or have to be
terminated by a colon (:). In the first case the mnemonic
has to be separated from the label by whitespace (which is not required in
every case, e.g. not with the = directive). A double colon (::)
automatically makes the label externally visible (see also xdef).
Local labels are either prefixed by ’.’ or suffixed by ’$’.
For the rest, any alphanumeric character including ’_’ is allowed.
Local labels are valid between two global label definitions.
Otherwise dots (.) are not allowed within a label by default, unless
the option -ldots or -devpac was specified. Even then,
labels ending on .b, .w or .l can’t be defined.
It is possible to refer to any local symbol in the source by preceding its
name with the name of the last previously defined global symbol:
global_name\local_name. This is for PhxAss compatibility only,
and is not a recommended style. It does not work in a macro, as it conflicts
with macro arguments.
Make sure that you don’t define a label on the same line as a directive for conditional assembly (if, else, endif)! This is not supported.
Qualifiers are appended to the mnemonic,
separated by a dot (if the CPU-module supports qualifiers). The
operands are separated from the mnemonic by whitespace. Multiple
operands are separated by comma (,).
In this syntax module, the operand field must not contain any whitespace characters, unless the option -spaces was specified.
Comments can be introduced anywhere by the characters ; or *.
The rest of the line will be ignored. Also everything following the operand
field, separated by whitespace, will be regarded as comment (unless
-spaces was given). Be careful with *, which is either
recognized as the "current pc symbol" or as a multiplication operation
in any operand expression.
Example:
mylabel inst.q op1,op2,op3 ;comment
In expressions, numbers starting with $ are hexadecimal (e.g.
$fb2c). % introduces binary numbers (e.g. %1100101).
Numbers starting with @ are assumed to be octal numbers, e.g.
@237. All numbers starting with a digit are decimal, e.g.
1239.
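The rules above for labels, comments and number formats can be seen together in a short sketch. It assumes the M68k CPU module, and all label names are made up for illustration:

```asm
; Sketch only: assumes the M68k CPU module, label names are hypothetical.
start:  move.l  #$fb2c,d0       ; hexadecimal constant
        moveq   #%1100101,d1    ; binary constant (decimal 101)
        move.w  #@237,d2        ; octal constant (decimal 159)
.loop   subq.w  #1,d2           ; local label in the first column, no colon needed
        bne.s   .loop           ; .loop is valid until the next global label
done::  rts                     ; double colon makes the label externally visible
* a line starting with an asterisk is a comment as well
```

Note that no operand contains whitespace, so the sketch does not depend on the -spaces option.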
The following directives are supported by this syntax module (provided the CPU- and output-module support them):
<symbol> = <expression>
Equivalent to <symbol> equ <expression>.
<symbol> =.s <expression>
Equivalent to <symbol> fequ.s <expression>. PhxAss compatibility.
<symbol> =.d <expression>
Equivalent to <symbol> fequ.d <expression>. PhxAss compatibility.
<symbol> =.x <expression>
Equivalent to <symbol> fequ.x <expression>. PhxAss compatibility.
<symbol> =.p <expression>
Equivalent to <symbol> fequ.p <expression>. PhxAss compatibility.
align <bitcount>
Insert as many zero bytes as required to reach an address where
<bitcount> low order bits are zero. For example align 2 would
make an alignment to the next 32-bit boundary.
assert <expression>[,<message>]
Display an error with the optional <message> when the expression is false.
blk.b <exp>[,<fill>]
Equivalent to dcb.b <exp>,<fill>.
blk.d <exp>[,<fill>]
Equivalent to dcb.d <exp>,<fill>.
blk.l <exp>[,<fill>]
Equivalent to dcb.l <exp>,<fill>.
blk.q <exp>[,<fill>]
Equivalent to dcb.q <exp>,<fill>.
blk.s <exp>[,<fill>]
Equivalent to dcb.s <exp>,<fill>.
blk.w <exp>[,<fill>]
Equivalent to dcb.w <exp>,<fill>.
blk.x <exp>[,<fill>]
Equivalent to dcb.x <exp>,<fill>.
bss
Equivalent to section bss,bss.
bss_c
Equivalent to section bss_c,bss,chip.
bss_f
Equivalent to section bss_f,bss,fast.
cargs [#<offset>,]<symbol1>[.<size1>][,<symbol2>[.<size2>]]...
Defines <symbol1> with the value of <offset>. Further symbols
on the line, separated by comma, will be assigned the <offset> plus
the size of the previous symbol. The size defaults to 2. Valid
optional size extensions are: .b, .w, .l,
where .l results in a size of 4, the others 2.
The <offset> argument defaults to the target’s address size
(4 for M68k) when omitted.
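As a sketch of how cargs assigns its offsets, consider a hypothetical routine that sets up a frame with link a6,#0, so that its first stack argument is found at 8(a6). The routine and symbol names are assumptions made for this example:

```asm
; Sketch: stack arguments of a hypothetical routine. After link a6,#0
; the saved a6 is at 0(a6) and the return address at 4(a6).
        cargs   #8,arg_ptr.l,arg_len.w,arg_flags.w
; arg_ptr = 8, arg_len = 8+4 = 12, arg_flags = 12+2 = 14

copy:   link    a6,#0
        move.l  arg_ptr(a6),a0  ; first argument (32 bits)
        move.w  arg_len(a6),d0  ; second argument (16 bits)
        unlk    a6
        rts
```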
clrfo
Reset stack-frame offset counter to zero. See fo directive.
clrso
Reset structure offset counter to zero. See so directive.
cnop <offset>,<alignment>
Insert as many padding bytes as required to reach an address which can be divided by <alignment>. Then add <offset> padding bytes. May fill the padding bytes with no-operation instructions for certain CPUs.
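The following sketch walks through the cnop arithmetic as described above. The offsets in the comments assume that the section starts at offset 0, and the section and label names are hypothetical:

```asm
; Sketch: offsets assume the section starts at offset 0.
        section tables,data
        dc.b    1,2,3           ; occupies offsets 0-2
        cnop    0,4             ; 1 padding byte, next offset is 4
longs:  dc.l    $12345678       ; occupies offsets 4-7
        dc.b    $ff             ; occupies offset 8
        cnop    2,4             ; pad to offset 12, then 2 more bytes
words:  dc.w    $abcd           ; starts at offset 14
        even                    ; same as cnop 0,2, no padding needed here
```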
code
Equivalent to section code,code.
code_c
Equivalent to section code_c,code,chip.
code_f
Equivalent to section code_f,code,fast.
comm <symbol>,<size>
Create a common symbol with the given size. The alignment is always 32 bits.
comment
Starting with the operand field everything is ignored and
seen as a comment.
There is only one exception, when the operand contains HEAD=.
Then the following expression is passed to the TOS output module
via the symbol ’ TOSFLAGS’, to define the Atari specific TOS
flags.
cseg
Equivalent to section code,code.
data
Equivalent to section data,data.
data_c
Equivalent to section data_c,data,chip.
data_f
Equivalent to section data_f,data,fast.
db <exp1>[,<exp2>,"<string1>",'<string2>'...]
Equivalent to dc.b for ArgAsm, BAsm, HX68, Macro68, ProAsm, etc.
compatibility. Does not exist in PhxAss- or Devpac-compatibility mode.
dc.b <exp1>[,<exp2>,"<string1>",'<string2>'...]
Assign the integer or string constant operands into successive bytes of memory in the current section. Any combination of integer and character string constant operands is permitted.
dc.d <exp1>[,<exp2>...]
Assign the values of the operands into successive 64-bit words of memory in the current section, using the IEEE double precision format when specifying them as floating point constants.
dc.l <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in the current section.
dc.p <exp1>[,<exp2>...]
Assign the values of the operands into successive 96-bit words of memory in the current section, using the Packed Decimal format when specifying them as floating point constants.
dc.q <exp1>[,<exp2>...]
Assign the values of the operands into successive 64-bit words of memory in the current section.
dc.s <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in the current section, using the IEEE single precision format when specifying them as floating point constants.
dc.w <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the current section.
dc.x <exp1>[,<exp2>...]
Assign the values of the operands into successive 96-bit words of memory in the current section, using the IEEE extended precision format when specifying them as floating point constants.
dcb.b <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
dcb.d <exp>[,<fill>]
Insert <exp> zero or <fill> 64-bit words into the current section. <fill> may also be a floating point constant which is then written in IEEE double precision format.
dcb.l <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section.
dcb.q <exp>[,<fill>]
Insert <exp> zero or <fill> 64-bit words into the current section.
dcb.s <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section. <fill> may also be a floating point constant which is then written in IEEE single precision format.
dcb.w <exp>[,<fill>]
Insert <exp> zero or <fill> 16-bit words into the current section.
dcb.x <exp>[,<fill>]
Insert <exp> zero or <fill> 96-bit words into the current section. <fill> may also be a floating point constant which is then written in IEEE extended precision format.
dl <exp1>[,<exp2>...]
Equivalent to dc.l for ArgAsm, BAsm, HX68, Macro68, ProAsm, etc.
compatibility. Does not exist in PhxAss- or Devpac-compatibility mode.
dr.b <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive bytes of memory in the current section.
dr.w <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive 16-bit words of memory in the current section.
dr.l <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive 32-bit words of memory in the current section.
ds.b <exp>
Equivalent to dcb.b <exp>,0.
ds.d <exp>
Equivalent to dcb.d <exp>,0.
ds.l <exp>
Equivalent to dcb.l <exp>,0.
ds.q <exp>
Equivalent to dcb.q <exp>,0.
ds.s <exp>
Equivalent to dcb.s <exp>,0.
ds.w <exp>
Equivalent to dcb.w <exp>,0.
ds.x <exp>
Equivalent to dcb.x <exp>,0.
dseg
Equivalent to section data,data.
dw <exp1>[,<exp2>...]
Equivalent to dc.w for ArgAsm, BAsm, HX68, Macro68, ProAsm, etc.
compatibility. Does not exist in PhxAss- or Devpac-compatibility mode.
dx.b <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.b <exp>,0.
dx.d <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.d <exp>,0.
dx.l <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.l <exp>,0.
dx.q <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.q <exp>,0.
dx.s <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.s <exp>,0.
dx.w <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.w <exp>,0.
dx.x <exp>
Tries to allocate space in the DataBss portion of a code or
data section. Otherwise equivalent to dcb.x <exp>,0.
echo <"string"|exp>[,<"string"|exp>]...
Prints one or more strings or expressions to stdout, terminated by a newline. Strings are identified by single- or double-quotes. In PhxAss-compatibility mode only a single string can be printed.
einline
End a block of isolated local labels, started by inline.
else
Assemble the following lines if the previous if condition
was false.
elseif
Same as else, for compatibility!
elif <exp>
This is a real else-if directive! Not supported by Devpac.
It’s the same as else followed by if, but without the
need for a matching endif directive. Avoids nesting.
end
Assembly will terminate with this line. The subsequent source text is ignored.
endif
Ends a section of conditional assembly.
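A minimal sketch of a conditional block using else, elif and endif, assuming Devpac-compatibility mode is not active (elif is not known to Devpac). LEVEL is a hypothetical build setting:

```asm
; Sketch: LEVEL is a hypothetical build setting.
LEVEL   equ     2
        if      LEVEL==1
        moveq   #1,d0
        elif    LEVEL==2        ; else-if without a nested if/endif pair
        moveq   #2,d0           ; this branch is assembled
        else
        moveq   #0,d0
        endif
```

None of the conditional directives carries a label on its line, in line with the restriction stated earlier in this chapter.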
endm
Ends a macro definition.
endr
Ends a repetition block.
<symbol> equ <expression>
Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.
<symbol> equ.s <expression>
Equivalent to <symbol> fequ.s <expression>. PhxAss compatibility.
<symbol> equ.d <expression>
Equivalent to <symbol> fequ.d <expression>. PhxAss compatibility.
<symbol> equ.x <expression>
Equivalent to <symbol> fequ.x <expression>. PhxAss compatibility.
<symbol> equ.p <expression>
Equivalent to <symbol> fequ.p <expression>. PhxAss compatibility.
erem
Ends a commented-out block. Assembly will continue.
even
Aligns to an even address. Equivalent to cnop 0,2.
fail <message>
Show an error message including the <message> string. Do not generate an output file.
<symbol> fequ.s <expression>
Define a new program symbol with the name <symbol> and assign to it the floating point value of <expression>. Defining <symbol> twice will cause an error. The extension is for Devpac-compatibility, but will be ignored.
<symbol> fequ.d <expression>
Equivalent to <symbol> fequ.s <expression>.
<symbol> fequ.x <expression>
Equivalent to <symbol> fequ.s <expression>.
<symbol> fequ.p <expression>
Equivalent to <symbol> fequ.s <expression>.
<label> fo.<size> <expression>
Assigns the current value of the stack-frame offset counter to <label>.
Afterwards the counter is decremented by the instruction’s <size>
multiplied by <expression>. Any valid M68k size extension is allowed
for <size>: b, w, l, q, s, d, x, p.
The offset counter can also be referenced directly under the name
__FO.
idnt <name>
Sets the file or module name in the generated object file to <name>, when the selected output module supports it. By default, the input filename passed on the command line is used.
if <expression>
Conditionally assemble the following lines if <expression> is non-zero.
if1
Just for compatibility. Not really supported, as vasm parses a source text only once. Always true.
if2
Just for compatibility. Not really supported, as vasm parses a source text only once. Always false.
ifeq <expression>
Conditionally assemble the following lines if <expression> is zero.
ifne <expression>
Conditionally assemble the following lines if <expression> is non-zero.
ifgt <expression>
Conditionally assemble the following lines if <expression> is greater than zero.
ifge <expression>
Conditionally assemble the following lines if <expression> is greater than or equal to zero.
iflt <expression>
Conditionally assemble the following lines if <expression> is less than zero.
ifle <expression>
Conditionally assemble the following lines if <expression> is less than or equal to zero.
ifb <operand>
Conditionally assemble the following lines when <operand> is completely blank, except for an optional comment.
ifnb <operand>
Conditionally assemble the following lines when <operand> is non-blank.
ifc <string1>,<string2>
Conditionally assemble the following lines if <string1> matches <string2>.
ifnc <string1>,<string2>
Conditionally assemble the following lines if <string1> does not match <string2>.
ifd <symbol>
Conditionally assemble the following lines if <symbol> is defined.
ifnd <symbol>
Conditionally assemble the following lines if <symbol> is undefined.
ifmacrod <macro>
Conditionally assemble the following lines if <macro> is defined.
ifmacrond <macro>
Conditionally assemble the following lines if <macro> is undefined.
ifp1
Just for compatibility. Equivalent to if1.
iif <expression> <statement>
Conditionally assemble the <statement> following <expression>.
IIF stands for Immediate IF.
If the value of <expression> is non-zero then <statement> is assembled.
This directive is not terminated by an ENDC.
The <statement> cannot include a label, but a label may precede the
IIF directive. For example:
foo IIF bar equ 42
The foo label will be assigned 42 if bar
evaluates to true, otherwise foo will be assigned the
current program counter.
Assigning a value in the IIF <statement> using
the equal (=) operator, while the option -spaces
was given, cannot work, because the equal operator will be
evaluated as part of the expression.
For example, foo IIF 1+1 = 42 works, but foo IIF 1 + 1 = 42,
when the option -spaces was specified, won’t, as
= 42 is evaluated as part of the expression.
incbin <filename>[,<offset>[,<length>]]
Inserts the binary contents of <filename> into the object code at this position. When <offset> is specified, then the given number of bytes will be skipped at the beginning of the file. The optional <length> argument specifies the maximum number of bytes to be read from that file.
incdir <path>
Add another path to search for include files to the list of known paths. Paths defined with -I on the command line are searched first.
include <filename>
Include source text of <filename> at this position. When the file name specified has no absolute path, it is searched for in all defined paths in the order of occurrence, starting with the current work directory.
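The three file-related directives might be combined as in the following sketch. The directory and file names are hypothetical:

```asm
; Sketch: directory and file names are hypothetical.
        incdir  "include"               ; add another search path
        include "hardware.i"            ; current work directory is searched first
logo:   incbin  "logo.raw",16,4096      ; skip 16 bytes, read at most 4096 bytes
```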
inline
Local labels in the following block are isolated from previous
local labels and those after einline.
list
The following lines will appear in the listing file, if it was requested.
llen <len>
Set the line length in a listing file to a maximum of <len> characters. Currently without any effect.
local
Separates two blocks of local labels. This means that local label names from above this directive may be reused.
macro <name>
Defines a macro which can be referenced by <name>. For compatibility,
the <name> may alternatively appear on the left side of
the macro directive, starting on the first column.
Then the operand field is ignored. The macro definition is terminated
by an endm directive. When calling a macro you may pass
up to 9 arguments, separated by comma. These arguments are
referenced within the macro context as \1 to \9.
Parameter \0 is set to the macro’s first qualifier
(mnemonic extension), when given.
In Devpac- and PhxAss-compatibility mode, or with option
-allmp, up to 35 arguments are accepted,
where arguments 10-35 can be referenced by \a to \z.
In case you have a macro argument which contains commas or spaces you
may enclose it between < and > characters. A >
character may still be included by writing >>, or when
embedded within a string. (Note that strings are ignored in Devpac
compatibility mode.)
Special macro parameters:
\@
Insert a unique id, useful for defining labels. Every macro call gets its own unique id.
\@!
Push the current unique id onto a global id stack, then insert it.
\@?
Push the current unique id below the top element of the global id stack, then insert it.
\@@
Pull the top element from the global id stack and insert it. The macro’s current unique id is not affected by this operation.
\#
Insert the number of arguments that have been passed to this macro.
Equivalent to the contents of the symbol NARG.
\?n
Insert the length of the n’th macro argument.
\.
Insert the argument which is selected by the current value of the
CARG symbol (first argument, when CARG is 1).
\+
Works like \., but increments the value of CARG after
that.
\-
Works like \., but decrements the value of CARG after
that.
\<symbolname>
Inserts the current decimal value of the absolute
symbol symbolname.
\<$symbolname>
Inserts the current hexadecimal value of the absolute
symbol symbolname, without leading $.
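A small sketch of a macro that uses a numbered argument and the unique id \@ to build a label that differs on every call. The macro name, label name and register choice are made up for this example:

```asm
; Sketch: macro name, label name and register choice are hypothetical.
delay   macro                   ; macro name on the left side, first column
        move.w  #\1,d0          ; \1 is the first argument
wait\@  subq.w  #1,d0           ; \@ gives every macro call its own label
        bne.s   wait\@
        endm

        delay   100
        delay   200             ; no label clash with the first call
```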
mexit
Leave the current macro and continue with assembling the parent
context. Note that this directive also resets the level of conditional
assembly to a state before the macro was invoked; which means that
it also works as a ’break’ command on all new if directives.
msource on/off
Enable or disable source level debugging within a macro context.
It can be used before one or more macro definitions.
When off, the debugger will show the invoking source text line
instead. Defaults to on. Also numeric expressions like
0 or 1 are allowed.
Note that this directive currently only has a meaning when using
the -linedebug option with the hunk-format output module
(-Fhunk).
nolist
The following lines will not be visible in a listing file.
nopage
Never start a new page in the listing file. This implementation will only prevent emitting the formfeed code.
nref <symbol>[,<symbol>...]
Flag <symbol> as externally defined, similar to xref,
but also indicate that references can be optimized to base-relative
addressing modes, when possible. This directive is only present
in PhxAss-compatibility mode.
odd
Aligns to an odd address. Equivalent to cnop 1,2.
Bugs: Note that this is not a real odd directive, as it
wastes two bytes when the address is already odd.
offset [<expression>]
Switches to a special offset-section, similar to a section
directive, although its contents are not included in the output.
Its labels may be referenced as absolute offset symbols.
Can be used to define structure offsets.
The optional <expression> gives the start offset for this section.
When missing, the last offset of the previous offset-section is used,
or 0.
<expression> must evaluate as a constant!
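As a sketch, an offset-section can declare a structure layout whose labels are then used as displacements. The structure and its field names are hypothetical:

```asm
; Sketch: a hypothetical sprite structure declared in an offset-section.
        offset  0
spr_x:  ds.w    1               ; spr_x = 0
spr_y:  ds.w    1               ; spr_y = 2
spr_img: ds.l   1               ; spr_img = 4
spr_SIZE:                       ; spr_SIZE = 8
        section code,code       ; leave the offset-section again
        move.w  spr_y(a0),d0    ; a0 is assumed to point to a sprite structure
```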
org <expression>
Sets the base address for the subsequent code. Note that it is allowed
to embed such an absolute ORG block into a section. Return to
relocatable mode with any new section directive.
In Devpac-compatibility mode, however, the previous section will
stay absolute.
output <name>
Sets the output file name to <name> when no output name was
given on the command line. A special case for Devpac-compatibility
is when <name> starts with a '.' and an output name was
already given. Then the current output name gets <name>
appended as an extension. When an extension already exists,
then it is replaced.
page
Start a new page in the listing file (not implemented). Make sure to start a new page when the maximum page length is reached.
plen <len>
Set the page length for a listing file to <len> lines.
Currently ignored.
printt <string>[,<string>...]
Prints <string> to stdout. Every additional string is printed on a new line.
Quotes are optional.
printv <expression>[,<expression>...]
Evaluate <expression> and print it to stdout in hexadecimal,
decimal, ASCII and binary format.
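A brief sketch of printt and printv used to report the size of a code fragment at assembly time. The label names are hypothetical, and the source contains real code, since these directives are listed among the known problems for sources without any:

```asm
; Sketch: label names are hypothetical.
start:  moveq   #0,d0
        rts
finish:
        printt  "code size in bytes:"
        printv  finish-start    ; printed in several number formats
```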
public <symbol>[,<symbol>...]
Flag <symbol> as an external symbol, which means that
<symbol> is visible to all modules in the linking process.
It may be either defined or undefined.
popsection
Restore the top section from the internal section-stack and
activate it. Also refer to pushsection.
pushsection
Pushes the current section onto an internal stack, where it may be
restored from by the popsection directive.
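The section stack is handy for placing data next to the code that uses it without losing track of the active section. A sketch with hypothetical section and label names:

```asm
; Sketch: section and label names are hypothetical.
        section code,code
hello:  lea     msg,a0
        pushsection             ; remember the code section
        section strings,data
msg:    dc.b    "hello",0
        popsection              ; the code section is active again
        rts
```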
rem
The assembler will ignore everything from encountering the rem
directive until an erem directive is found.
rept <expression>
Repeats the assembly of the block between rept and endr
<expression> number of times. <expression> should be
positive. Negative values are regarded like 0.
The internal symbol REPTN always holds the iteration counter
of the inner repeat loop, starting with 0. REPTN is -1 outside
of any repeat block.
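A short sketch of a repeat block that uses the iteration counter REPTN to generate a small table:

```asm
; Sketch: builds a four-entry byte table from the iteration counter.
        rept    4
        dc.b    REPTN*2         ; emits 0, 2, 4 and 6
        endr
```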
rorg <expression>[,<fill>]
Sets the program counter to an offset relative to the start
of the current section, as defined by <expression>.
The new program counter (section offset) must not be smaller than the
current one. Any space will be padded by the optional <fill>
value, or zero.
<label> rs.<size> <expression>
Works like the so directive, with the only difference that
the offset symbol is named __RS.
rseven
Align the structure offset counter (__RS) to an even count.
rsreset
Equivalent to clrso, but the symbol manipulated is __RS.
rsset <expression>
Sets the structure offset counter (__RS) to <expression>.
See rs directive.
section <name>[,<sec_type>][,<mem_type>]
Starts a new section named <name> or reactivates an old one.
<sec_type> defines the section type and may be code,
text (same as code), data or bss.
If the selected output format does not support section names (like
"aout", "tos" or "xfile"), then a missing <sec_type>
argument makes vasm interpret the first argument, <name>,
as section type instead. Otherwise a missing <sec_type>
defaults to a code section with the given name.
The optional <mem_type> has currently only a meaning for
the hunk-format output module and defines a 32-bit
memory attribute which specifies where to load the section.
<mem_type> is either a numerical constant or one of the
keywords chip (for Chip-RAM) or fast (for Fast-RAM).
Optionally it is also possible to attach the suffix _C, _F
or _P to the <sec_type> argument for defining the memory
type.
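A sketch of two sections as they might be declared for the hunk-format output module, one of them with a memory attribute. The section and label names are hypothetical:

```asm
; Sketch for the hunk-format output module; names are hypothetical.
        section main,code
start:  lea     screen,a0
        rts

        section planes,bss,chip ; uninitialized data, to be loaded into Chip-RAM
screen: ds.b    40*256
```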
<symbol> set <expression>
Create a new symbol with the name <symbol> and assign the value of <expression>. If <symbol> is already assigned, it will contain a new value from now on.
setfo <expression>
Sets the stack-frame offset counter (__FO) to <expression>.
See fo directive.
setso <expression>
Sets the structure offset counter (__SO) to <expression>.
See so directive.
showoffset [<text>]
Print current section offset (or absolute address) to the console,
preceded by the optional <text> (may use quotes).
PhxAss compatibility. Do not use in new code.
<label> so.<size> <expression>
Assigns the current value of the structure offset counter to
<label>.
Afterwards the counter is incremented by the instruction’s
<size> multiplied by <expression>.
Any valid M68k size extension is allowed for <size>:
b, w, l, q, s, d, x,
p.
The offset counter can also be referenced directly under the name
__SO.
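The so directive can be sketched with a hypothetical list node. The offsets in the comments follow from the description above and assume that natural alignment of structure offsets is not enabled:

```asm
; Sketch: a hypothetical list node layout. The offsets assume that
; natural alignment of structure offsets is not enabled.
        clrso
node_next   so.l    1           ; 0
node_type   so.b    1           ; 4
node_pri    so.b    1           ; 5
node_name   so.l    1           ; 6
node_SIZE   so.b    0           ; 10, the counter is not advanced

        move.l  node_name(a0),a1 ; a0 is assumed to point to a node
```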
spc <lines>
Output <lines> number of blank lines in the listing file.
Currently without any effect.
text
Equivalent to section code,code.
ttl <name>
PhxAss syntax. Equivalent to idnt <name>.
<name> ttl
Motorola syntax. Equivalent to idnt <name>.
weak <symbol>[,<symbol>...]
Flag <symbol> as a weak symbol, which means that <symbol>
is visible to all modules in the linking process, but may be replaced
by any global symbol with the same name.
When a weak symbol remains undefined its value defaults to 0.
xdef <symbol>[,<symbol>...]
Flag <symbol> as a global symbol, which means that
<symbol> is visible to all modules in the linking process.
See also public.
xref <symbol>[,<symbol>...]
Flag <symbol> as externally defined, which means it has to
be imported from another module into the linking process.
See also public.
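A minimal sketch of exporting and importing symbols between modules. Both symbol names are hypothetical:

```asm
; Sketch: symbol names are hypothetical.
        xdef    _entry          ; visible to other modules
        xref    _helper         ; has to be provided by another module

_entry: jsr     _helper
        rts
```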
Some known problems of this module at the moment:
odd directive wastes two bytes, when address is already odd.
echo, printt and printv do not work when the
source text doesn’t contain any real code.
This module has the following error messages:
This chapter describes the madmac syntax module, which is compatible with the MadMac assembler syntax, written by Landon Dyer for Atari and improved later to support Jaguar and JRISC. It is mainly intended for Atari’s 6502, 68000 and Jaguar systems.
This module was written in 2015-2021 by Frank Wille and is covered by the vasm copyright without modifications.
A statement may contain up to four fields which are identified by order of appearance and terminating characters. The general form is:
label: operator operand(s) ; comment
Labels need not start at the first column, as they are identified by the
mandatory terminating colon (:) character. A double colon (::)
automatically makes the label externally visible.
Labels preceded by ’.’ have local scope and are only valid between
two global labels.
Equate directives, starting in the operator field, have a symbol without
terminating colon in the first field, left of the operator.
The equals-character (=) can be used as an alias for equ.
A double-equals (==) automatically makes the symbol externally
visible.
symbol equate expression ; comment
Identifiers, like symbols or labels, may start with any upper- or lower-case
character, a dot (.), question-mark (?) or underscore
(_). The remaining characters may be any alphanumeric character,
a dollar-sign ($), question-mark (?) or underscore (_).
The operands are separated from the operator by whitespace. Multiple
operands are separated by comma (,).
Comments are introduced by the comment character ;. The asterisk
(*) can be used at the first column to start a comment.
The rest of the line will be ignored.
In expressions, numbers starting with $ are hexadecimal (e.g.
$fb2c). % introduces binary numbers (e.g. %1100101).
Numbers starting with @ are assumed to be octal numbers, e.g.
@237.
All other numbers starting with a digit are decimal, e.g. 1239.
NOTE: Unlike the original Madmac assembler all expressions are evaluated following the usual mathematical operator priorities.
C-like escape characters are supported in strings.
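The label, equate and number rules of this syntax module can be seen together in a short sketch. It assumes the M68k CPU module, and all names are made up for illustration:

```asm
; Sketch in madmac syntax for the M68k CPU module; names are hypothetical.
start::                         ; double colon: externally visible label
        move.w  #$fb2c,d0       ; hexadecimal constant
.loop:  subq.w  #1,d0           ; local label, the colon is mandatory
        bne.s   .loop
        rts
SIZE    =       32              ; alias for equ, no colon after the symbol
LIMIT   ==      @100            ; octal 100 (decimal 64), externally visible
```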
The following directives are supported by this syntax module (if the
CPU- and output-module allow it). Note that all directives, besides the
equals-character, may be optionally preceded by a dot (.).
<symbol> = <expression>
Equivalent to <symbol> equ <expression>.
<symbol> == <expression>
Equivalent to <symbol> equ <expression>, but declare <symbol>
as externally visible.
abs [<expression>]
Equivalent to offset for compatibility with older Madmac
versions. Note that abs is not available for the jagrisc
cpu backend as it conflicts with an instruction name.
assert <expression>[,<expression>...]
Assert that all conditions are true (non-zero), otherwise issue a warning.
bss
The following data (space definitions) are going into the BSS section. The BSS section cannot contain any initialized data.
data
The following data are going into the data section, which usually contains pre-initialized data and no executable code.
dc <exp1>[,<exp2>...]
Equivalent to dc.w.
dc.b <exp1>[,<exp2>,"<string1>",'<string2>'...]
Assign the integer or string constant operands into successive bytes of memory in the current section. Any combination of integer and character string constant operands is permitted.
dc.i <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words
of memory in the current section. In contrast to dc.l the
high and low half-words will be swapped as with the Jaguar-RISC
movei instruction.
dc.l <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in the current section.
dc.w <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the current section.
dcb
Equivalent to dcb.w.
dcb.b <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
dcb.l <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section.
dcb.w <exp>[,<fill>]
Insert <exp> zero or <fill> 16-bit words into the current section.
dphrase
Align the program counter to the next integral double phrase boundary (16 bytes).
ds <exp>
Equivalent to dcb.w <exp>,0.
ds.b <exp>
Equivalent to dcb.b <exp>,0.
ds.l <exp>
Equivalent to dcb.l <exp>,0.
ds.w <exp>
Equivalent to dcb.w <exp>,0.
else
Else-part of a conditional-assembly block. Refer to ’if’.
end
End the assembly of the current file. Parsing of an include file is terminated here and assembling of the parent source resumes. It also works to break the current conditional block, repetition or macro.
endif
Ends a block of conditional assembly.
endm
Ends a macro definition.
endr
Ends a repetition block.
<symbol> equ <expression>
Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.
even
Align the program counter to an even value, by inserting a zero-byte when it is odd.
exitm
Exit the current macro (proceed to endm) at this point and
continue assembling the parent context.
Note that this directive also resets the level of conditional
assembly to a state before the macro was invoked (which means that
it works as a ’break’ command on all new if directives).
extern <symbol>[,<symbol>...]
Declare the given symbols as externally defined. Internally there is
no difference to globl, as both declare the symbols, no
matter if defined or not, as externally visible.
globl <symbol>[,<symbol>...]
Declare the given symbols as externally visible in the object file
for the linker. Note that you can have the same effect by using
a double-colon (::) on labels or a double-equal (==)
on equate-symbols.
if <expression>
Start of block of conditional assembly. If <expression> is true, the
block between ’if’ and the matching ’endif’ or
’else’ will be assembled. When false, ignore all lines until
an ’else’ or ’endif’ directive is encountered.
It is possible to leave such a block early from within an include
file (with end) or a macro (with endm).
iif <expression>, <statement>
A single-line conditional assembly. The <statement> will be parsed when <expression> evaluates to true (non-zero). <statement> may be a normal source line, including labels, operators and operands.
incbin "<file>"
Inserts the binary contents of <file> into the object code at this position.
include "<file>"
Include source text of <file> at this position.
list
The following lines will appear in the listing file, if it was requested.
long
Align the program counter to the next integral longword boundary (4 bytes), by inserting as many zero-bytes as needed.
macro <name> [<argname>[,<argname>...]]
Defines a macro which can be referenced by <name> (case-sensitive).
The macro definition is terminated by an endm directive
and may be exited by exitm.
When calling a macro you may pass up to 64 arguments, separated by
comma. The first ten arguments are referenced within the macro
context as \1 to \9 and \0 for the tenth.
Optionally you can specify a list of argument names, which are
referenced with a leading backslash character (\) within the macro.
The special code \~ inserts a unique id, useful for
defining labels. \# is replaced by the number of arguments.
\! writes the size-qualifier (M68k) including the dot.
\?argname expands to 1 when the named argument is
specified and non-empty, otherwise it expands to 0.
It is also allowed to enclose argument names in curly braces, which
is useful in situations where the argument name is followed by
another valid identifier character.
macundef <name>[,<name>...]Undefine one or more already defined macros, making them unknown for the following source to assemble.
nlistThe following lines will not be visible in a listing file.
nolistThe following lines will not be visible in a listing file.
offset [<expression>]
    Switches to a special offset-section. The contents of such a section
    are not included in the output. Its labels may be referenced as
    absolute offset symbols. Can be used to define structure offsets.
    The optional <expression> gives the start offset for this
    section. Defaults to zero when omitted.
    <expression> must evaluate as a constant!
org <expression>
    Sets the base address for the subsequent code and switches into
    absolute mode. Such a block is terminated by any section directive
    or by .68000.
phrase
    Align the program counter to the next integral phrase boundary (8 bytes).
print <expression>[,<expression>...]
    Prints strings and formatted expressions to the assembler’s console. <expression> is either a string in quotes or an expression, which is optionally preceded by special format flags.
    Several flags can be used to format the output of expressions. The default is a 16-bit signed decimal.
    /x  hexadecimal
    /d  signed decimal
    /u  unsigned decimal
    /w  16-bit word
    /l  32-bit longword
    For example:
        .print "Value: ", /d/l xyz
qphrase
    Align the program counter to the next integral quad phrase boundary (32 bytes).
rept <expression>
    The block between rept and endr will be repeated
    <expression> times, which has to be positive.
<symbol> set <expression>
    Create a new symbol with the name <symbol> and assign the value of <expression>. If <symbol> is already assigned, it will contain a new value from now on.
text
    The following code and data is going into the text section, which usually is the first program section, containing the executable code.
Some known problems of this module at the moment:
Brackets ([]) are currently not supported to
prioritize terms, as an alternative for parentheses.
Functions (^^func) are currently not supported.
This module has the following error messages:
This chapter describes the oldstyle syntax module suitable
for some 8-bit CPUs (6502, 680x, 68HC1x, Z80, etc.),
which is available with the extension oldstyle.
This module is written in 2002-2023 by Frank Wille and is covered by the vasm copyright without modifications.
This syntax module provides the following additional options:
Automatically export all non-local symbols, making them visible to other modules during linking.
Allow the asterisk (*) for starting comments in the first
column. This disables the possibility to set the code origin with
*=addr in the first column.
Directives have to be preceded by a dot (.).
Ignore everything after a blank in the operand field and treat it as a comment. This option is only available when the backend does not separate its operands with blanks as well.
Allow dots (.) within all identifiers.
Disable C-style constant prefixes.
Disable intel-style constant suffixes.
Enables the additional section directives text, data and
bss, which switch to their respective section type. The original
text directive for creating string-constants and the data
directive for creating byte-constants are no longer available. But there
are still other directives for the same purpose.
Labels always start at the first column and may be terminated by a
colon (:), but don’t need to. In the latter case the mnemonic
has to be separated from the label by whitespace (which is not required
in every case, e.g. not before =).
Local labels are preceded by ’.’ or terminated by ’$’.
For the rest, any alphanumeric character including ’_’ is allowed.
Local labels are valid between two global label definitions.
It is allowed, but not recommended, to refer to any local symbol starting with
’.’ in the source, by preceding its name with the name of the last
previously defined global symbol: global_name.local_name.
Anonymous labels are supported by defining them with a single ’:’
at the beginning of a line. They may be referenced by ’:’ followed
directly by one or more ’+’ or ’-’ signs. A + selects
the first anonymous label following the point of reference. A ++
selects the second anonymous label in that direction, and so on. A -
selects the first anonymous label before the point of reference. Example:
: jmp :- ;infinite loop
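The selection rule for anonymous labels can be modelled in a few lines of Python. This is only a sketch of the rule as described above, not vasm's implementation: the function name resolve_anonymous and the representation of label positions as a sorted list of line numbers are assumptions made for illustration. It also assumes that several '-' signs step backwards the same way several '+' signs step forwards, and that a label defined on the referencing line itself counts as "before" the reference (as in the infinite-loop example).

```python
from bisect import bisect_right

def resolve_anonymous(label_lines, ref_line, ref):
    """Return the line of the anonymous label selected by a reference.

    label_lines: sorted line numbers on which a ':' label is defined.
    ref_line:    line number that contains the reference.
    ref:         reference text such as ':+', ':++' or ':-'.
    """
    signs = ref[1:]
    if (not ref.startswith(":") or not signs
            or len(set(signs)) != 1 or signs[0] not in "+-"):
        raise ValueError(f"not an anonymous label reference: {ref!r}")
    count = len(signs)
    # Number of labels defined on or before the referencing line.
    before = bisect_right(label_lines, ref_line)
    if signs[0] == "+":
        idx = before + count - 1   # n-th label after the reference
    else:
        idx = before - count       # n-th label at or before the reference
    if not 0 <= idx < len(label_lines):
        raise LookupError("no such anonymous label")
    return label_lines[idx]
```

With labels on lines 1, 5 and 9, a `:-` on line 5 resolves to line 5 itself, while `:+` on the same line resolves to line 9.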
The option -ldots allows dots (.) within labels and other
identifiers, but disables the above mentioned feature.
The operands are separated from the mnemonic by whitespace. Multiple
operands are separated by comma (,).
Make sure that you don’t define a label on the same line as a directive for conditional assembly (if, else, endif)! This is not supported.
Some CPU backends may support multiple statements (directives or
mnemonics) per line, separated by a special character (e.g. : for Z80).
Comments are introduced by the comment character (;), or the first
blank following the operand field when option -i was given.
The rest of the line will be ignored.
Example:
mylabel instr op1,op2 ;comment
In expressions, numbers starting with $ are hexadecimal (e.g.
$fb2c). For Z80 also & may be used as a hexadecimal prefix,
but make sure to avoid conflicts with the and-operator (either by using
parentheses or blanks).
% introduces binary numbers (e.g. %1100101).
Numbers starting with @ are assumed to be octal numbers, e.g.
@237 (except for Z80, where it means binary).
A special case is a digit followed by a #, which can be used to
define an arbitrary base between 2 and 9 (e.g. 4#3012).
Intel-style constant suffixes are supported: h for hexadecimal,
d for decimal, o or q for octal and b for
binary. Hexadecimal intel-style constants must start with a digit (prepend
0, when required).
Also C-style prefixes are supported for hexadecimal (0x) and
binary (0b).
All other numbers starting with a digit are decimal, e.g. 1239.
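The prefix and suffix rules above can be summarised as a small Python sketch. It is an illustration only: the function name parse_constant, the z80 flag and, in particular, the order in which ambiguous forms are tried (C-style prefixes before intel-style suffixes) are assumptions, not a description of how vasm resolves them.

```python
def parse_constant(text, z80=False):
    """Parse one numeric constant according to the rules described above."""
    s = text.strip()
    low = s.lower()
    if s.startswith("$"):                      # $fb2c
        return int(s[1:], 16)
    if z80 and s.startswith("&"):              # Z80 only: &ff
        return int(s[1:], 16)
    if s.startswith("%"):                      # %1100101
        return int(s[1:], 2)
    if s.startswith("@"):                      # octal, binary on Z80
        return int(s[1:], 2 if z80 else 8)
    if len(s) > 2 and s[0] in "23456789" and s[1] == "#":
        return int(s[2:], int(s[0]))           # 4#3012
    if low.startswith("0x"):                   # C-style hexadecimal
        return int(s[2:], 16)
    if low.startswith("0b") and len(s) > 2 and set(s[2:]) <= {"0", "1"}:
        return int(s[2:], 2)                   # C-style binary
    if s[:1].isdigit():
        # Intel-style suffixes; hexadecimal must start with a digit.
        suffix_bases = {"h": 16, "d": 10, "o": 8, "q": 8, "b": 2}
        base = suffix_bases.get(low[-1])
        if base is not None:
            return int(s[:-1], base)
        return int(s, 10)                      # plain decimal
    raise ValueError(f"not a numeric constant: {text!r}")
```

For instance, `parse_constant("4#3012")` evaluates the digits 3012 in base 4, and `parse_constant("0fdh")` is read as hexadecimal FD because of the h suffix.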
The following directives are supported by this syntax module (if the CPU- and output-module allow it):
<symbol> = <expression>
    Equivalent to <symbol> equ <expression>.
abyte <modifier>,<exp1>[,<exp2>,"<string1>"...]
    Write the integer or string constants into successive bytes
    of memory in the current section while modifying each expression
    (and string-character) by the modifier expression.
    When the modifier contains the special ._ symbol, then it
    is a placeholder for any expression from the line. Otherwise the
    modifier will be just added to each element.
    Any combination of integer and character string constants is permitted.
addr <exp1>[,<exp2>...]
    Assign the values of the operands into successive words of memory in the current section, using the target’s endianness and address size.
align <bitcount>
    Insert as many zero bytes as required to reach an address where
    <bitcount> low order bits are zero. For example align 2 would
    make an alignment to the next 32-bit boundary.
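The number of padding bytes follows directly from the bit count. The following Python sketch shows the arithmetic; the function name align_padding is a hypothetical helper, not part of vasm.

```python
def align_padding(address, bitcount):
    """Zero bytes needed so that the low `bitcount` address bits become zero."""
    boundary = 1 << bitcount          # align 2 -> 4-byte (32-bit) boundary
    return (-address) % boundary
```

At address 0x1001, `align 2` would therefore insert three zero bytes to reach 0x1004, while an already aligned address needs none.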
asc <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
ascii <exp1>[,<exp2>,"<string1>"...]
    See defm.
asciiz "<string1>"[,"<string2>"...]
    See string.
assert <expression>[,<message>]
    Display an error with the optional <message> when the expression is false.
binary <file>
    Inserts the binary contents of <file> into the object code at this position.
blk <exp>[,<fill>]
    Insert <exp> zero or <fill> bytes into the current section.
blkw <exp>[,<fill>]
    Insert <exp> zero or <fill> 16-bit words into the current section, using the endianness of the target CPU.
bss
    With option -sect:
    switches to a bss section with attributes "aurw".
bsz <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
byt <exp1>[,<exp2>,"<string1>"...]
    Assign the integer or string constant operands into successive bytes of memory in the current section. Any combination of integer and character string constant operands is permitted. Without any operands the program counter is just incremented by one.
byte <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byt <exp1>[,<exp2>,"<string1>"...].
data <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byt <exp1>[,<exp2>,"<string1>"...].
    (Not available with option -sect.)
data
    With option -sect:
    switches to a data section with attributes "adrw".
db <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byt <exp1>[,<exp2>,"<string1>"...].
dc <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
defb <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
defc <symbol> = <expression>
    Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.
defl <exp1>[,<exp2>...]
    Assign the values of the operands into successive 32-bit integers of memory in the current section, using the endianness of the target CPU.
defp <exp1>[,<exp2>...]
    Assign the values of the operands into successive 24-bit integers of memory in the current section, using the endianness of the target CPU.
defm "string"
    Equivalent to text "string".
defw <exp1>[,<exp2>...]
    Equivalent to word <exp1>[,<exp2>...].
dfb <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
dfw <exp1>[,<exp2>...]
    Equivalent to word <exp1>[,<exp2>...].
defs <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
dend
    Ends an offset-section started by dsect and restores the
    previously active section.
dephase
    Equivalent to rend.
ds <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
dsb <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
dsect
    Starts an ’offset-section’ (the original directive in ADE was called
    ’dummy-section’) which does not generate any code in the output file.
    Its only purpose is to define absolute labels. Within a dsect
    block you may use org directives to set a new offset, which
    defaults to zero for the first dsect otherwise. Following
    dsect sections continue with the last offset from the former.
    Such an offset-section block is closed by the dend directive,
    which restores the previous ’real’ section.
dsw <exp>[,<fill>]
    Equivalent to blkw <exp>[,<fill>].
dw <exp1>[,<exp2>...]
    Equivalent to word <exp1>[,<exp2>...].
end
    Assembly will terminate behind this line.
endif
    Ends a section of conditional assembly.
el
    Equivalent to else.
else
    Assemble the following lines when the previous if-condition
    was false.
ei
    Equivalent to endif. (Not available for Z80 CPU)
einline
    End a block of isolated local labels, started by inline.
endm
    Ends a macro definition.
endmac
    Ends a macro definition.
endmacro
    Ends a macro definition.
endr
    Ends a repetition block.
endrep
    Ends a repetition block.
endrepeat
    Ends a repetition block.
endstruct
    Ends a structure definition.
endstructure
    Ends a structure definition.
<symbol> eq <expression>
    Equivalent to <symbol> equ <expression>.
<symbol> equ <expression>
    Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.
extern <symbol>[,<symbol>...]
    See global.
even
    Aligns to an even address. Equivalent to align 1.
fail <message>
    Show an error message including the <message> string. Do not generate an output file.
fi
    Equivalent to endif.
fill <exp>
    Equivalent to blk <exp>,0.
fcb <exp1>[,<exp2>,"<string1>"...]
    Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
fcc "<string>"
    Equivalent to text.
fcs "<string>"
    Works like text and fcc, but additionally sets the
    most significant bit of the last byte. This can be used as a
    string terminator on some systems.
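As an illustration of the fcs encoding, the following Python sketch produces the bytes such a directive would emit for an ASCII string. The function name fcs_bytes is hypothetical, and the handling of an empty string (nothing is emitted) is an assumption.

```python
def fcs_bytes(string):
    """Encode a string with the most significant bit set on its last byte."""
    data = bytearray(string.encode("ascii"))
    if data:                 # assumption: an empty string emits nothing
        data[-1] |= 0x80     # mark the final character as terminator
    return bytes(data)
```

For "HELLO" the last byte changes from $4F ('O') to $CF, so a reader can detect the end of the string without a separate terminator byte.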
fdb <exp1>[,<exp2>,"<string1>"...]
    Equivalent to word <exp1>[,<exp2>...].
global <symbol>[,<symbol>...]
    Flag <symbol> as an external symbol, which means that <symbol> is visible to all modules in the linking process. It may be either defined or undefined.
if <expression>
    Conditionally assemble the following lines if <expression> is non-zero.
ifdef <symbol>
    Conditionally assemble the following lines if <symbol> is defined.
ifndef <symbol>
    Conditionally assemble the following lines if <symbol> is undefined.
ifd <symbol>
    Conditionally assemble the following lines if <symbol> is defined.
ifnd <symbol>
    Conditionally assemble the following lines if <symbol> is undefined.
ifeq <expression>
    Conditionally assemble the following lines if <expression> is zero.
ifne <expression>
    Conditionally assemble the following lines if <expression> is non-zero.
ifgt <expression>
    Conditionally assemble the following lines if <expression> is greater than zero.
ifge <expression>
    Conditionally assemble the following lines if <expression> is greater than zero or equal.
iflt <expression>
    Conditionally assemble the following lines if <expression> is less than zero.
ifle <expression>
    Conditionally assemble the following lines if <expression> is less than zero or equal.
ifused <symbol>
    Conditionally assemble the following lines if <symbol> has been
    previously referenced in an expression or in a parameter of an opcode.
    Issue a warning, when <symbol> is already defined.
    Note that ifused does not work, when the symbol has only been
    used in the following lines of the source.
incbin <file>[,<offset>[,<nbytes>]]
    Inserts the binary contents of <file> into the object code at this position. When <offset> is specified, then the given number of bytes will be skipped at the beginning of the file. The optional <nbytes> argument specifies the maximum number of bytes to be read from that file.
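The effect of the optional <offset> and <nbytes> arguments of incbin corresponds to a simple slice of the file contents. The Python sketch below works on bytes that have already been read; the function name incbin_bytes is a hypothetical helper used only for illustration.

```python
def incbin_bytes(data, offset=0, nbytes=None):
    """Select the part of a file's contents that incbin would insert."""
    chunk = data[offset:]          # skip <offset> bytes at the beginning
    if nbytes is not None:
        chunk = chunk[:nbytes]     # read at most <nbytes> bytes
    return chunk
```

Because <nbytes> is a maximum, asking for more bytes than remain in the file simply yields the rest of the file.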
incdir <path>
    Add another path to search for include files to the list of known paths. Paths defined with -I on the command line are searched first.
include <file>
    Include source text of <file> at this position.
inline
    Local labels in the following block are isolated from previous
    local labels and those after einline.
mac <name>
    Equivalent to macro <name>.
list
    The following lines will appear in the listing file, if it was requested.
local <symbol>[,<symbol>...]
    Flag <symbol> as a local symbol, which means that <symbol> is local for the current file and invisible to other modules in the linking process.
macro <name>[,<argname>...]
    Defines a macro which can be referenced by <name>. The <name>
    may also appear at the left side of the macro directive,
    starting on the first column. The macro definition is closed
    by an endm directive. When calling a macro you may pass
    up to 9 arguments, separated by comma. These arguments are
    referenced within the macro context as \1 to \9,
    or optionally by named arguments, which you have to specify in
    the operand.
    Argument \0 is set to the macro’s first qualifier
    (mnemonic extension), when given.
    The special argument \@ inserts an underscore followed by
    a six-digit unique id, useful for defining labels.
    \() may be used as a separator between the name of a macro
    argument and the subsequent text.
    \<symbolname> inserts the current decimal value of the absolute
    symbol symbolname.
mdat <file>
    Equivalent to incbin <file>.
needs <expression>
    Equivalent to symdepend <expression>.
nolist
    The following lines will not be visible in a listing file.
org [#]<expression>
    Sets the base address for the subsequent code. This is equivalent
    to *=<expression>. An optional # is supported for
    compatibility reasons.
phase <expression>
    Equivalent to rorg <expression>.
repeat <expression>
    Equivalent to rept <expression>.
rept <expression>
    Repeats the assembly of the block between rept and endr
    <expression> number of times. <expression> has to be positive.
reserve <exp>
    Equivalent to blk <exp>,0.
rend
    Ends a rorg block of label relocation. Following labels will
    be based on org again.
rmb <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>]. (Not available for 6502 CPU.)
roffs <expression>
    Sets the program counter <expression> bytes behind the start of the current section. The new program counter must not be smaller than the current one. The space will be padded with zeros.
rorg <expression>
    Relocate all labels between rorg and rend based on the
    new origin from <expression>.
section <name>[,"<attributes>"]
    Starts a new section named <name> or reactivates an old one. If attributes are given for an already existing section, they must match exactly. The section’s name will also be defined as a new symbol, which represents the section’s start address. The "<attributes>" string may consist of the following characters:
    Section Contents:
    c  section has code
    d  section has initialized data
    u  section has uninitialized data
    i  section has directives (info section)
    n  section can be discarded
    R  remove section at link time
    a  section is allocated in memory
    Section Protection:
    r  section is readable
    w  section is writable
    x  section is executable
    s  section is shareable
    When attributes are missing they are automatically set for the section
    names text, data, rodata, bss,
    .text, .data, .rodata and .bss.
    Otherwise they default to "acrwx".
<symbol> set <expression>
    Create a new symbol with the name <symbol> and assign the value of <expression>. If <symbol> is already assigned, it will contain a new value from now on.
spc <exp>
    Equivalent to blk <exp>,0.
str "<string1>"[,"<string2>"...]
    Like text, but adds a terminating carriage return (ASCII
    code 13).
string "<string1>"[,"<string2>"...]
    Like text, but adds a terminating zero-byte.
struct <name>
    Defines a structure which can be referenced by <name>. Labels within
    a structure definition can be used as field offsets. They will be
    defined as local labels of <name> and can be referenced through
    <name>.<label>. All directives are allowed, but instructions will
    be ignored when such a structure is used. Data definitions can be used as
    default values when the structure is used as initializer. The structure
    name, <name>, is defined as a global symbol with the structure’s size.
    A structure definition is ended by endstruct.
structure <name>
    Equivalent to struct <name>.
symdepend <expression>
    Declare the current section being dependent on an externally defined
    symbol from <expression>. In object file formats which
    support it, this will generate an external symbol reference without
    any actual relocation being performed (R_NONE in ELF).
text "<string>"
    Puts a single string constant into successive bytes of memory of the current section. The string delimiters may be any printable ASCII character. (Not available with option -sect.)
text
    With option -sect:
    switches to a code section with attributes "acrx".
weak <symbol>[,<symbol>...]
    Flag <symbol> as a weak symbol, which means that <symbol> is visible to all modules in the linking process and may be replaced by any global symbol with the same name. When a weak symbol remains undefined its value defaults to 0.
wor <exp1>[,<exp2>...]
    Assign the values of the operands into successive 16-bit words of memory in the current section, using the endianness of the target CPU. Without any operand just the program counter is incremented by two.
wrd <exp1>[,<exp2>...]
    Equivalent to wor <exp1>[,<exp2>...].
word <exp1>[,<exp2>...]
    Equivalent to wor <exp1>[,<exp2>...].
xdef <symbol>[,<symbol>...]
    See global.
xlib <symbol>[,<symbol>...]
    See global.
xref <symbol>[,<symbol>...]
    See global.
zmb <exp>[,<fill>]
    Equivalent to blk <exp>[,<fill>].
The oldstyle syntax is able to manage structures. Structures can be defined in two ways:
mylabel struct[ure]
<fields>
endstruct[ure]
or:
struct[ure] mylabel
<fields>
endstruct[ure]
Any directive is allowed to define the structure fields. Labels can be used to define offsets into the structure. The initialized data is used as default value, whenever no value is given for a field when the structure is referenced.
Some examples of structure declarations:
  struct point
x    db 4
y    db 5
z    db 6
  endstruct
This will create the following labels:
point.x ; 0   offsets
point.y ; 1
point.z ; 2
point   ; 3   size of the structure
The structure can be used by optionally redefining the field values:
point1 point
point2 point 1, 2, 3
point3 point ,,4
is equivalent to
point1
    db 4
    db 5
    db 6
point2
    db 1
    db 2
    db 3
point3
    db 4
    db 5
    db 4
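The way field offsets, the structure size and the default values interact can be modelled with a short Python sketch. It is restricted to one-byte db fields, as in the point example, and the names define_struct and instantiate are hypothetical helpers, not vasm features.

```python
def define_struct(fields):
    """fields: list of (label, default_value) pairs, one byte per field.

    Returns the field offsets, the structure size and the default bytes.
    """
    offsets = {label: index for index, (label, _) in enumerate(fields)}
    defaults = [default for _, default in fields]
    return offsets, len(fields), defaults

def instantiate(defaults, operands=""):
    """Expand an operand list such as '1, 2, 3' or ',,4' into field bytes.

    An omitted operand keeps the default from the structure definition.
    """
    given = [op.strip() for op in operands.split(",")] if operands.strip() else []
    if len(given) > len(defaults):
        raise ValueError("more values than structure fields")
    values = list(defaults)
    for index, op in enumerate(given):
        if op:
            values[index] = int(op)
    return bytes(values)

offsets, size, defaults = define_struct([("x", 4), ("y", 5), ("z", 6)])
```

Here `instantiate(defaults, ",,4")` keeps the defaults 4 and 5 for x and y and replaces only z, matching point3 in the listing above.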
Some known problems of this module at the moment:
Addresses assigned to org or to the current pc symbol '*'
(on the Z80 the pc symbol is '$') must be constant.
Expressions in an if directive must be constant.
This module has the following error messages:
This chapter describes the test output module which can be selected with the -Ftest option.
This module is written in 2002 by Volker Barthelmann and is covered by the vasm copyright without modifications.
This output module provides no additional options.
This output module outputs a textual description of the contents of all sections. It is mainly intended for debugging.
None.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the ELF output module which can be selected with the -Felf option.
This module is written in 2002-2016 by Frank Wille and is covered by the vasm copyright without modifications.
Do not delete empty sections without any symbol definition.
This output module writes ELF (Executable and Linkable Format),
a portable object file format which works for a variety
of 32- and 64-bit operating systems.
The ELF output format, as implemented in vasm, currently supports
the following architectures:
The supported relocation types depend on the selected architecture.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the a.out output module which can be selected with the -Faout option.
This module is written in 2008-2016,2020,2021 by Frank Wille and is covered by the vasm copyright without modifications.
Sets the MID field of the a.out header to the specified value. The MID defaults to 2 (Sun020 big-endian) for M68k and to 100 (PC386 little-endian) for x86.
This output module emits the a.out (assembler output)
format, which is an older 32-bit format for Unix-like operating systems,
originally invented by AT&T.
The a.out output format, as implemented in vasm, currently supports
the following architectures:
The following standard relocations are supported by default:
Standard relocation table entries occupy 8 bytes and don’t include an addend, so they are not suitable for most RISC CPUs. The extended relocations format occupies 12 bytes and also allows more relocation types.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the TOS output module, which can be selected with option -Ftos to generate Atari TOS executable files, or with option -Fdri to generate DRI-format object files.
This module is written in 2009-2016,2020,2021,2023 by Frank Wille and is covered by the vasm copyright without modifications.
Use the SozobonX extension, which allows symbol names with unlimited length in DRI objects and executables. Overrides the HiSoft extension.
Do not write HiSoft extended symbol names. Cut at 8 characters.
These options are valid for the tos module only:
Write Devpac "MonST"-compatible symbols.
Sets the flags field in the TOS file header. Defaults to 0. Overrides a TOS flags definition in the assembler source.
The TOS executable file format is used on Atari 16/32-bit computers with 68000 up to 68060 CPU running any TOS, MiNT or any compatible operating system. The symbol table is in DRI format and may use HiSoft (default) or SozobonX extended symbol names.
The object file format defined by Digital Research for Atari M68k systems.
For tos all symbols must be defined, otherwise the generation
of the executable fails. Unknown symbols are listed by vasm.
The only relocations supported by tos are 32-bit absolute. For
dri all 16- and 32-bit absolute and PC-relative relocations
are supported. 16-bit base-relative appears as a 16-bit absolute symbol
reference.
The tos format increases the maximum symbol name length to 22 by using
an extension created by HiSoft, unless forbidden by -stdsymbols.
With -szbx you may enable the SozobonX extension for
unlimited length, but you need a linker which supports that format.
Object files in dri format are limited to a
maximum of 8192 symbols.
All these restrictions are defined by the file format itself.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the gst output module which can be selected with the -Fgst option.
This module is written in 2023 by Frank Wille and is covered by the vasm copyright without modifications.
None.
This module outputs the GST object file format by GST Software, which was used by several development tools on the Atari M68k computers, for example by the GST assembler and Devpac.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the AmigaOS hunk-format output module which can be selected with the -Fhunk option to generate objects and with the -Fhunkexe option to generate executable files.
This module is written in 2002-2022 by Frank Wille and is covered by the vasm copyright without modifications.
Sets a two-byte code used for aligning a code hunk to the next 32-bit border. Defaults to 0x4e71 for M68k code sections, to allow linking of functions which extend over two object files. Otherwise it defaults to zero.
Do not delete empty sections without any symbol definition.
Use only those hunk types and external reference types which were valid at the time of Kickstart 1.x, for compatibility with old assembler sources and old linkers. For example: no longer differentiate between absolute and relative references. In executables it will prevent the assembler from using 16-bit relocation offsets in hunks and reject 32-bit PC-relative relocations.
Automatically generate an SAS/C-compatible LINE DEBUG hunk for the input source. Overrides any line debugging directives from the source text.
These options are valid for the hunkexe module only:
Try to shorten sections in the output file by removing zero words without relocation from the end. This technique is only supported by AmigaOS 2.0 and higher.
This output module outputs the hunk object (standard for M68k
and extended for PowerPC) and hunkexe executable format, which
is a proprietary file format used by AmigaOS and WarpOS.
The hunkexe module will generate directly executable files, without
the need for another linker run. But you have to make sure that there are
no undefined symbols, common symbols, or unusual relocations (e.g. small
data) left.
It is allowed to define sections with the same name but different attributes. They will be regarded as different entities.
The hunk/hunkexe output format is only intended for M68k
and PowerPC cpu modules and will abort when used otherwise.
The hunk module supports the following relocation types:
The hunkexe module supports absolute 32-bit relocations only.
Some known problems of this module at the moment:
The hunkexe module won’t process common symbols and allocate
them in a BSS section. Use a real linker for that.
This module has the following error messages:
This chapter describes the Xfile output module which can be selected with the -Fxfile option.
This module is written in 2018,2020,2021 by Frank Wille and is covered by the vasm copyright without modifications.
None.
This module outputs the Xfile executable file format, which is used on Sharp X68000 16/32-bit computers with 68000 up to 68040 CPUs.
.rdata or .stack sections
require a linker.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the O65 binary relocation format V1.3 for the 6502 family, as defined by Andre Fachat on 6502.org. Option -Fo65 outputs object files suitable for another linker pass, while -Fo65exe outputs executable files for an O65 loader. The only difference is a flag which declares the file to be an object, and vasm makes sure that the load addresses of sections in an executable are consecutive and do not overlap.
This module is written in 2021 by Frank Wille and is covered by the vasm copyright without modifications.
Sets a start address for the bss section.
Sets a start address for the data section.
Enable informational header options, generated by the assembler: file name, assembler name and version, creation date.
Write author’s name to the header options.
Write file name to the header options. Overrides the real file name, which would be set by -fopts.
Make the output file use paged alignment and simplified paged relocations.
Set minimum alignment for all sections as number of least significant
bits which have to be zero. align may be 0, 1, 2, 8.
The default behaviour is to use the maximum alignment given by the
input sections.
Store required stack size in the header.
Sets a start address for the text section.
Sets a start address for the zero (zero/direct page) section.
These options are valid for the o65exe module only:
Set a flag in the header which requests automatic clearing of the
bss section.
This output module outputs the o65 object file and o65exe
executable file format for 6502-family processors and the 65816.
The processor type is determined by the selected CPU of the active backend
and stored in the header.
The o65exe module generates executable files for an o65-loader,
which is present in some 6502 operating systems (e.g. Lunix, SMOS, OS/A65).
Unresolved symbols are allowed in o65 object- and executable-files.
In the latter case the o65-loader is responsible for resolving them.
Common symbols, weak symbols and most relocation types, except absolute
addresses, are not supported by o65.
The o65 format recognizes four different sections by their attributes or name:
acrx).
adrw).
aurw).
"bss" anywhere in
their name (aurw).
Up to two absolute sections (ORG directive) can be stored in
the text and data slots, in the order of occurrence.
Currently the o65/o65exe output module is only intended to
work with the 6502 cpu module and will abort when used otherwise.
It supports all relocation types defined by o65, which are:
Common or weak symbols are not supported.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the vobj output module, which can be selected with the -Fvobj option.
This module is written in 2002-2014 by Volker Barthelmann and is covered by the vasm copyright without modifications.
None.
This output module outputs the vobj object format, a simple
portable proprietary object file format of vasm.
As this format is not yet fixed, it is not described here.
None.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the simple binary output module which can be selected with the -Fbin option.
This module is written in 2002-2023 by Volker Barthelmann and Frank Wille and is covered by the vasm copyright without modifications.
Writes an Apple DOS 3.3 binary file header preceding the output file, which consists of a 16-bit start/load address and a 16-bit file length in little-endian order.
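The layout of this header is small enough to show as a Python sketch. It only follows the description above (16-bit load address, then 16-bit length, both little-endian); the function name apple_dos33_binary is a hypothetical helper and does not come from vasm.

```python
import struct

def apple_dos33_binary(load_address, code):
    """Prefix code with a 16-bit load address and 16-bit length (little-endian)."""
    return struct.pack("<HH", load_address, len(code)) + code
```

A single RTS byte ($60) loaded at $0803 would thus be written as the five bytes 03 08 01 00 60.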
Writes an Atari DOS COM header preceding the output file. It has
a standard header (0xFFFF), which is followed by any
number of sections. Each section starts with two little-endian
words defining the address of the first and last byte in memory.
Writes a Commodore PRG header preceding the output file, which consists of two bytes in little-endian order, defining the load address of the program.
Writes a Tandy Color Computer machine language file, which has a header with load address and length for each section and is terminated by a trailer with the execution address.
Writes a Dragon DOS header preceding the output file, where the
file type is set to $02 for binary. The load address is
taken from the first section’s start address. This will also be
the execute-address, when not specified otherwise. Refer to
option -exec.
Use the given symbol <symbol> as entry point of the program,
for those output format headers which support it. Otherwise this
option will be silently ignored.
Omitting this option will usually define the execution address
to be the same as the load address.
Writes a simple, single-segment format for the 65816-based Foenix computers. The header defines the program’s load address, which is also the start address.
Writes a multi-segment format for the 65816-based Foenix computers. The format is derived from the binary format used by Western Design Center’s C compiler. Every segment is stored with load address and size, and there is also a start address defined.
Writes a machine code file header for Oric-1, Oric-Atmos and compatible systems. It includes the file type and name, as well as the first and last address of the program to load. Note that the name defaults to the output file name, limited to 15 characters. A ".tap" extension will be removed automatically.
Same as -oric-mc, but sets the auto-execute flag in the header.
Set the start address for the default section, when no
section or org directive was given.
This output module outputs the contents of all sections as simple binary data, by default without any header or additional information. When there are multiple sections, they must not overlap. Gaps between sections are filled with zero bytes, when not using a special header format, like Atari COM. Undefined symbols are not allowed.
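As a rough illustration of the simpler header layouts described above, the following Python sketch builds a Commodore PRG, an Apple DOS 3.3 and an Atari COM image by hand. The function names are hypothetical and not part of vasm. Only the byte layouts are taken from the option descriptions, and the Atari COM helper assumes that each section's contents directly follow its two address words.

```python
import struct

def prg_file(load_addr: int, code: bytes) -> bytes:
    """Commodore PRG: two-byte little-endian load address, then the code."""
    return struct.pack("<H", load_addr) + code

def apple_dos33_file(load_addr: int, code: bytes) -> bytes:
    """Apple DOS 3.3: 16-bit load address and 16-bit length, little-endian."""
    return struct.pack("<HH", load_addr, len(code)) + code

def atari_com_file(sections: list[tuple[int, bytes]]) -> bytes:
    """Atari DOS COM: 0xFFFF marker, then first/last address per section."""
    out = bytearray(b"\xff\xff")
    for start, data in sections:
        out += struct.pack("<HH", start, start + len(data) - 1)
        out += data  # assumption: section contents follow the address words
    return bytes(out)

code = bytes([0xA9, 0x00, 0x60])  # sample 6502 payload: LDA #0 / RTS
print(prg_file(0x0801, code).hex())
print(apple_dos33_file(0x0300, code).hex())
print(atari_com_file([(0x2000, code)]).hex())
```

Comparing such hand-built images with the assembler's output can be a quick way to check that the intended header option was actually selected.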
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the Motorola srecord output module which can be selected with the -Fsrec option.
This module is written in 2015 by Joseph Zatarski and is covered by the vasm copyright without modifications.
Enforce Carriage-Return and Line-Feed ("\r\n") line
endings. Default is to use the host’s line endings.
Use the given symbol <symbol> as entry point of the program.
This start address will be written into the trailer record,
which is otherwise zero.
When the symbol assignment is omitted, then the default symbol
start will be used.
Writes S1 data records and S9 trailers with 16-bit addresses.
Writes S2 data records and S8 trailers with 24-bit addresses.
Writes S3 data records and S7 trailers with 32-bit addresses. This is the default setting.
This output module outputs the contents of all sections in Motorola srecord
format, which is a simple ASCII output of hexadecimal digits. Each record
starts with ’S’ and a one-digit ID. It is followed by the data
and terminated by a checksum and a newline character.
Every section starts with a new header record.
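The record structure described above can be sketched in a few lines of Python. This is a minimal illustration based on the general S-record convention (byte count, address, data, ones' complement checksum), not code taken from vasm. The function name `srecord` is hypothetical.

```python
def srecord(rec_type: int, address: int, data: bytes = b"") -> str:
    """Build one S-record line (without the line ending)."""
    # Address width in bytes for each record type:
    # S0/S1/S9 use 16 bits, S2/S8 use 24 bits, S3/S7 use 32 bits.
    addr_len = {0: 2, 1: 2, 2: 3, 3: 4, 7: 4, 8: 3, 9: 2}[rec_type]
    addr = address.to_bytes(addr_len, "big")
    # The count byte covers address, data and the checksum byte itself.
    body = bytes([addr_len + len(data) + 1]) + addr + data
    # Checksum: ones' complement of the low byte of the sum of all body bytes.
    checksum = ~sum(body) & 0xFF
    return f"S{rec_type}{body.hex().upper()}{checksum:02X}"

print(srecord(1, 0x1000, bytes([0x4E, 0x75])))  # S1 data record
print(srecord(9, 0x0000))                        # S9 trailer, entry point 0
```

The same helper covers the S2/S8 and S3/S7 variants, because only the address width differs between them.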
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the Intel Hex output module which can be selected with the -Fihex option.
This module is written in 2020 by Rida Dzhaafar and is covered by the vasm copyright without modifications.
Enforce Carriage-Return and Line-Feed ("\r\n") line
endings. Default is to use the host’s line endings.
Selects a format supporting 16-bit address space (default).
Selects a format supporting 20-bit address space.
Selects a format supporting 32-bit address space.
Sets the number of bytes per record to n.
Defaults to 32 bytes.
This output module outputs the contents of all sections in Intel hex format, which is a simple ASCII output of hexadecimal digits.
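For readers unfamiliar with the format, here is a small Python sketch of how Intel hex data records are commonly formed. It covers only the 16-bit address variant and follows the general Intel hex convention (length, address, type, data, two's complement checksum). The helper names and the idea of splitting an image into fixed-size records are illustrative assumptions, not a description of vasm's implementation.

```python
def ihex_record(address: int, data: bytes, rec_type: int = 0) -> str:
    """Build one Intel hex record for a 16-bit address."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rec_type]) + data
    # Checksum: two's complement of the low byte of the sum of all body bytes.
    checksum = -sum(body) & 0xFF
    return f":{body.hex().upper()}{checksum:02X}"

def ihex_lines(start: int, image: bytes, per_record: int = 32) -> list[str]:
    """Split an image into data records and append the end-of-file record."""
    lines = [ihex_record(start + off, image[off:off + per_record])
             for off in range(0, len(image), per_record)]
    lines.append(ihex_record(0, b"", rec_type=1))  # end-of-file record
    return lines

for line in ihex_lines(0x0100, bytes(range(40))):
    print(line)
```

With 40 bytes and 32 bytes per record this yields two data records followed by the end-of-file record.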
Some known problems of this module at the moment:
This module has the following error messages:
This chapter describes the C #define output module which can be selected with the -Fcdef option.
This module is written in 2020 by Volker Barthelmann and is covered by the vasm copyright without modifications.
There are currently no additional options for this output module.
This output module outputs the values of absolute symbols as a series of
#define directives that can be included in C source files. No code is generated.
Some known problems of this module at the moment:
This module has no error messages.
This chapter describes the wozmon output module which can be selected with the -Fwoz option.
This module is written in 2023 by anomie-p and is covered by the vasm copyright without modifications.
The author of this module may be contacted for bug reports:
There are no additional options for this module.
This output module outputs the contents of all sections as wozmon monitor commands, which is a simple ASCII output of hexadecimal digits.
The output is suitable for an ASCII transfer via serial connection to a system running wozmon. Character and/or line delays are likely to be necessary for a successful transfer.
The wozmon command parser converts up to sixteen-bit hexadecimal values. An error containing the maximum out-of-range address is reported if a sixteen-bit address space is exceeded.
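To give an idea of what such monitor commands look like, the following Python sketch formats a byte sequence using the usual wozmon store syntax ("address: byte byte ..."). The number of bytes per line, the function name and the wording of the error are assumptions made for illustration. The exact layout emitted by vasm is not specified here.

```python
def wozmon_commands(start: int, data: bytes, per_line: int = 8) -> list[str]:
    """Format data as wozmon store commands, one command per line."""
    end = start + len(data) - 1
    # wozmon only parses 16-bit addresses, so reject anything beyond $FFFF.
    if start > 0xFFFF or (data and end > 0xFFFF):
        raise ValueError(f"address out of range: {max(start, end):X}")
    return [
        f"{start + off:04X}: " + " ".join(f"{b:02X}" for b in data[off:off + per_line])
        for off in range(0, len(data), per_line)
    ]

for line in wozmon_commands(0x0300, bytes([0xA9, 0x00, 0x60])):
    print(line)
```

When sending such lines over a serial link, a delay after each line gives the monitor time to process the command, as noted above.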
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the Motorola M68k/CPU32/ColdFire microprocessor family.
This module is written in 2002-2023 by Frank Wille and is covered by the vasm copyright without modifications.
Note that the order on the command line may be important when specifying options. For example, if you specify -devpac compatibility mode after enabling some optimization options, the Devpac mode might disable these optimizations again.
This module provides the following additional options:
Generate code for the MC68000 CPU.
Generate code for the MC68008 CPU.
Generate code for the MC68010 CPU.
Generate code for the MC68020 CPU.
Generate code for the MC68030 CPU.
Generate code for the MC68040 CPU.
Generate code for the MC68060 CPU.
Generate code for the MC68020-68060 CPU. Be careful with
instructions like PFLUSHA, which exist on 68030 and 68040/060
with a different opcode (vasm will use the 040/060 version).
Generate code for the Apollo Core AC68080 CPU.
Generate code for the CPU32 family (MC6833x, MC6834x, etc.).
Generate code for a ColdFire family CPU. The following types are recognized: 5202, 5204, 5206, 520x, 5206e, 5207, 5208, 5210a, 5211a, 5212, 5213, 5214, 5216, 5224, 5225, 5232, 5233, 5234, 5235, 523x, 5249, 5250, 5253, 5270, 5271, 5272, 5274, 5275, 5280, 5281, 528x, 52221, 52553, 52230, 52231, 52232, 52233, 52234, 52235, 52252, 52254, 52255, 52256, 52258, 52259, 52274, 52277, 5307, 5327, 5328, 5329, 532x, 5372, 5373, 537x, 53011, 53012, 53013, 53014, 53015, 53016, 53017, 5301x, 5407, 5470, 5471, 5472, 5473, 5474, 5475, 547x, 5480, 5481, 5482, 5483, 5484, 5485, 548x, 54450, 54451, 54452, 54453, 5445x.
Generate code for the V2 ColdFire core. This option selects ISA_A (no hardware division or MAC), which is the most limited ISA supported by 5202, 5204 and 5206. All other ColdFire chips are backwards compatible to V2.
Generate code for the V3 ColdFire core. This option selects ISA_A+, hardware division, MAC and EMAC instructions, which are supported by nearly all V3 CPUs, except the 5307.
Generate code for the V4 ColdFire core. This option selects ISA_B and MAC as supported by the 5407.
Generate code for the V4e ColdFire core. This option selects ISA_B, USP-, FPU-, MAC- and EMAC-instructions (no hardware division) as supported by all 547x and 548x CPUs.
Generate code for the MC68851 MMU. May be used in combination with another -m option.
Generate code for the MC68881 FPU. May be used in combination with another -m option.
Generate code for the MC68882 FPU. May be used in combination with another -m option.
Ignore any FPU options or directives, which has the effect that no 68881/2 FPU instructions will be accepted. This option can override the default of -gas to enable the FPU.
Disable all optimizations. Can be seen as a main switch to ignore all other optimization options on the command line and in the source.
When specified the assembler will also try to optimize branch instructions which already have a valid size extension. This option is automatically enabled in -phxass mode.
Translate relative branch instructions, whose destination is in a different section, into absolute jump instructions.
Enables optimization from MOVE #0,<ea> into CLR <ea>
for the MC68000. Note that CLR will execute a read-modify-write
cycle on the 68000, so it is disabled by default. With 68010 and
higher this is a generic standard optimization.
Unsigned immediate divisors, which are a power of two (from 2 to 256),
are optimized to shifts. Divisions by 1 are replaced by TST.L Dn
(32-bit) or MVZ.W Dn,Dn (16-bit, ColdFire only). Divisions by
-1 are replaced by NEG.L Dn (32-bit) or by a combination of
NEG.W Dn and MVZ.W Dn,Dn (16-bit, ColdFire only).
This optimization will leave the flags in a different state than
would normally be expected after a division instruction.
Floating point constants are loaded with the lowest precision
possible. This means that FMOVE.D #1.0,FP0 would be
optimized to FMOVE.S #1.0,FP0, because it is faster and
shorter at the same precision. The optimization will be performed
on all FPU instructions with immediate addressing mode.
When an FDIV-family instruction (FSDIV, FDDIV,
FSGLDIV) is detected it will additionally be checked if the
immediate constant is a power of 2 and then converted into
FMUL #1/c,FPn.
JMP and JSR instructions to external labels will be
converted into BRA.L and BSR.L, when the selected
CPU is 68020 or higher (or CPU32).
Allows optimization of LSL #1 into ADD. It is also
needed to optimize ASL #2 and LSL #2 into two ADD
instructions (together with -opt-speed).
These optimizations may modify the V-flag, which might not be intended.
Enables optimization from MOVEM <ea>,Rn into
MOVE <ea>,Rn (or the other way around). May also optimize
MOVEM with two registers into two separate MOVE
instructions, when advantageous for the currently selected CPU.
This optimization will modify the flags when the destination is
not an address register.
Immediate multiplication factors, which are a power of two (from 2
to 256), are optimized to shifts. Multiplications with zero are
replaced by a MOVEQ #0,Dn, with -1 are replaced by a
NEG.L Dn and with 1 by EXT.L Dn or TST.L Dn
(long-form). Not all optimizations are available for all cpu types
(e.g. MULU.W can only be optimized on ColdFire by using
the MVZ.W instruction).
This optimization will leave the flags in a different state than
would normally be expected after a multiplication instruction, and
the size of the optimized code may be bigger than before in some
situations (e.g. MULS.W #4,Dn). The latter will additionally
require the -opt-speed flag.
Optimizes MOVE.L #x,Dn into a combination of MOVEQ
and NEG.W, which works for ranges from
$ff81<=x<=$ffff and $ffff0001<=x<=$ffff0080.
Note that this optimization flips the N-flag!
Enables optimization from MOVE #x,-(SP) into PEA x.
This optimization will leave the flags unmodified, which might
not be intended.
Optimize for speed, even if this would increase code size.
For example it enables optimization of ASL.W #2,Dn into two
ADD.W Dn,Dn instructions. Or MULS.W #-4,Dn into
EXT.L Dn + ASL.L #2,Dn + NEG.L Dn.
Optimize for size, even if this would make the code slower.
This enables for example optimization of MOVE.L #x,Dn
into MOVEQ #x>>n,Dn + LSL.W #n,Dn. It is mostly used
together with other optimization flags.
Enables optimization from MOVE.B #-1,<ea> into ST <ea>.
This optimization will leave the flags unmodified, which might
not be intended.
Small code model.
All JMP and JSR instructions to external labels
will be converted into 16-bit PC-relative jumps.
References to absolute symbols in a
small data section (named "__MERGED") are optimized into a
base-relative addressing mode using the current base register set
by an active NEAR directive.
This option is automatically enabled in -phxass mode.
Print all critical optimizations which have side effects. Among those are -opt-lsl, -opt-mul, -opt-st, -opt-pea, -opt-movem and -opt-clr.
Print all optimizations and translations vasm is doing
(same as opt ow+).
In its default setting (no -devpac or -phxass option) vasm performs the following optimizations:
(0,An) to (An), etc).
Brackets ('[' and ']') in an operand are automatically
converted into parentheses ('(' and ')') as long as
the CPU is 68000 or 68010. This is a compatibility option for some
old assemblers.
All options are initially set to be Devpac compatible, which means
that all optimizations are disabled, no debugging symbols will be
written and vasm will warn about any optimization being done.
When symbol output is enabled by opt d+, then the TOS symbol
table defaults to standard DRI format (limited to 8 characters).
Shift-right operations are performed using an unsigned 32-bit value.
Other options are the same as vasm’s defaults.
The symbol __G2 is defined, which contains information
about the selected cpu type.
The symbol __LK reflects the type of output file generated:
0 for TOS executables, 1 for DRI objects, 2 for GST objects,
3 for AmigaDOS objects and 4 for AmigaDOS executables.
All other formats are represented by 99, as they are unknown to Devpac.
It will also automatically enable -guess-ext and
-nodpc.
Register names are preceded by a ’%’ to prevent confusion with symbol names.
Recognize small data references in all 020+ extended addressing modes using 16-bit displacements on the base register. By default only the 16-bit address register displacement addressing mode can be used with small data (for compatibility reasons).
Enable additional GNU-as compatibility mnemonics, like
mov, movm and jra. Also accepts |
instead of ; for comments.
GNU-as compatibility mode selects the 68020 CPU and 68881/2 FPU
by default and enables -opt-jbra.
Accept illegal size extensions for an instruction, as long as the instruction is unsized or there is just a single size possible. This is the default setting in PhxAss and Devpac compatibility mode.
Prevents optimization of JMP/JSR to 32-bit PC-relative (BRA/BSR).
Do not attempt to encode absolute PC-displacements directly.
Example: 10(PC)
Do not check the size and type of expressions (OPT t-).
Example: dc.b 300
PhxAss-compatibility mode. The "current PC symbol" (e.g. * in
mot-syntax module) is set to the instruction’s address + 2 whenever
an instruction is parsed.
According to the current cpu setting the symbols __CPU,
__FPU and __MMU are defined.
JMP/JSR (label,PC) will never be optimized (into a branch,
for example).
It will also automatically enable -opt-allbra,
-sd and -guess-ext.
Values which are out of range usually produce an error. With this option the errors 2026, 2030, 2033 and 2037 will be displayed as a warning, allowing the user to create an object file.
Allow redefining register symbols with EQUR. This should
only be used for compatibility with old sources. Not many assemblers
support that.
Set the small data base register to An. <n> is valid
between 0 and 6.
Additionally allow immediate operands to be prefixed by
& instead of just by #. This syntax was used by
the SGS assembler.
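The value ranges quoted for the MOVEQ/NEG.W rewrite (-opt-nmoveq) in the option list above can be checked with a small simulation. The Python sketch below models only the register result of the two instructions, not the condition codes. The helper names are hypothetical.

```python
def moveq(imm8: int) -> int:
    """MOVEQ sign-extends an 8-bit immediate into a 32-bit register."""
    assert -128 <= imm8 <= 127
    return imm8 & 0xFFFFFFFF

def neg_w(reg: int) -> int:
    """NEG.W negates the low word and leaves the high word untouched."""
    return (reg & 0xFFFF0000) | (-(reg & 0xFFFF) & 0xFFFF)

# All constants reachable with a positive or a negative MOVEQ immediate.
positive = sorted(neg_w(moveq(y)) for y in range(1, 128))
negative = sorted(neg_w(moveq(y)) for y in range(-128, 0))
print(f"${positive[0]:x}..${positive[-1]:x}")
print(f"${negative[0]:x}..${negative[-1]:x}")
```

Both result sets are contiguous and match the two ranges given in the option description.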
This backend accepts M68k and CPU32 instructions as described in Motorola’s M68000 family Programmer’s Reference Manual. Additionally it supports ColdFire instructions as described in Motorola’s ColdFire Microprocessor Family Programmer’s Reference Manual.
The syntax for the scale factor in ColdFire MAC instructions is
<< for left- and >> for right-shift. The scale factor may be
appended as an optional operand, when needed.
Example: mac d0.l,d1.u,<<.
The mask flag in MAC instructions is written as & and is
appended directly to the effective address operand. Example:
mac d0,d1,(a0)&,d2.
The target address type is 32-bit. Floating point constants in instructions and data are supported and encoded in IEEE format.
Default alignment for instructions is 2 bytes. The default alignment for data is 2 bytes, when the data size is larger than 8 bits.
Depending on the selected cpu type the __VASM symbol will have
a value defined by the following bits:
bit 0: MC68000 instruction set. Also used by MC6830x, MC68322, MC68356.
bit 1: MC68010 instruction set.
bit 2: MC68020 instruction set.
bit 3: MC68030 instruction set.
bit 4: MC68040 instruction set.
bit 5: MC68060 instruction set.
bit 6: MC68881 or MC68882 FPU.
bit 7: MC68851 PMMU.
bit 8: CPU32. Any MC6833x or MC6834x CPU.
bit 9: ColdFire ISA_A.
bit 10: ColdFire ISA_A+.
bit 11: ColdFire ISA_B.
bit 12: ColdFire ISA_C.
bit 13: ColdFire hardware division support.
bit 14: ColdFire MAC instructions.
bit 15: ColdFire enhanced MAC instructions.
bit 16: ColdFire USP register.
bit 17: ColdFire FPU instructions.
bit 18: ColdFire MMU instructions.
bit 20: Apollo Core AC68080 instruction set.
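As an illustration of how such a bit mask can be interpreted, the following Python sketch decodes a __VASM value into feature names. The short names in the table and the sample value are chosen for this example only. Which bits vasm actually sets for a given CPU option is not stated here.

```python
# Bit numbers as listed above; the short names are this example's own labels.
VASM_BITS = {
    0: "68000", 1: "68010", 2: "68020", 3: "68030", 4: "68040", 5: "68060",
    6: "FPU (68881/68882)", 7: "PMMU (68851)", 8: "CPU32",
    9: "ColdFire ISA_A", 10: "ColdFire ISA_A+", 11: "ColdFire ISA_B",
    12: "ColdFire ISA_C", 13: "ColdFire hardware division",
    14: "ColdFire MAC", 15: "ColdFire EMAC", 16: "ColdFire USP",
    17: "ColdFire FPU", 18: "ColdFire MMU", 20: "AC68080",
}

def decode_vasm(value: int) -> list[str]:
    """Return the names of all features whose bit is set in value."""
    return [name for bit, name in VASM_BITS.items() if value & (1 << bit)]

sample = (1 << 2) | (1 << 6)  # hypothetical value: 68020 plus FPU
print(decode_vasm(sample))
```

In assembler source the same test would be written as a conditional on the corresponding bit of __VASM.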
The following symbols are defined for compatibility with other assemblers, so their function is not described here.
__G2, __LK
__CPU, __FPU, __MMU, __OPTC
_MOVEMBYTES, __MOVEMREGS
This backend extends the selected syntax module by the following directives:
.sdreg <An>: Equivalent to near <An>.
basereg <expression>,<An>: Starts a block of base-relative addressing through register An
(remember that A7 is not allowed as a base register).
The developer has to make sure that <expression> is placed
into An first, while the assembler automatically subtracts
<expression>, which is usually a program label with an optional offset,
from each displacement in a (d,An) addressing mode.
basereg has priority over the near directive. Its effect
can be suspended with the endb directive.
It is allowed to use several base registers in parallel.
cpu32: Generate code for the CPU32 family.
endb <An>: Ends a basereg block and suspends its effect onto the
specified base register An. It may be reused with a different
base expression thereafter (refer to basereg).
far: Disables small data (base-relative) mode. All data references will be absolute.
fpu <cpID>: Enables 68881/68882 FPU code generation. The <cpID> is inserted into the FPU instructions to select the correct coprocessor. Note that <cpID> is always 1 for the on-chip FPUs in the 68040 and 68060. A <cpID> of zero will disable FPU code generation.
initnear: Initializes the selected small data base register. In contrast to
PhxAss, where this directive comes from, just a reference to
_LinkerDB is generated, which has to be resolved by a linker.
machine <cpu_type>: Makes the assembler generate code for <cpu_type>, which can be
the following: 68000, 68010, 68020, 68030,
68040, 68060, 68080,
68851, 68881, 68882, cpu32.
And various ColdFire CPUs, starting with 5....
mc68000: Generate code for the MC68000 CPU.
mc68010: Generate code for the MC68010 CPU.
mc68020: Generate code for the MC68020 CPU.
mc68030: Generate code for the MC68030 CPU.
mc68040: Generate code for the MC68040 CPU.
mc68060: Generate code for the MC68060 CPU.
ac68080: Generate code for the Apollo Core AC68080 FPGA CPU.
mcf5...: Generate code for a ColdFire CPU. The recognized models are listed in the assembler-options section.
near [<An>]: Enables small data (base-relative) mode and sets the base register
to An. near without an argument will reactivate a
previously defined small data mode, which might have been switched off
by a far directive.
near code: All JMP and JSR instructions to external labels
will be converted into 16-bit PC-relative jumps. The small code
mode can be switched off by a far directive.
opt <option>[,<option>...]: Sets Devpac-compatible options. When option -phxass is
given, then it will parse PhxAss options instead (which is discouraged
for new code, so there is no detailed description here).
Most supported Devpac2-style options are always suffixed by a
+ or - to enable or disable the option:
a: Automatically optimize absolute to PC-relative references. Default is off in Devpac-compatibility mode, otherwise on.
c: Case-sensitivity for all symbols, instructions and macros. Default is on.
d: Include all symbols for debugging in the output file. May also generate line debugging information in some output formats. Default is off in Devpac-compatibility mode, otherwise on.
l: Generate a linkable object file. The default is defined by the selected output format via the assembler’s -F option. This option was supported by Devpac-Amiga only.
o: Enable all optimizations (o1 to o12), or disable all optimizations.
The default is that all are disabled in Devpac-compatibility mode
and enabled otherwise.
When running in native vasm mode this option will also enable
PC-relative (opt a) and
the following safe vasm-specific optimizations (see below):
og, of.
o1: Optimize branches without an explicit size extension.
o2: Standard displacement optimizations (e.g. (0,An) -> (An)).
o3: Optimize absolute addresses to short words.
o4: Optimize move.l to moveq.
o5: Optimize add #x and sub #x into their quick forms.
o6: No effect in vasm.
o7: Convert bra.b to nop, when branching to the next
instruction.
o8: Optimize 68020+ base displacements to 16 bit.
o9: Optimize 68020+ outer displacements to 16 bit.
o10: Optimize add/sub #x,An to lea.
o11: Optimize lea (d,An),An to addq/subq.
o12: Optimize <op>.l #x,An to <op>.w #x,An.
ow: Show all optimizations being performed. Default is on in Devpac-compatibility mode, otherwise off.
p: Check if code is position independent. This will cause an error on each relocation being required. Default is off.
s: Include symbols in listing file. Default is on.
t: Check size and type of all expressions. Default is on.
w: Show assembler warnings. Default is on.
x: For Amiga hunk format objects x+ strips local symbols from
the symbol table (symbols without xdef).
For Atari TOS executables this will enable the extended (HiSoft)
DRI symbol table format, which allows symbols with up to 22
characters. DRI standard only supports 8 characters.
Devpac options without +/- suffix:
l<n>: Sets the output format (Devpac Atari only). Currently without effect.
p=<type>[/<type>]: Sets the CPU type to any model vasm supports (original Devpac only allowed 68000-68040, 68332, 68881, 68882 and 68851).
Also the following Devpac3-style options are supported:
autopc: Corresponds to a+.
case: Corresponds to c+.
chkpc: Corresponds to p+.
debug: Corresponds to d+.
symtab: Corresponds to s+.
type: Corresponds to t+.
warn: Corresponds to w+.
xdebug: Corresponds to x+.
noautopc: Corresponds to a-.
nocase: Corresponds to c-.
nochkpc: Corresponds to p-.
nodebug: Corresponds to d-.
nosymtab: Corresponds to s-.
notype: Corresponds to t-.
nowarn: Corresponds to w-.
noxdebug: Corresponds to x-.
The following options are vasm specific and should not be used when
writing portable source. Using opt o+ or opt o- in
Devpac mode only toggles og and of.
ob: Convert absolute jumps to external labels into long-branches (refer to -opt-jbra).
oc: Enable optimizations to CLR (refer to -opt-clr).
od: Enable optimization of divisions into shifts (refer to -opt-div).
of: Enable immediate float constant optimizations (refer to -opt-fconst).
og: Enable generic vasm optimizations. This includes all safe optimizations which cannot be controlled by another option.
oj: Enable branch to jump translations (refer to -opt-brajmp).
ol: Enable shift optimizations to ADD (refer to -opt-lsl).
om: Enable MOVEM optimizations (refer to -opt-movem).
on: Enable small data optimizations. References to absolute symbols in a small data section (named "__MERGED") are optimized into a base-relative addressing mode (refer to -sd).
op: Enable optimizations to PEA (refer to -opt-pea).
oq: Optimizes MOVE.L into a combination of MOVEQ and
NEG.W (refer to -opt-nmoveq).
os: Optimize for speed before optimizing for size (refer to -opt-speed).
ot: Enable optimizations to ST (refer to -opt-st).
ox: Enable optimization of multiplications into shifts (refer to -opt-mul).
oz: Enable optimization for size, even if the code becomes slower (refer to -opt-size).
The default state is ’off’ for all these vasm specific options,
except for of and og, which are ’on’.
The following directives are only available for the Motorola syntax module:
<symbol> equr <Rn>: Define a new symbol named <symbol> and assign the data or
address register Rn, which can be used from now on in operands.
When 68080 code generation is enabled, also Bn base address
registers and En vector registers are allowed to be assigned.
Note that a register symbol must be defined before it can be
used!
<symbol> equrl <reglist>: Equivalent to <symbol> reg <reglist>.
<symbol> fequr <FPn>: Define a new symbol named <symbol> and assign the FPU register
FPn, which can be used from now on in operands.
Note that a register symbol must be defined before it can be
used!
<symbol> fequrl <reglist>: Equivalent to <symbol> freg <reglist>.
<symbol> freg <reglist>: Defines a new symbol named <symbol> and assigns the FPU register
list <reglist> to it. Registers in a list must be separated
by a slash (/) and ranges of registers can be defined
by using a hyphen (-). Examples for valid FPU register
lists are: fp0-fp7, fp1-3/fp5/fp7, fpiar/fpcr.
<symbol> reg <reglist>: Defines a new symbol named <symbol> and assigns the register
list <reglist> to it. Registers in a list must be separated
by a slash (/) and ranges of registers can be defined
by using a hyphen (-). Examples for valid register lists
are: d0-d7/a0-a6, d3-6/a0/a1/a4-5.
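The register list syntax used by reg can be illustrated with a short Python sketch that expands a list into individual register names. It handles only data and address registers and only ranges within one register class. Both restrictions, as well as the function name, are simplifying assumptions for this example and say nothing about what vasm itself accepts.

```python
import re

def expand_reglist(reglist: str) -> list[str]:
    """Expand a list such as 'd3-6/a0/a1/a4-5' into single register names."""
    regs = []
    for part in reglist.lower().split("/"):
        # One register (d3) or a range (d3-d6 or the short form d3-6).
        m = re.fullmatch(r"([ad])([0-7])(?:-\1?([0-7]))?", part)
        if m is None:
            raise ValueError(f"bad register list element: {part!r}")
        kind, first = m.group(1), int(m.group(2))
        last = int(m.group(3)) if m.group(3) is not None else first
        if last < first:
            raise ValueError(f"descending range: {part!r}")
        regs += [f"{kind}{n}" for n in range(first, last + 1)]
    return regs

print(expand_reglist("d0-d7/a0-a6"))
print(expand_reglist("d3-6/a0/a1/a4-5"))
```

A symbol defined with reg stands for exactly such a set of registers, for example as the operand of a MOVEM instruction.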
This backend performs the following operand optimizations:
(0,An) optimized to (An).
(d16,An) translated to (bd32,An,ZDn.w), when d16 is not
between -32768 and 32767 and the selected CPU allows it (68020 up or
CPU32).
(d16,PC) translated to (bd32,PC,ZDn.w), when d16 is not
between -32768 and 32767 and the selected CPU allows it (68020 up or
CPU32).
(d8,An,Rn) translated to (bd,An,Rn), when d8 is not
between -128 and 127 and the selected CPU allows it (68020 up or
CPU32).
(d8,PC,Rn) translated to (bd,PC,Rn), when d8 is not
between -128 and 127 and the selected CPU allows it (68020 up or
CPU32).
<exp>.l optimized to <exp>.w, when <exp> is absolute
and between -32768 and 32767.
<exp>.w translated to <exp>.l, when <exp> is a program
label or absolute and not between -32768 and 32767.
(0,An,...) optimized to (An,...) (which means the base
displacement will be suppressed). This allows further optimization
to (An), when the index is suppressed.
(bd16,An,...) translated to (bd32,An,...), when bd16 is
not between -32768 and 32767.
(bd32,An,...) optimized to (bd16,An,...), when bd32 is
between -32768 and 32767.
(bd32,An,ZRn) optimized to (d16,An), when bd32 is
between -32768 and 32767, and the index is suppressed (zero-Rn).
(An,ZRn) optimized to (An), when the index is suppressed.
(0,PC,...) optimized to (PC,...) (which means the base
displacement will be suppressed).
(bd16,PC,...) translated to (bd32,PC,...), when bd16 is
not between -32768 and 32767.
(bd32,PC,...) optimized to (bd16,PC,...), when bd32 is
between -32768 and 32767.
(bd32,PC,ZRn) optimized to (d16,PC), when bd32 is
between -32768 and 32767, and the index is suppressed (zero-Rn).
([0,An,...],...) optimized to ([An,...],...) (which means the base
displacement will be suppressed).
([bd16,An,...],...) translated to ([bd32,An,...],...), when bd16
is not between -32768 and 32767.
([bd32,An,...],...) optimized to ([bd16,An,...],...), when bd32
is between -32768 and 32767.
([...],0) optimized to ([...]) (which means the outer displacement
will be suppressed).
([...],od16) translated to ([...],od32), when od16 is
not between -32768 and 32767.
([...],od32) optimized to ([...],od16), when od32 is
between -32768 and 32767.
Note that an operand optimization will only take place when a displacement’s
size was not enforced by the developer through an explicit size
extension (e.g. (4.l,a0))!
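The first rules of the list above, for a plain address register displacement without index, amount to a simple size selection. The following Python sketch shows that decision under stated assumptions: the function name and the returned strings are illustrative, and raising an error when the CPU has no 32-bit base displacement is this example's choice, not a statement about vasm's diagnostics.

```python
def select_displacement_form(d: int, cpu_has_bd32: bool) -> str:
    """Pick an addressing form for displacement d on An without an index."""
    if d == 0:
        return "(An)"             # (0,An) optimized to (An)
    if -32768 <= d <= 32767:
        return "(d16,An)"         # fits into a signed 16-bit displacement
    if cpu_has_bd32:              # 68020 up or CPU32
        return "(bd32,An,ZDn.w)"  # index suppressed, 32-bit base displacement
    raise ValueError("displacement does not fit into 16 bits")

for value in (0, 100, 40000):
    print(value, select_displacement_form(value, cpu_has_bd32=True))
```

An explicit size extension on the displacement would bypass this selection entirely, as the note above points out.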
This backend performs the following instruction optimizations and translations:
<op>.L #x,An optimized to <op>.W #x,An, when x is
between -32768 and 32767.
ADD.? #x,<ea> optimized to ADDQ.? #x,<ea>, when x is
between 1 and 8.
ADD.? #x,<ea> optimized to SUBQ.? #-x,<ea>, when x is
between -1 and -8.
ADDA.? #0,An and SUBA.? #0,An will be deleted.
ADDA.? #x,An translated to LEA (x,An),An, when x is
between -32768 and 32767.
ANDI.L #$ff,Dn optimized to MVZ.B Dn,Dn,
for ColdFire ISA_B/C.
ANDI.L #$ffff,Dn optimized to MVZ.W Dn,Dn,
for ColdFire ISA_B/C.
ANDI.? #0,<ea> optimized to CLR.? <ea>, when allowed
by the option -opt-clr or a different CPU than the MC68000 was
selected.
ANDI.? #-1,<ea> optimized to TST.? <ea>.
ASL.? #1,Dn optimized to ADD.? Dn,Dn for 68000 and 68010.
ASL.? #2,Dn optimized into a sequence of two ADD.? Dn,Dn
for 68000 and 68010, when the operation size is either byte or word and
the options -opt-speed and -opt-lsl are given.
B<cc> <label> translated into a combination of
B<!cc> *+8 and JMP <label>, when <label> is not defined in the
same section (and option -opt-brajmp is given),
or outside the range of -32768 to 32767 bytes from the current address
when the selected CPU is not 68020 up, CPU32 or ColdFire ISA_B/C.
B<cc> <label> is automatically optimized to 8-bit, 16-bit or
32-bit (68020 up, CPU32, MCF5407 only), whatever fits best. When the
selected CPU doesn’t support 32-bit branches it will try to change the
conditional branch into a B<!cc> *+8 and JMP <label> sequence.
BRA <label> translated to JMP <label>, when <label> is
not defined in the same section (and option -opt-brajmp is given),
or outside the range of -32768 to 32767 bytes from the current address
when the selected CPU is not 68020 up, CPU32 or ColdFire ISA_B/C.
BSR <label> translated to JSR <label>, when <label> is
not defined in the same section (and option -opt-brajmp is given),
or outside the range of -32768 to 32767 bytes from the current address
when the selected CPU is not 68020 up, CPU32 or ColdFire ISA_B/C.
<cp>B<cc> <label> is automatically optimized to 16-bit or 32-bit,
whatever fits best. <cp> means coprocessor and is P for the PMMU
and F for the FPU.
CLR.L Dn optimized to MOVEQ #0,Dn.
CMP.? #0,<ea> optimized to TST.? <ea>. The selected CPU type
must be MC68020 up, ColdFire or CPU32 to support address register direct
as effective address (<ea>).
DIVS.W/DIVU.W #1,Dn optimized to MVZ.W Dn,Dn, for
ColdFire ISA_B/C (-opt-div).
DIVS.W #-1,Dn optimized to the sequence of NEG.W Dn and
MVZ.W Dn,Dn (-opt-div and -opt-speed).
DIVS.L/DIVU.L #1,Dn optimized to TST.L Dn
(-opt-div).
DIVS.L #-1,Dn optimized to NEG.L Dn
(-opt-div).
DIVU.L #2..256,Dn optimized to LSR.L #x,Dn
(-opt-div).
EORI.? #-1,<ea> optimized to NOT.? <ea>.
EORI.? #0,<ea> optimized to TST.? <ea>.
FMOVEM.? <reglist> is deleted when the register list was empty.
FxDIV.? #m,FPn optimized to FxMUL.? #1/m,FPn when m is
a power of 2 and option -opt-fconst is given.
JMP <label> optimized to BRA.? <label>, when <label> is defined
in the same section and in the range of -32768 to 32767 bytes from the
current address.
Note that JMP (<lab>,PC) is never optimized, with the intention
to preserve jump-tables.
JSR <label> optimized to BSR.? <label>, when <label> is defined
in the same section and in the range of -32768 to 32767 bytes from the
current address.
Note that JSR (<lab>,PC) is never optimized, with the intention
to preserve jump-tables.
LEA 0,An optimized to SUBA.L An,An.
LEA (0,An),An and LEA (An),An will be deleted.
LEA (d,An),An is optimized to ADDQ.L #d,An when d
is between 1 and 8 and to SUBQ.L #-d,An when d is between
-1 and -8.
LEA (d,Am),An will be translated into a combination of
MOVEA and ADDA.L for 68000 and 68010, when d is lower
than -32768 or higher than 32767. The MOVEA will be omitted when
Am and An are identical. Otherwise -opt-speed is
required.
LINK.L An,#x optimized to LINK.W An,#x, when x is
between -32768 and 32767.
LINK.W An,#x translated to LINK.L An,#x, when x is
not between -32768 and 32767 and selected CPU supports this instruction.
LSL.? #1,Dn optimized to ADD.? Dn,Dn for 68000 and 68010,
when option -opt-lsl is given.
LSL.? #2,Dn optimized into a sequence of two ADD.? Dn,Dn
for 68000 and 68010, when the operation size is either byte or word and
the options -opt-speed and -opt-lsl are given.
MOVE.? #0,<ea> optimized to CLR.? <ea>, when allowed by
the option -opt-clr or a different CPU than the MC68000 was
selected.
MOVE.? #x,-(SP) optimized to PEA x, when allowed by the
option -opt-pea. The move-size must not be byte (.b).
MOVE.B #-1,<ea> optimized to ST <ea>, when allowed by the
option -opt-st.
MOVE.L #x,Dn optimized to MOVEQ #x,Dn, when x is
between -128 and 127.
MOVE.L #x,Dn optimized to the sequence of MOVEQ #x>>1,Dn
and ADD.W Dn,Dn, when 128<=x<=254 and x is even.
MOVE.L #x,Dn optimized to the sequence of MOVEQ #x^$ff,Dn
and NOT.B Dn, when 128<=x<=255 and the option
-opt-nmoveq was set.
MOVE.L #x,Dn optimized to the sequence of MOVEQ #x>>16,Dn
and SWAP Dn, when $10000<=x<=$7f0000 or
$ff80ffff<=x<=$fffeffff.
MOVE.L #x,Dn optimized to the sequence of MOVEQ #y,Dn
and NEG.W Dn, when $ff81<=x<=$ffff or
$ffff0001<=x<=$ffff0080 and the option -opt-nmoveq was set.
MOVE.L #x,Dn optimized to the sequence of MOVEQ #x>>n,Dn
and LSL.W #n,Dn, when $0100<=x<=$7f00 and the LSB is zero
and -opt-size was set together with standard optimizations.
MOVE.L #x,<ea> optimized to MOV3Q #x,<ea>, for ColdFire
ISA_B and ISA_C, when x is -1 or between 1 and 7.
MOVEA.? #0,An optimized to SUBA.L An,An.
MOVEA.L #x,An optimized to MOVEA.W #x,An, when x is
between -32768 and 32767.
MOVEA.L #label,An optimized to LEA label,An, which could
allow further optimization to LEA label(PC),An.
MOVEM.? <reglist> is deleted, when the register list was empty.
MOVEM.? <ea>,An optimized to MOVE.? <ea>,An, when the
register list only contains a single address register.
MOVEM.? <ea>,Rn optimized to MOVE.? <ea>,Rn and
MOVEM.? Rn,<ea> optimized to MOVE.? Rn,<ea>, when allowed
by the option -opt-movem or when just loading an address register.
MOVEM.? <ea>,Rm/Rn and MOVEM.? Rm/Rn,<ea> are optimized
into a sequence of two MOVE instructions when advantageous for
the currently selected CPU.
For example, for 68000 and 68010 it is no advantage to optimize
MOVEM Rm/Rn,-(An), and addressing modes with displacements
or absolute addresses are optimized for 68040 only (may additionally
require -opt-speed).
MULS.?/MULU.? #0,Dn optimized to MOVEQ #0,Dn
(-opt-mul).
MULS.?/MULU.? #1,Dn is deleted (-opt-mul).
MULS.W #-1,Dn optimized to the sequence EXT.L Dn and
NEG.L Dn (-opt-mul and -opt-speed).
MULS.L #-1,Dn optimized to NEG.L Dn (-opt-mul).
MULS.W #2..256,Dn optimized to the sequence EXT.L Dn and
ASL.L #x,Dn (-opt-mul and -opt-speed).
MULS.W #-2..-256,Dn optimized to the sequence EXT.L Dn,
ASL.L #x,Dn and NEG.L Dn (-opt-mul and -opt-speed).
MULS.L #2..256,Dn optimized to ASL.L #x,Dn
(-opt-mul).
MULS.L #-2..-256,Dn optimized to the sequence ASL.L #x,Dn
and NEG.L Dn (-opt-mul and -opt-speed).
MULU.W #2..256,Dn optimized to the sequence MVZ.W Dn,Dn and
ASL.L #x,Dn for ColdFire ISA_B/C (-opt-mul and -opt-speed).
MULU.L #2..256,Dn optimized to LSL.L #x,Dn
(-opt-mul).
MVZ.? #x,Dn and MVS.? #x,Dn are optimized to
MOVEQ #x,Dn.
ORI.? #0,<ea> optimized to TST.? <ea>.
SUB.? #x,<ea> optimized to SUBQ.? #x,<ea>, when x is
between 1 and 8.
SUB.? #x,<ea> optimized to ADDQ.? #-x,<ea>, when x is
between -1 and -8.
SUBA.? #x,An translated to LEA (-x,An),An, when x is
between -32767 and 32768.
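To illustrate the MOVE.L #x,Dn rules listed above, the following Python sketch models which MOVEQ-based sequence would apply to a given 32-bit constant, and replays the sequence in a tiny simulator to confirm that it reproduces the constant. The names moveq_plan and simulate are hypothetical. This is a model of the documented rules, not vasm's implementation, and the -opt-size LSL.W case is left out.

```python
MASK32 = 0xFFFFFFFF

def s8(value):
    """Sign-extend the low 8 bits of value (the MOVEQ immediate)."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

def moveq_plan(x, opt_nmoveq=False):
    """Return (moveq_immediate, follow_up) for MOVE.L #x,Dn, or None."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    lo, hi = x & 0xFFFF, x >> 16
    if -128 <= signed <= 127:
        return signed, None                      # plain MOVEQ
    if 128 <= x <= 254 and x % 2 == 0:
        return x >> 1, "ADD.W"
    if opt_nmoveq and 128 <= x <= 255:
        return x ^ 0xFF, "NOT.B"
    if (lo == 0 and 0x01 <= hi <= 0x7F) or (lo == 0xFFFF and 0xFF80 <= hi <= 0xFFFE):
        return s8(hi), "SWAP"
    if opt_nmoveq and (0xFF81 <= x <= 0xFFFF or 0xFFFF0001 <= x <= 0xFFFF0080):
        return s8(-lo), "NEG.W"
    return None                                  # keep MOVE.L

def simulate(plan):
    """Execute MOVEQ plus the optional follow-up and return the 32-bit register value."""
    imm, op = plan
    d = imm & MASK32                             # MOVEQ sign-extends to 32 bits
    if op == "ADD.W":
        d = (d & 0xFFFF0000) | ((d + d) & 0xFFFF)
    elif op == "NOT.B":
        d = (d & 0xFFFFFF00) | (~d & 0xFF)
    elif op == "SWAP":
        d = ((d << 16) | (d >> 16)) & MASK32
    elif op == "NEG.W":
        d = (d & 0xFFFF0000) | (-d & 0xFFFF)
    return d

print(moveq_plan(0x7F0000))   # (127, 'SWAP')
```

The simulator is only there as a cross-check: for every constant that moveq_plan accepts, simulate(moveq_plan(x)) should give back x.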
Some known problems of this module at the moment:
With FMOVE in immediate addressing mode, but without
specifying a size extension, constants between $80000000 and
$ffffffff are stored with 32 bits, which leads to sign-extension
problems when the instruction is really 64 or 96 bits.
This module has the following error messages:
This chapter documents the backend for the PowerPC microprocessor family.
This module is written in 2002-2016 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Select big-endian mode.
Select little-endian mode.
Allow both, 32- and 64-bit instructions.
Generate code for the Altivec unit.
Allow only common PPC instructions.
Generate code for the PPC 601.
Generate code for the 32-bit PowerPC 6xx family.
Generate code for the 64-bit PowerPC 600 family.
Generate code for the 32-bit PowerPC 74xx (G4) family.
Generate code for the 32-bit PowerPC 7450.
Generate code for the IBM/AMCC 32-bit embedded 40x family.
Generate code for the AMCC 32-bit embedded 440/460 family.
Generate code for the 32-bit MPC8xx PowerQUICC I family.
Generate code for the 32-bit Book-E architecture.
Generate code for the 32-bit e300 core (MPC51xx, MPC52xx, MPC83xx).
Generate code for the 32-bit e500 core (MPC85xx), including SPE, EFS and PMR.
Generate code for the POWER family.
Generate code for the POWER2 family.
Don’t predefine any register-name symbols.
Enables translation of 16-bit branches into "B<!cc> $+8 ; B label" sequences when the destination is out of range.
Sets the 2nd small data base register to Rn.
Sets small data base register to Rn.
The default setting is to generate code for a 32-bit PPC G2, G3, G4 CPU with Altivec support.
This backend accepts PowerPC instructions as described in the instruction set manuals from IBM, Motorola, Freescale and AMCC.
The full instruction set of the following families is supported: POWER, POWER2, 40x, 44x, 46x, 60x, 620, 750, 74xx, 860, Book-E, e300 and e500.
The target address type is 32 or 64 bits, depending on the selected CPU model. Floating point constants in instructions and data are supported and encoded in IEEE format.
Default alignment for sections and instructions is 4 bytes. Data is aligned to its natural alignment by default.
This backend provides the following specific extensions:
Unless disabled by -no-regnames, the registers r0 - r31,
f0 - f31, v0 - v31, cr0 - cr7, vrsave, sp, rtoc, fp, fpscr, xer, lr, ctr,
and the symbols lt, gt, so and un will be predefined on startup and may
be referenced by the program.
This backend extends the selected syntax module by the following directives:
.sdreg <n>
    Sets the small data base register to Rn.
.sd2reg <n>
    Sets the 2nd small data base register to Rn.
This backend performs the following optimizations:
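16-bit branches whose destination is out of range are translated into B<!cc> $+8 and a 26-bit unconditional branch, when branch translation is enabled.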
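The range check behind this translation can be sketched in Python as follows. The function names are hypothetical and the sketch only models the displacement ranges of the PowerPC branch formats (a 16-bit and a 26-bit signed byte displacement, relative to the branch instruction itself). It is not taken from the vasm sources.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def choose_branch(pc, target):
    """Return the form needed for a conditional branch located at `pc`."""
    disp = target - pc                  # displacement is relative to the branch itself
    if disp % 4:
        raise ValueError("branch target must be 4-byte aligned")
    if fits_signed(disp, 16):
        return "bcc"                    # plain 16-bit conditional branch
    if fits_signed(disp - 4, 26):       # the unconditional branch sits at pc+4
        return "b!cc $+8 ; b"           # inverted branch over a 26-bit branch
    return "out of range"
```

The inverted branch skips exactly one instruction ($+8), so the unconditional branch is assembled four bytes after the original location, which is why its displacement is computed from pc+4.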
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the c16x/st10 microcontroller family.
Note that this module is not yet complete!
This module is written in 2002-2004 by Volker Barthelmann and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Do not translate between jump instructions.
If the offset of a jmpr
instruction is too large, it will not be translated to
jmps but an error will be emitted.
Also, jmpa will not be optimized to jmpr.
The pseudo-instruction jmp will still be translated.
A jmp or jmpr instruction that is translated due to
its offset being larger than 8 bits will be translated to a
jmpa rather than a jmps, if possible.
This backend accepts c16x/st10 instructions as described in the Infineon instruction set manuals.
The target address type is 32 bits.
Default alignment for sections and instructions is 2 bytes.
This backend provides the following specific extensions:
The pseudo-instruction jmp will be translated
either to a jmpr or a jmpa instruction, depending on
the offset.
The sfr pseudo-opcode can be used to declare special function
registers. It has two, three or four arguments. The first argument
is the identifier to be declared as special function register.
The second argument is either the 16-bit sfr address or its 8-bit base
address (0xfe for normal sfrs and
0xf0 for extended special function registers). In the latter case,
the third argument is
the 8-bit sfr number. If another argument is given, it specifies the
bit number in the sfr (i.e. the declaration declares a single bit).
Example:
.sfr zeros,0xfe,0x8e
SEG and SOF can be used to obtain the segment or
segment offset of a full address.
Example:
mov r3,#SEG farfunc
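As a rough illustration of the values involved in the sfr declaration and the SEG/SOF operators, here is a small Python sketch. It assumes the usual C16x conventions (word-sized special function registers, so the 8-bit number is doubled, and 64 KB segments). The function names are hypothetical and the sketch says nothing about how vasm computes these values internally.

```python
def sfr_address(base, number):
    """16-bit address of an sfr given its 8-bit base (0xfe or 0xf0) and 8-bit number."""
    return (base << 8) + 2 * number      # sfrs are 16-bit registers, hence the factor 2

def seg(address):
    """Segment number of a full address (assumed: bits 16 to 23)."""
    return (address >> 16) & 0xFF

def sof(address):
    """Segment offset of a full address (assumed: the low 16 bits)."""
    return address & 0xFFFF

print(hex(sfr_address(0xFE, 0x8E)))      # 0xff1c
```

Under these assumptions the declaration .sfr zeros,0xfe,0x8e from the example above would refer to address 0xff1c.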
This backend performs the following optimizations:
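jmp is translated to jmpr, if possible. Also, if
-no-translations was not specified, jmpr and
jmpa are translated into each other where appropriate.
Relative jumps whose offset does not fit into 8 bits are translated
to a jmps instruction or an inverted
jump around a jmps instruction.
For instructions that have both a gpr,#IMM3/4 and a
reg,#IMM16 form, the smaller form is used, if possible.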
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the MOS/Rockwell 6502 microprocessor family. It also supports the Rockwell/WDC 65C02, the Hudson Soft HuC6280 and the WDC 65802/65816 instruction sets.
This module is written in 2002,2006,2008-2012,2014-2023 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Recognize all HuC6280 instructions.
Same as -816. There is no difference in the instruction set.
Enables the 8/16 bit instruction set for the WDC65816/65802 and additional directives to switch loading of the accumulator and/or the index register between 8 and 16 bits. The target address size is 24 bits.
Swap meaning of < and > unary operators for compatibility
with the BBC ADE assembler.
Recognize all 65C02 instructions. This excludes DTV (-dtv) and illegal (-illegal) instructions.
Enables the Commodore CSG65CE02 instruction set, which extends the WDC02 instruction set.
Recognize the three additional C64-DTV instructions.
Allow ’illegal’ 6502 instructions to be recognized.
Enables the 45GS02 instruction set for the MEGA65 computer.
Enables translation of B<cc> branches into sequences of
B<!cc> *+5 ; JMP label when necessary. BRA (DTV, 65C02)
is directly translated into a JMP when out of range.
It also performs optimization of JMP to BRA,
whenever possible.
Recognize all 65C02 instructions and the WDC65C02 extensions
(RMB,SMB,BBR,BBS,STP,WAI).
This backend accepts 6502 family instructions as described in the instruction set reference manuals from MOS and Rockwell, which are valid for the following CPUs: 6502 - 6518, 6570, 6571, 6702, 7501, 8500, 8502.
Optionally accepts 65C02 family instructions as described in the instruction set reference manuals from Rockwell and WDC. Also supports the WDC extensions in the W65C02 and W65C134.
Optionally accepts 65CE02 family instructions as described in the instruction set reference manuals from Commodore Semiconductor Group.
Optionally accepts HuC6280 instructions as described in the instruction set reference manuals from Hudson Soft.
Optionally accepts 45GS02 instructions as defined by the Mega65 project.
Optionally accepts WDC65816 instructions as described in the Programming Manual by The Western Design Center.
The target address type is 16 bits, or 24 bits in WDC65816 mode.
Instructions consist of one to three bytes for the standard 6502 family (up to 7 bytes for the 6280) and require no alignment. There is also no alignment requirement for sections and data.
All known mnemonics for illegal instructions are optionally recognized (e.g.
dcm and dcp refer to the same instruction). Some illegal
instructions (e.g. $ab) are known to show unpredictable behaviour,
or do not always work the same on different CPUs.
Note that the WDC65816 MVN and MVP block move instructions
require a full 24-bit address (or a label) for the source and destination,
as documented in WDC’s Programming Manual. This assembler additionally
allows specifying the bank byte directly, which is triggered by a constant
value between 0 and 255.
This backend provides the following specific extensions:
The prefix character < is used to select the low-byte
and > for the high-byte. It has to be the first character before
an expression. See also option -bbcade.
In WDC65816 mode the character ^ can be used to select the
bank-byte (bits 16 to 23) of a full 24-bit address.
When applying the operation >>8, /256, %256
or &256 on a label, an appropriate lo/hi-byte relocation will
automatically be generated.
A memory addressing mode can be forced to absolute 16-bit addressing (>) or
zero/direct-page 8-bit addressing (lo/<).
In WDC65816 mode the > character selects full 24-bit addressing
instead, and ! or | may be used to enforce 16-bit addressing.
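The byte selectors described above correspond to simple shift-and-mask operations. A minimal Python sketch, with hypothetical function names, of what <, > and ^ select from an address:

```python
def lo_byte(address):
    """Value selected by the < prefix: bits 0 to 7."""
    return address & 0xFF

def hi_byte(address):
    """Value selected by the > prefix: bits 8 to 15."""
    return (address >> 8) & 0xFF

def bank_byte(address):
    """Value selected by the ^ prefix in WDC65816 mode: bits 16 to 23."""
    return (address >> 16) & 0xFF

addr = 0x12C0DE
print(hex(lo_byte(addr)), hex(hi_byte(addr)), hex(bank_byte(addr)))   # 0xde 0xc0 0x12
```

When the address is a label that is not yet resolved, the assembler cannot compute these values itself, which is why a lo/hi-byte relocation is generated instead.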
This backend extends the selected syntax module by the following directives:
<symbol> ezp <expr>
    Works exactly like the equ directive, but marks <symbol>
    as a zero page symbol and uses zero page addressing whenever
    <symbol> is used in a memory addressing mode.
a8
    Declares that immediate instructions loading the accumulator read 8 bits (default, WDC65816 only).
a16
    Declares that immediate instructions loading the accumulator read 16 bits (WDC65816 only).
setdp <expr>
    Set the current base address of the zero/direct page for
    optimizations from absolute to zero-page addressing modes.
    Example: set it to $2000 for the HuC6280/PC-Engine.
x8
    Declares that immediate instructions loading the index registers read 8 bits (default, WDC65816 only).
x16
    Declares that immediate instructions loading the index registers read 16 bits (WDC65816 only).
zero
    Switch to a zero page section called zero or .zero,
    which has the type bss with attributes "aurw".
    Accesses to symbols from this section will default to zero page
    addressing mode.
zpage <symbol1> [,<symbol2>...]
    Mark symbols as zero page and use zero page addressing for
    expressions based on this symbol, unless overridden by a
    hi-modifier (>).
All these directives are also available in the form starting with a
dot (.).
This backend performs the following operand optimizations and translations:
Branches whose destination is out of range are translated into
B<!cc> *+5 and an absolute JMP instruction
(-opt-branch).
JMP is optimized to BRA where possible,
when -opt-branch was given.
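The branch translation can be pictured with the following Python sketch. It assumes the standard 6502 relative branch (two bytes, with a signed 8-bit displacement counted from the address after the branch) and uses hypothetical names. It only models the decision, not vasm's code generation.

```python
def translate_branch(pc, target, opt_branch=True):
    """Return the instruction sequence for a conditional branch at `pc`."""
    disp = target - (pc + 2)             # relative to the address after the 2-byte branch
    if -128 <= disp <= 127:
        return [("Bcc", disp & 0xFF)]    # fits into a normal relative branch
    if not opt_branch:
        raise ValueError("branch destination out of range")
    # B<!cc> *+5 skips the 3-byte JMP: its own displacement is 5 - 2 = 3
    return [("B!cc", 3), ("JMP", target & 0xFFFF)]

print(translate_branch(0x1000, 0x1082))  # [('B!cc', 3), ('JMP', 4226)]
```

The constant *+5 follows from the instruction sizes: two bytes for the inverted branch plus three bytes for the absolute JMP.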
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the Advanced RISC Machine (ARM) microprocessor family.
This module is written in 2004,2006,2010-2015 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Generate code compatible with ARM V2 architecture.
Generate code compatible with ARM V3 architecture.
Generate code compatible with ARM V3m architecture.
Generate code compatible with ARM V4 architecture.
Generate code compatible with ARM V4t architecture.
Output big-endian code and data.
Output little-endian code and data (default).
Generate code for the ARM2 CPU.
Generate code for the ARM250 CPU.
Generate code for the ARM3 CPU.
Generate code for the ARM6 CPU.
Generate code for the ARM600 CPU.
Generate code for the ARM610 CPU.
Generate code for the ARM7 CPU.
Generate code for the ARM710 CPU.
Generate code for the ARM7500 CPU.
Generate code for the ARM7d CPU.
Generate code for the ARM7di CPU.
Generate code for the ARM7dm CPU.
Generate code for the ARM7dmi CPU.
Generate code for the ARM7tdmi CPU.
Generate code for the ARM8 CPU.
Generate code for the ARM810 CPU.
Generate code for the ARM9 CPU.
Generate code for the ARM9 CPU.
Generate code for the ARM920 CPU.
Generate code for the ARM920t CPU.
Generate code for the ARM9tdmi CPU.
Generate code for the SA1 CPU.
Generate code for the STRONGARM CPU.
Generate code for the STRONGARM110 CPU.
Generate code for the STRONGARM1100 CPU.
The ADR directive will be automatically converted into
ADRL if required (which inserts an additional
ADD/SUB to calculate an address).
The maximum range in which PC-relative symbols can be accessed
through LDR and STR is extended from +/-4KB to +/-1MB
(or +/-256 Bytes to +/-65536 Bytes when accessing half-words).
This is done by automatically inserting an additional ADD
or SUB instruction before the LDR/STR.
Start assembling in Thumb mode.
This backend accepts ARM instructions as described in various ARM CPU data sheets. Additionally some architectures support a second, more dense, instruction set, called THUMB. There are special directives to switch between these two instruction sets.
The target address type is 32 bits.
Default alignment for instructions is 4 bytes for ARM and 2 bytes for THUMB. Sections will be aligned to 4 bytes by default. Data is aligned to its natural alignment by default.
This backend extends the selected syntax module by the following directives:
.arm
    Generate 32-bit ARM code.
.thumb
    Generate 16-bit THUMB code.
This backend performs the following optimizations and translations for the ARM instruction set:
LDR/STR Rd,symbol, with a distance between symbol and PC larger
than 4KB, is translated to
ADD/SUB Rd,PC,#offset&0xff000 +
LDR/STR Rd,[Rd,#offset&0xfff], when allowed by the option
-opt-ldrpc.
ADR Rd,symbol is translated to
ADD/SUB Rd,PC,#rotated_offset8.
ADRL Rd,symbol is translated to
ADD/SUB Rd,PC,#hi_rotated8 + ADD/SUB Rd,Rd,#lo_rotated8.
ADR will be automatically treated as ADRL when required
and when allowed by the option -opt-adr.
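The arithmetic behind these translations can be sketched in Python. The sketch assumes the classic ARM data-processing immediate (an 8-bit value rotated right by an even amount) and takes the distance as an unsigned magnitude, leaving the choice between ADD and SUB and the PC read-ahead aside. Both function names are hypothetical.

```python
def arm_immediate(value):
    """Return (imm8, rotate_right) if value fits an ARM immediate operand, else None."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        imm = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if imm < 0x100:
            return imm, rot
    return None

def split_ldr_offset(offset):
    """Split a distance of up to 1MB into the ADD/SUB part and the LDR/STR part."""
    if not 0 <= offset <= 0xFFFFF:
        raise ValueError("distance exceeds the extended range")
    return offset & 0xFF000, offset & 0xFFF

high, low = split_ldr_offset(0x12345)
print(hex(high), hex(low), arm_immediate(high))   # 0x12000 0x345 (18, 20)
```

The high part always occupies eight bits starting at an even bit position, so it is always encodable as a single rotated immediate. An ADR distance that arm_immediate rejects is the case in which ADRL's second ADD/SUB is needed.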
For the THUMB instruction set the following optimizations and translations are done:
Conditional branches whose destination is out of range are translated into
B<!cc> .+4 + B label.
The BL instruction is translated into two sub-instructions combining
the high and low halves of the 22-bit branch displacement.
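How a BL displacement is spread over the two sub-instructions can be sketched as follows. The Python sketch assumes the classic Thumb (ARMv4T) encoding, in which each 16-bit half carries 11 bits of the halfword displacement and the PC reads as the instruction address plus 4. The function name is hypothetical.

```python
def thumb_bl(pc, target):
    """Encode a Thumb BL at address `pc` as two 16-bit halfwords (first, second)."""
    offset = target - (pc + 4)                   # PC reads as instruction address + 4
    if offset % 2 or not -(1 << 22) <= offset < (1 << 22):
        raise ValueError("BL target misaligned or out of range")
    first = 0xF000 | ((offset >> 12) & 0x7FF)    # upper 11 bits of the displacement
    second = 0xF800 | ((offset >> 1) & 0x7FF)    # lower 11 bits of the displacement
    return first, second

print([hex(h) for h in thumb_bl(0x1000, 0x1000)])   # ['0xf7ff', '0xfffe']
```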
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the 80x86 microprocessor family.
This module is written in 2005-2006,2011,2015-2016 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Enables debugging output.
Generate code for the 8086 CPU.
Generate code for the 80186 CPU.
Generate code for the 80286 CPU.
Generate code for the 80386 CPU.
Generate code for the 80486 CPU.
Generate code for the Pentium.
Generate code for the PentiumPro.
Generate code for the Pentium.
Generate code for the PentiumPro.
Generate code for the AMD K6.
Generate code for the AMD Athlon.
Generate code for the Sledgehammer CPU.
Generate code for 64-bit architectures (x86_64).
This backend accepts 80x86 instructions as described in the Intel Architecture Software Developer’s Manual.
The target address type is 32 bits. It is 64 bits when the x86_64 architecture was selected (-m64). Floating point constants in instructions and data are supported and encoded in IEEE format.
Instructions do not need any alignment. Data is aligned to its natural alignment by default.
The backend uses AT&T syntax! This means the left operand is always the source and the right operand is the destination. Register names have to be prefixed by a ’%’.
The operation size is indicated by a ’b’, ’w’, ’l’, etc. suffix directly appended to the mnemonic. The assembler can also determine the operation size from the size of the registers being used.
Predefined register symbols in this backend:
al cl dl bl ah ch dh bh axl cxl dxl spl bpl sil dil r8b r9b r10b r11b r12b r13b r14b r15b
ax cx dx bx sp bp si di r8w r9w r10w r11w r12w r13w r14w r15w
eax ecx edx ebx esp ebp esi edi r8d r9d r10d r11d r12d r13d r14d r15d
rax rcx rdx rbx rsp rbp rsi rdi r8 r9 r10 r11 r12 r13 r14 r15
es cs ss ds fs gs
cr0 cr1 cr2 cr3 cr4 cr5 cr6 cr7 cr8 cr9 cr10 cr11 cr12 cr13 cr14 cr15
dr0 dr1 dr2 dr3 dr4 dr5 dr6 dr7 dr8 dr9 dr10 dr11 dr12 dr13 dr14 dr15
tr0 tr1 tr2 tr3 tr4 tr5 tr6 tr7
mm0 mm1 mm2 mm3 mm4 mm5 mm6 mm7 xmm0 xmm1 xmm2 xmm3 xmm4 xmm5 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
st st(0) st(1) st(2) st(3) st(4) st(5) st(6) st(7)
This backend extends the selected syntax module by the following directives:
.code16
    Sets the assembler to 16-bit addressing mode.
.code32
    Sets the assembler to 32-bit addressing mode, which is the default.
.code64
    Sets the assembler to 64-bit addressing mode.
This backend performs the following optimizations:
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the 8080/z80/gbz80/64180/RCMx000 microprocessor family.
This module is copyright in 2009 by Dominic Morris.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
This module provides the following additional options:
Turns on 8080/8085 compatibility mode. Any use of z80 (or higher) opcodes will result in an error being generated.
Turns on gbz80 compatibility mode. Any use of non-supported opcodes will result in an error being generated.
Turns on 64180 mode supporting additional 64180 opcodes.
Turns on the older Intel 8080 syntax mode. When this mode is
activated, mnemonics and operand types from the Intel 8080 syntax
instead of the Zilog Z80 syntax (such as STA 1234h
instead of ld (1234h),a) will be valid. This option can
be used in parallel with -8080 to use both sets of mnemonics,
although this is discouraged, as two instructions (jp and
cp) mean different things in each syntax. In this case,
these instructions will be assembled as the Intel syntax, and a
warning will be emitted.
Turns on Rabbit compatibility mode, generating the correct codes for moved opcodes and supporting the additional Rabbit instructions. In this mode, 8 bit access to the 16 bit index registers is not permitted.
Turns on emulation of some instructions which aren’t available on the Rabbit processors.
Swaps the usage of ix and iy registers. This is useful for compiling generic code that uses an index register that is reserved on the target machine.
Switches on z80asm mode. This translates ASMPC to $ and accepts some pseudo opcodes that z80asm supports. Most emulation of z80asm directives is provided by the oldstyle syntax module.
This backend accepts z80 family instructions in standard
Zilog syntax. Rabbit opcodes are accepted as defined in the
publicly available reference material from Rabbit Semiconductor,
with the exception that the ljp and lcall opcodes need
to be supplied with a 24 bit number rather than an 8 bit xpc and a
16 bit address.
The target address type is 16 bit.
Instructions consist of one up to six bytes and require no alignment. There is also no alignment requirement for sections and data.
This backend provides the following specific extensions:
Certain Rabbit instructions can be prefixed by altd and/or the
ioi/ioe modifier.
For details of which instructions these are valid for please
see the documentation from Rabbit.
The prefix character < is used to select the low-byte
and > for the high-byte. It has to be the first character before
an expression.
When applying the operation /256, %256 or &256
on a label, an appropriate lo/hi-byte relocation will automatically be
generated.
This backend supports the emulation of certain z80 instructions on the
Rabbit/gbz80 processor. These instructions are rld, rrd,
cpi, cpir, cpd and cpdr.
The link stage should provide routines with the opcode name prefixed with
rcmx_ (e.g. rcmx_rld) which implement the same functionality.
Example implementations are available within the z88dk CVS tree.
Additionally, for the Rabbit targets the missing call cc, opcodes
will be emulated.
Some known problems of this module at the moment:
The llcall and lljp opcodes
are not available.
This module has the following error messages:
This chapter documents the backend for the Motorola 6800 microprocessor family.
This module is written in 2013-2016,2021 by Esben Norby and Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Generate code for the 6800 CPU (default setting).
Generate code for the 6801 CPU.
Generate code for the 68HC11 CPU.
This backend accepts 6800 family instructions for the following CPUs:
The 6804, 6805 and 68HC08 are not supported. They use a similar instruction set, but are not opcode compatible.
The target address type is 16 bit.
Instructions consist of one up to five bytes and require no alignment. There is also no alignment requirement for sections and data.
This backend provides the following specific extensions:
The < character can be used to force direct mode and the
> character forces extended mode. Otherwise the assembler selects
the best mode automatically, which defaults to extended mode for external
symbols.
When applying the operation /256, %256 or &256
on a label, an appropriate lo/hi-byte relocation will automatically be
generated.
None.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the backend for the Motorola 6809, 68HC12 and Hitachi 6309.
This module is written in 2020-2021 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Generate code for the 6809 CPU (default setting). Also works on the 6309, which is backwards compatible.
Generate code for the 6309 CPU.
Generate code for the 68HC12 CPU.
Translate short-branches to long and optimize long-branches
to short when required/possible.
Also tries to optimize jmp and jsr instructions
into short-branches.
Delete zero offsets in indexed addressing modes, when possible.
Convert all extended addressing modes with local or external
labels to indexed, PC-relative addressing. Also translates
absolute jmp/jsr instructions into PC-relative
lbra/lbsr (or better).
This backend accepts 6809/6309 instructions as described in the Motorola 6809 and Hitachi 6309 Programmer’s Reference (Copyright 2009 Darren Atkinson). Optionally supports the 68HC12 instruction set as documented in Motorola’s CPU12 Reference Manual.
The target address type is 16 bit.
Instructions consist of one up to six bytes and require no alignment. There is also no alignment requirement for sections and data.
The backend supports the unary operators < and > to either
select the size of an addressing mode or the LSB/MSB of a 16-bit word.
< enforces direct mode and > enforces extended mode.
< enforces an
8-bit offset, while > enforces a 16-bit offset.
< enforces an 8-bit offset and > enforces a 16-bit offset.
< selects
the LSB of a word and > selects the MSB.
When applying the operation /256, %256 or &256
on a label in immediate addressing modes or data constants,
an appropriate lo/hi-byte relocation will automatically be generated.
In absence of < or > vasm selects the best addressing
mode possible, i.e. the one which requires the least amount of memory
when the symbol value is known, or the one which allows the largest
symbol values, when it is unknown at assembly time.
This backend extends the selected syntax module by the following directives:
setdp <expr>
    Set the current base address of the direct page. It is used to decide whether an extended addressing mode can be optimized to direct addressing. No effect for 68HC12.
direct <symbol>
    Tell the assembler to use direct addressing for expressions based on this symbol.
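The interplay of the < and > operators, setdp and the automatic mode selection described above can be summarised in a short Python sketch. It assumes that the direct page is the 256-byte window starting at the setdp base address, and it uses hypothetical names. It is a model of the documented behaviour, not vasm's actual decision logic.

```python
def addressing_mode(address, dp_base, force=None):
    """Pick direct or extended addressing for an absolute operand.

    address  symbol value, or None when it is unknown at assembly time
    dp_base  base address set by setdp (assumed to start a 256-byte page)
    force    "<" to enforce direct mode, ">" to enforce extended mode
    """
    if force == "<":
        return "direct"
    if force == ">":
        return "extended"
    if address is None:
        return "extended"      # unknown value: allow the largest symbol values
    if dp_base <= address <= dp_base + 0xFF:
        return "direct"        # known value inside the direct page: shortest encoding
    return "extended"

print(addressing_mode(0x2010, 0x2000))   # direct
```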
This backend performs the following operand optimizations:
Short-branches (Bcc) are optionally translated into long-branches
(LBcc) when their destination is undefined, in a different
section or out of range. Note that there is no LBSR on the HC12.
Long-branches (LBcc) are optionally optimized into short-branches
(Bcc) when possible.
Optimization of JMP into BRA and JSR into
BSR when possible (same section, distance representable in 8 bits).
Translation of JMP into LBRA and JSR into
LBSR (-opt-pc).
Some known problems of this module at the moment:
MOVx instructions may be
wrong. Needs testing.
This module has the following error messages:
This chapter documents the backend for the Atari Jaguar GPU/DSP RISC processor.
This module is written in 2014-2017,2020,2021 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Output big-endian code and data (default).
Output little-endian code and data.
Generate code for GPU or DSP RISC. All instructions are accepted (default).
Generate code for the DSP RISC (part of Jerry).
Generate code for the GPU RISC (part of Tom).
This backend accepts RISC instructions for the GPU or DSP in Atari’s Jaguar custom chip set according to the "Jaguar Technical Reference Manual for Tom & Jerry", Revision 8. Documentation bugs were fixed by using various sources on the net.
The target address type is 32 bits.
Default alignment for instructions is 2 bytes. Data is aligned to its natural alignment by default.
This backend performs the following optimizations and translations for the GPU/DSP RISC instruction set:
load (Rn+0),Rm is optimized to load (Rn),Rm.
store Rn,(Rm+0) is optimized to store Rn,(Rm).
This backend extends the selected syntax module by the following directives (note that a leading dot is optional):
<symbol> ccdef <expression>
    Allows defining a symbol for the condition codes used in jump
    and jr instructions. Must be a constant number in the range of
    0 to 31 or another condition code symbol.
ccundef <symbol>
    Undefine a condition code symbol previously defined via ccdef.
dsp
    Select DSP instruction set.
<symbol> equr <Rn>
    Define a new symbol named <symbol> and assign the address register
    Rn to it. <Rn> may also be another register symbol.
    Note that a register symbol must be defined before it can be used.
equrundef <symbol>
    Undefine a register symbol previously defined via equr.
gpu
    Select GPU instruction set.
<symbol> regequ <Rn>
    Equivalent to equr.
regundef <symbol>
    Undefine a register symbol previously defined via regequ.
All directives may be optionally preceded by a dot (.), for
compatibility with various syntax modules.
Some known problems of this module at the moment:
The encoding of the MOVEI instruction in little-endian mode is unknown.
The programmer has to insert NOP instructions
after jumps, or OR instructions to work around hardware bugs,
her/himself.
This module has the following error messages:
This chapter documents the backend for the PDP-11 CPU architecture.
This module is written in 2020 by Frank Wille and is covered by the vasm copyright without modifications.
This module provides the following additional options:
Enables the Extended Instruction Set option (EIS).
Enables the Floating point Instruction Set option (FIS).
Enables additional memory space instructions.
Enables optimization of jmp instructions to br when
possible and translates br instructions to jmp when
required.
It will also translate conditional branches, where the destination
is out of range, into a jmp instruction and a negated
conditional branch over this jmp.
This backend accepts PDP-11 instructions as described in the PDP11/40 Processor Handbook, by Digital Equipment Corporation.
The target address type is 16 bit.
Instructions consist of two to six bytes and require 16-bit alignment. Data, when not accessed as single bytes, also requires 16-bit alignment.
Some known problems of this module at the moment:
This module has the following error messages:
This chapter documents the Backend for the TR3200 cpu.
This module is written in 2014 by Luis Panadero Guardeño and is covered by the vasm copyright without modifications.
This backend accepts TR3200 instructions as described in the TR3200 specification.
The target address type is 32 bits.
Default alignment for sections is 4 bytes. Instruction alignment is 4 bytes. Data is aligned to its natural alignment by default, i.e. 2-byte wide data is aligned to 2 bytes and 4-byte wide data is aligned to 4 bytes.
The backend uses TR3200 syntax! This means the left operand is always the
destination and the right operand is the source (except for single-operand
instructions). Register names have to be prefixed by a ’%’
(%bp, %r0, etc.).
The backend should therefore accept WaveAsm assembly files when the oldstyle syntax module
is used, provided that the instructions are in lowercase, the -dotdir option is given
and directives do not start in the first column.
Predefined register symbols in this backend:
r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15
bp sp y ia flags
Some known problems of this module at the moment:
This module has the following error messages:
The following small example illustrates TR3200 assembly using the oldstyle syntax module (option -dotdir required):
const   .equ 0xBEBACAFE         ; A constant
an_addr .equ 0x100              ; Another constant

        ; ROM code
        .org 0x100000
        .text
_start                          ; Label with or without an ending ":"
        mov %sp, 0x1000         ; Set the initial stack
        mov %r0, 0
        mov %r1, 0xA5
        mov %r2, 0
        storeb %r0, an_addr, %r1
        add %r0, %r2, %bp
        add %r0, %r2, 0
        add %r0, %r0, 10
        add %r0, %r0, 10h
        add %r0, %r0, 0x100010
        add %r0, %r0, (256 + 100) ; vasm parses math expressions
        mov %r2, 0
        mov %r3, 0x10200
        loadb %r6, 0x100200
        loadb %r1, %r2, 0x100200
        loadb %r1, %r2, %r3
        loadb %r4, var1
        push %r0
        .repeat 2               ; directives to repeat stuff!
        push const
        .endrepeat
        .repeat 2
        pop %r5
        .endrepeat
        pop %r0
        rcall foo               ; Relative call/jump!
        sleep
foo:                            ; Subroutine
        ifneq %r5, 0
        mul %r5, %r5, 2
        sub %r5, %r5, 1
        ret

        ; ROM data
        .org 0x100500
var1    .db 0x20                ; A byte size variable
        .even                   ; Enforce alignment to an even address
var3    .dw 0x1020              ; A word size variable
var4    .dd 0x0A0B0C20          ; A double word size variable
str1    .asciiz "Hello world!"  ; ASCII string with null termination
str2    .string "Hello world!"  ; ASCII string with null termination
        .fill 5, 0xFF           ; Fill 5 bytes with 0xFF
        .reserve 5              ; Reserves space for 5 bytes
This chapter is under construction!
This chapter describes some of the internals of vasm
and tries to explain
what has to be done to write a cpu module, a syntax module
or an output module for vasm.
However, if you want to write one, I suggest contacting me first,
so that it can be integrated into the source tree.
Note that this documentation may mention explicit values when introducing symbolic constants. This is due to copying and pasting from the source code. These values may not be up to date and in some cases can be overridden. Therefore, never use the absolute values; always use the symbolic representations.
This section deals with the steps necessary to build the typical
vasm executable from the sources.
The vasm-directory contains the following important files and directories:
The main directory containing the assembler sources.
The Makefile used to build vasm.
Directories for the syntax modules.
Directories for the cpu modules.
Directory the object modules will be stored in.
All compiling is done from the main directory and
the executables will be placed there as well.
The main assembler for a combination of <cpu> and
<syntax> will be called vasm<cpu>_<syntax>.
All output modules are usually integrated in every executable
and can be selected at runtime. Otherwise you have to adapt
the OUTFMTS definition in make.rules and select
those you want.
Before building anything you have to insert correct values for your compiler and operating system in the Makefile.
TARGET
    Here you may define an extension which is appended to the executable’s name. Useful if you build various targets in the same directory.
TARGETEXTENSION
    Defines the file name extension for executable files. Not needed for most operating systems. For Windows it would be .exe.
CC
    Here you have to insert a command that invokes the ANSI C compiler you want to use to build vasm. It must support the -I option in the same way as, for example, vc or gcc.
COPTS
    Here you will usually define an option like -c to instruct the compiler to generate an object file. Additional options, like the optimization level, should also be inserted here. Specifying the host OS helps to determine work-directories for DWARF and defines the appropriate internal symbol for the host’s file system path style. The following are supported:
    -DAMIGA
        AmigaOS (M68k or PPC), MorphOS, AROS. Defines the internal symbol __AMIGAFS.
    -DATARI
        Atari TOS. Defines the internal symbol __MSDOSFS.
    -DMSDOS
        CP/M, MS-DOS, Windows. Defines the internal symbol __MSDOSFS.
    -DUNIX
        All kinds of Unix (Linux, BSD) including MacOSX and Atari-MiNT. Defines the internal symbol __UNIXFS.
    -D_WIN32
        Windows. Defines the internal symbol __MSDOSFS.
    Building without specifying a host-OS is allowed. Then vasm defaults to Unix-style path handling and will not define a file system symbol for conditional assembly. Other options:
    -DLOWMEM
        Builds for a host-OS with a low amount of memory. This will basically reduce all hash tables to minimal size.
CCOUT
    Here you define the option which is used to specify the name of an output file, which is usually -o.
LD
    Here you insert a command which starts the linker. This may be the same as under CC.
LDFLAGS
    Here you have to add options which are necessary for linking. E.g. some compilers need special libraries for floating-point.
LDOUT
    Here you define the option which is used by the linker to specify the output file name.
RM
    Specify a command to delete a file, e.g. rm -f.
An example for the Amiga using vbcc would be:
TARGET = _os3
TARGETEXTENSION =
CC = vc +aos68k
CCOUT = -o
COPTS = -c -c99 -cpu=68020 -DAMIGA -O1
LD = $(CC)
LDOUT = $(CCOUT)
LDFLAGS = -lmieee
RM = delete force quiet
An example for a typical Unix-installation would be:
TARGET =
TARGETEXTENSION =
CC = gcc
CCOUT = -o
COPTS = -c -O2
LD = $(CC)
LDOUT = $(CCOUT)
LDFLAGS = -lm
RM = rm -f
Open/Net/Free/Any BSD systems will probably require an additional
-D_ANSI_SOURCE in COPTS.
Note to users of BSD systems: You will probably have to use GNU make instead of BSD make, i.e. in the following examples replace "make" with "gmake".
Type:
make CPU=<cpu> SYNTAX=<syntax>
For example:
make CPU=ppc SYNTAX=std
The following CPU modules can be selected:
CPU=6502
CPU=6800
CPU=6809
CPU=arm
CPU=c16x
CPU=jagrisc
CPU=m68k
CPU=pdp11
CPU=ppc
CPU=qnice
CPU=test
CPU=tr3200
CPU=vidcore
CPU=x86
CPU=z80
The following syntax modules can be selected:
SYNTAX=std
SYNTAX=mot
SYNTAX=madmac
SYNTAX=oldstyle
SYNTAX=test
For Windows and various Amiga targets there are already Makefiles included,
which you may either copy on top of the default Makefile, or select
explicitly with make’s -f option:
make -f Makefile.OS4 CPU=ppc SYNTAX=std
Important global variables, which may be read or modified by syntax-, cpu- or output-modules.
char *inname;
    Input source file name.
char *outname;
    Output file name.
int exec_out;
    Non-zero, when the output file is an executable and not an object file.
source *cur_src;
    Pointer to the current source text instance (see structures below).
char *defsectname;
    Name of a default section which vasm creates when a label or code occurs in the source without any preceding section or org directive. Assigning NULL means that the default is an absolute section and its base address is taken from defsectorg.
char *defsecttype;
    Attributes of the default section (see above). May be NULL to indicate that no default has been defined and vasm will show an error.
taddr defsectorg;
    Used when defsectname==NULL. Defines the base address of a default absolute org section.
char emptystr[];
    An empty string (zero length).
This section describes the fundamental data structures used in vasm which are usually necessary to understand for writing any kind of module (cpu, syntax or output). More detailed information is given in the respective sections on writing specific modules where necessary.
A source structure represents a source text module, which can be either the main source text, an included file or a macro. There is always a link to the parent source from where the current source context was included or called.
struct source *parent;Pointer to the parent source context. Assembly continues there when the current source context ends.
int parent_line;Line number in the parent source context, from where we were called. This information is needed, because line numbers are only reliable during parsing and later from the atoms. But an include directive doesn’t create an atom.
struct source_file *srcfile;The source_file structure has the unique file name, index
and text-pointer for this source text instance.
Used for debugging output, like DWARF.
char *name;File name of the main source or include file, or macro name.
char *text;Pointer to the source text start.
size_t size;Size of the source text to assemble in bytes.
struct source *defsrc;This is a NULL-pointer for real source text files. Otherwise
it is a reference to the source which defines the current macro
or repetition.
int defline;Valid when defsrc is not NULL. Contains the starting
line number of a macro or repetition in a source text file.
macro *macro;Pointer to macro structure, when currently inside a macro
(see also num_params).
unsigned long repeat;Number of repetitions of this source text. Usually this is 1, but
for text blocks between a rept and endr directive
it allows any number of repetitions, which is decremented every time
the end of this source text block is reached.
char *irpname;Name of the iterator symbol in special repeat loops which use a
sequence of arbitrary values, being assigned to this symbol within
the loop. Example: irp directive in std-syntax.
struct macarg *irpvals;A list of arbitrary values to iterate over in a loop. With each iteration the frontmost value is removed from the list until it is empty.
int cond_level;Current level of conditional nesting while entering this source
text. It is automatically restored to the previous level when
leaving the source prematurely through end_source().
struct macarg *argnames;The current list of named macro arguments.
int num_params;Number of macro parameters passed at the invocation point from the parent source. For normal source files this entry will be -1. For macros 0 (no parameters) or higher.
char *param[MAXMACPARAMS];Pointer to the macro parameters.
int param_len[MAXMACPARAMS];Number of characters per macro parameter.
int num_quals;(If MAX_QUALIFIERS!=0.) Number of qualifiers for a macro.
When not passed on invocation, these are the default qualifiers.
char *qual[MAX_QUALIFIERS];(If MAX_QUALIFIERS!=0.) Pointer to macro qualifiers.
int qual_len[MAX_QUALIFIERS];(If MAX_QUALIFIERS!=0.) Number of characters per macro qualifier.
unsigned long id;Every source has its unique id. Useful for macros supporting
the special \@ parameter.
char *srcptr;The current source text pointer, pointing to the beginning of the next line to assemble.
int line;Line number in the current source context. After parsing the line number of the current atom is stored here.
size_t bufsize;Current size of the line buffer (linebuf). The size of the
line buffer is extended automatically, when an overflow happens.
char *linebuf;A buffer for the current line being assembled in this source text. A child-source, like a macro, can refer to arguments from this buffer, so every source has got its own. When returning to the parent source, the linebuf is deallocated to save memory.
expr *cargexp;(If CARGSYM was defined.) Pointer to the current expression
assigned to the CARG-symbol (used to select a macro argument) in
this source instance. So it can be restored when reentering this
instance.
long reptn;(If REPTNSYM was defined.) Current value of the repetition
counter symbol in this source instance. So it can be restored when
reentering this instance.
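How the parent links and parent_line fit together can be illustrated with a small C sketch. It uses a reduced stand-in for the source structure with only a few of the members described above (name is declared const here so that string literals can be used), and print_source_trace() is a hypothetical helper, not a vasm function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced stand-in for vasm's source structure: only the members
   needed to walk from a nested context back to the main source. */
struct source {
    struct source *parent;   /* including/calling context, NULL for main */
    int parent_line;         /* line in parent from where we were called */
    const char *name;        /* file name or macro name */
    int num_params;          /* -1 for files, >= 0 for macros */
    int line;                /* current line in this context */
};

/* Hypothetical helper: print the current position, innermost context
   first, and return the nesting depth (1 for the main source). */
int print_source_trace(const struct source *src, FILE *out)
{
    int depth = 0;
    int line;

    assert(src != NULL);
    line = src->line;
    while (src != NULL) {
        fprintf(out, "%s \"%s\", line %d\n",
                src->num_params >= 0 ? "macro" : "file", src->name, line);
        line = src->parent_line;   /* position inside the parent */
        src = src->parent;
        depth++;
    }
    return depth;
}
```

For a macro called from an include file, which in turn was included by the main source, the sketch prints three lines and returns 3.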
One of the top level structures is a linked list of sections describing
continuous blocks of memory. A section is specified by an object of
type section with the following members that can be accessed by
the modules:
struct section *next;A pointer to the next section in the list.
char *name;The name of the section.
char *attr;A string describing the section flags in ELF notation (see,
for example, the documentation of the .section directive of
the standard syntax module).
atom *first;atom *last;Pointers to the first and last atom of the section. See following sections for information on atoms.
taddr align;Alignment of the section in bytes.
uint32_t flags;Flags of the section. Currently available flags are:
HAS_SYMBOLSAt least one symbol is defined in this section.
RESOLVE_WARNThe current atom changed its size multiple times, so atom_size()
is now called with this flag set in its section to make the
backend (e.g. instruction_size()) aware of it and do less
aggressive optimizations.
UNALLOCATEDSection is unallocated, which means it doesn’t use any memory space
in the output file. Such a section will be removed before creating
the output file and all its labels converted into absolute expression
symbols. Used for "offset" sections. Refer to
switch_offset_section().
LABELS_ARE_LOCALAs long as this flag is set new labels in a section are defined as local labels, with the section name as global parent label.
ABSOLUTESection is loaded at an absolute address in memory.
PREVABSRemembers state of the ABSOLUTE flag before entering
relocated-org mode (IN_RORG). So it can be restored later.
IN_RORGSection has entered relocated-org mode, which also sets the
ABSOLUTE flag. In this mode code is written into the current
section, but relocated to an absolute address. No relocation
information is generated.
NEAR_ADDRESSINGSection is marked as suitable for cpu-specific "near" addressing modes. For example, base-register relative or zero/direct-page. The cpu backend can use this information as an optimization hint when referencing symbols from this section.
FAR_ADDRESSINGSection requires cpu-specific "far" addressing modes. For example an addressing mode including the bank or "segment". The cpu backend may use this information to select appropriate addressing modes when referencing symbols from this section.
taddr org;Start address of a section. Usually zero.
taddr pc;Current address in this section. Can be used
while traversing through the section. Has to be updated by a
module using it. Is set to org at the beginning.
unsigned long idx;A member usable by the output module for private purposes.
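As an illustration, an output module might walk the section list as in the following C sketch. The structure is a reduced stand-in, the flag value is a placeholder chosen for this sketch only, and number_sections() is a hypothetical helper; real modules must use the definitions from the vasm sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef long taddr;          /* stand-in for the target address type */

/* Placeholder value for this sketch; use the symbolic name only. */
#define ABSOLUTE (1u << 4)

/* Reduced stand-in for vasm's section structure. */
struct section {
    struct section *next;
    const char *name;
    uint32_t flags;
    taddr org;
    taddr pc;
    unsigned long idx;
};

/* Hypothetical output-module helper: give every section a private
   index in idx, reset pc to org for a later traversal, and report
   whether any section is absolute. Returns the number of sections. */
unsigned long number_sections(struct section *first, int *has_absolute)
{
    unsigned long n = 0;
    struct section *sec;

    assert(has_absolute != NULL);
    *has_absolute = 0;
    for (sec = first; sec != NULL; sec = sec->next) {
        sec->pc = sec->org;            /* traversal starts at org */
        sec->idx = n++;                /* private to the output module */
        if (sec->flags & ABSOLUTE)
            *has_absolute = 1;
    }
    return n;
}
```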
Symbols are represented by a linked list of type symbol with the
following members that can be accessed by the modules:
int type;Type of the symbol. Available are:
#define LABSYM 1The symbol is a label defined at a specific location.
#define IMPORT 2The symbol is externally defined.
#define EXPRESSION 3The symbol is defined using an expression.
uint32_t flags;Flags of this symbol. Available are:
#define TYPE_UNKNOWN 0The symbol has no type information.
#define TYPE_OBJECT 1The symbol defines an object.
#define TYPE_FUNCTION 2The symbol defines a function.
#define TYPE_SECTION 3The symbol defines a section.
#define TYPE_FILE 4The symbol defines a file.
#define EXPORT (1<<3)The symbol is exported to other object files.
#define INEVAL (1<<4)Used internally.
#define COMMON (1<<5)The symbol is a common symbol.
#define WEAK (1<<6)The symbol is weak, which means the linker may overwrite it with any global definition of the same name. Weak symbols may also stay undefined, in which case the linker would assign them a value of zero.
#define LOCAL (1<<7)Only informational. A symbol can be explicitly declared as local by a syntax-module directive.
#define VASMINTERN (1<<8)Vasm-internal symbol, which is usually not exported into an object file.
#define PROTECTED (1<<9)Used internally to protect the current-PC symbol from deletion.
#define REFERENCED (1<<10)Symbol was referenced in the source and a relocation entry has been created.
#define ABSLABEL (1<<11)Label was defined inside an absolute section, or during relocated-org mode. So it has an absolute address and will not generate a relocation entry when being referenced.
#define EQUATE (1<<12)Symbols flagged as EQUATE are constant and their values must
not be changed.
#define REGLIST (1<<13)Symbol is a register list definition.
#define USED (1<<14)Symbol appeared in an expression. Symbols which were only defined, (as label or equate) and never used throughout the whole source, don’t get this flag set.
#define NEAR (1<<15)Symbol may be referenced by "near" addressing mode. For example, base register relative. Used as an optimization hint in the cpu backend.
#define XDEF (1<<16)This symbol must become defined in the source. Which means
its type must not remain IMPORT. Otherwise a
warning is displayed.
#define XREF (1<<17)Symbol is externally defined and its type must never
become anything other than IMPORT. Otherwise an error
is displayed.
#define RSRVD_S (1L<<24)The range from bit 24 to 27 (counted from the LSB) is reserved for use by the syntax module.
#define RSRVD_O (1L<<28)The range from bit 28 to 31 (counted from the LSB) is reserved for use by the output module.
The type-flags can be extracted using the TYPE() macro which
expects a pointer to a symbol as argument.
char *name;The name of the symbol.
expr *expr;The expression in case of EXPRESSION symbols.
expr *size;The size of the symbol, if specified.
section *sec;The section a LABSYM symbol is defined in.
taddr pc;The address of a LABSYM symbol.
taddr align;The alignment of the symbol in bytes.
unsigned long idx;A member usable by the output module for private purposes.
Optional register symbols are available when the backend defines
HAVE_REGSYMS in cpu.h together with the hash table size.
Example:
#define HAVE_REGSYMS
#define REGSYMHTSIZE 256
A register symbol is defined by an object of type regsym
with the following members that can be accessed by the modules:
char *reg_name;Symbol name.
int reg_type;Optional type of register.
unsigned int reg_flags;Optional register symbol flags.
unsigned int reg_num;Register number or value.
Refer to symbol.h for functions to create and find register symbols.
The contents of each section are a linked list built out of non-separable atoms. The general structure of an atom is:
struct atom {
  struct atom *next;
  int type;
  taddr align;
  taddr lastsize;
  unsigned changes;
  source *src;
  int line;
  listing *list;
  union {
    instruction *inst;
    dblock *db;
    symbol *label;
    sblock *sb;
    defblock *defb;
    void *opts;
    int srcline;
    char *ptext;
    printexpr *pexpr;
    expr *roffs;
    taddr *rorg;
    assertion *assert;
    aoutnlist *nlist;
  } content;
};
The members have the following meaning:
struct atom *next;Pointer to the following atom (NULL if last).
int type;The type of the atom. Can be one of
#define VASMDEBUG 0Used for internal debugging.
#define LABEL 1A label is defined here.
#define DATA 2Some data bytes of fixed length and constant data are put here.
#define INSTRUCTION 3Generally refers to a machine instruction or pseudo-opcode. These atoms
can change length during optimization passes and will be translated to
DATA-atoms later.
#define SPACE 4Defines a block of data filled with one value of a given size (up to
MAXBYTES bytes). BSS sections usually contain only such atoms,
but they are also sometimes useful as shorter versions of
DATA-atoms in other sections.
#define DATADEF 5Defines data of fixed size which can contain cpu specific operands and
expressions. Usually generated by data in a source text, which are no
machine instructions. Will be translated to DATA-atoms later.
#define LINE 6A source text line number (usually from a high level language) is bound to the atom’s address. Useful for source level debugging in certain ABIs.
#define OPTS 7A means to change assembler options at a specific source text line.
For example optimization settings, or the cpu type to generate code for.
The cpu backend has to define HAVE_CPU_OPTS and export the required
functions if it wants to use this type of atom.
#define PRINTTEXT 8A string is printed to stdout during the final assembler pass. A newline is automatically appended.
#define PRINTEXPR 9Prints the value of an expression during the final assembler pass to stdout.
#define ROFFS 10Set the program counter to an address relative to the section’s start
address. These atoms will be translated into SPACE atoms in the
final pass.
#define RORG 11Assemble this block under the given base address, while the code is still written into the original memory region.
#define RORGEND 12Ends a RORG block and returns to the original addressing.
#define ASSERT 13The assertion expression is checked in the final pass and an error message is generated (using the expression string and an optional message out of this atom) when it evaluates to 0.
#define NLIST 14Defines a stab-entry for the a.out object file format. nlist-style stabs can also occur embedded in other object file formats, like ELF.
taddr align;The alignment of this atom. Address must be divisible by align.
taddr lastsize;The size of this atom in the last resolver pass. When the size has changed in the current pass, the assembler will request another resolver run through the section.
unsigned changes;Number of changes in the size of this atom since pass number
FASTOPTPHASE. An increasing number usually indicates a problem in
the cpu backend’s optimizer and will be flagged by setting
RESOLVE_WARN in the section flags, as soon as changes exceeds
MAXSIZECHANGES. So the backend can choose not to optimize this atom
as aggressively as before.
source *src;Pointer to the source text object to which this atom belongs.
int line;The source line number that created this atom.
listing *list;Pointer to the listing file object to which this atom belongs.
instruction *inst;(In union content.) Pointer to an instruction structure in the case
of an INSTRUCTION-atom. Contains the following elements:
int code;The cpu specific code of this instruction.
char *qualifiers[MAX_QUALIFIERS];(If MAX_QUALIFIERS!=0.) Pointer to the qualifiers of this instruction.
operand *op[MAX_OPERANDS];(If MAX_OPERANDS!=0.) The cpu-specific operands of this instruction.
instruction_ext ext;(If the cpu backend defines HAVE_INSTRUCTION_EXTENSION.)
A cpu-specific structure. Typically used to store appropriate
opcodes, allowed addressing modes, supported cpu derivatives etc.
dblock *db;(In union content.) Pointer to a dblock structure in the case
of a DATA-atom. Contains the following elements:
taddr size;The number of bytes stored in this atom.
char *data;A pointer to the data.
rlist *relocs;A pointer to relocation information for the data.
symbol *label;(In union content.) Pointer to a symbol structure in the case
of a LABEL-atom.
sblock *sb;(In union content.) Pointer to a sblock structure in the case
of a SPACE-atom. Contains the following elements:
size_t space;The number of space-elements (see below) to generate here.
expr *space_exp;The above size as an expression, which will be evaluated during assembly
and copied to space in the final pass.
size_t size;The size of each space-element and of the fill-pattern in bytes.
unsigned char fill[MAXBYTES];The fill pattern, up to MAXBYTES bytes.
expr *fill_exp;Optional. Evaluated and copied to fill in the final pass, when not null.
rlist *relocs;A pointer to relocation information for the space.
taddr maxalignbytes;An optional number of maximum padding bytes to fulfil the atom’s alignment requirement. Zero means there is no restriction.
uint32_t flags;Flags of this space atom. Available flags are:
SPC_UNINITIALIZEDThis space is completely uninitialized. May be used as a hint by output modules.
SPC_DATABSSThe output module should not allocate any file space for this atom, when possible (example: DataBss section, as supported by the "hunkexe" output file format). It is not needed to set this flag when the output section is BSS.
defblock *defb;(In union content.) Pointer to a defblock structure in the case
of a DATADEF-atom. Contains the following elements:
taddr bitsize;The size of the definition in bits.
operand *op;Pointer to a cpu-specific operand structure.
void *opts;(In union content.) Points to a cpu-backend specific options object
in the case of an OPTS-atom.
int srcline;(In union content.) Line number for source level debugging in the
case of a LINE-atom.
char *ptext;(In union content.) A string to print to stdout in case of a
PRINTTEXT-atom.
printexpr *pexpr;(In union content.) Pointer to a printexpr structure in the case of
a PRINTEXPR-atom. Contains the following elements:
expr *print_exp;Pointer to an expression to evaluate and print.
short type;Format type of the printed value. We can print as hexadecimal
(PEXP_HEX), signed decimal (PEXP_SDEC),
unsigned decimal (PEXP_UDEC), binary (PEXP_BIN) or
ASCII (PEXP_ASC).
short size;Size (precision) of the printed value in bits. Excessive bits will be masked out, and sign-extended when requested.
expr *roffs;(In union content.) The expression holds the relative section offset
to align to in case of a ROFFS-atom.
taddr *rorg;(In union content.) Assemble the code under the base address in
rorg in case of a RORG-atom.
assertion *assert;(In union content.) Pointer to an assertion structure in the case of
an ASSERT-atom. Contains the following elements:
expr *assert_exp;Pointer to an expression which should evaluate to non-zero.
char *exprstr;Pointer to the expression as text (to be used in the output).
char *msgstr;Pointer to the message, which would be printed when assert_exp evaluates
to zero.
aoutnlist *nlist;(In union content.) Pointer to an nlist structure, describing an
aout stab entry, in case of an NLIST-atom. Contains the following
elements:
char *name;Name of the stab symbol.
int type;Symbol type. Refer to stabs.h for definitions.
int other;Defines the nature of the symbol (function, object, etc.).
int desc;Debugger information.
expr *value;Symbol’s value.
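The role of the align member when addresses are assigned can be shown with a short C sketch. The atom structure below is a reduced stand-in with a hypothetical size member. In vasm the size of an atom is derived from its type and content, so this only illustrates the address arithmetic.

```c
#include <assert.h>
#include <stddef.h>

typedef long taddr;          /* stand-in for the target address type */

/* Reduced stand-in for an atom: alignment plus a fixed size. */
struct atom {
    struct atom *next;
    taddr align;     /* address must be divisible by align */
    taddr size;      /* hypothetical: size of this atom in bytes */
};

/* Round pc up to the next multiple of align (pc >= 0, align >= 1). */
taddr align_up(taddr pc, taddr align)
{
    assert(pc >= 0 && align >= 1);
    return (pc + align - 1) / align * align;
}

/* Walk the atom list starting at address org and return the address
   following the last atom. */
taddr end_address(const struct atom *first, taddr org)
{
    taddr pc = org;
    const struct atom *a;

    for (a = first; a != NULL; a = a->next) {
        pc = align_up(pc, a->align);   /* padding before the atom */
        pc += a->size;
    }
    return pc;
}
```

With a 1-byte, a 2-byte and a 4-byte atom, each aligned to its own size and starting at address 0, the sketch yields an end address of 8, because padding is inserted before the second and third atom.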
DATA and SPACE atoms can have a relocation list attached
that describes how this data must be modified when linking/relocating.
They always refer to the data in this atom only.
There are a number of predefined standard relocations and it is possible to add other cpu-specific relocations. Note however, that it is always preferable to use standard relocations, if possible. Chances that an output module supports a certain relocation are much higher if it is a standard relocation.
A relocation list uses this structure:
typedef struct rlist {
  struct rlist *next;
  void *reloc;
  int type;
} rlist;
The member type identifies the relocation type. All the standard relocations have
type numbers between FIRST_STANDARD_RELOC and
LAST_STANDARD_RELOC. Consult reloc.h to see which
standard relocations are available.
The detailed information can be accessed
via the pointer reloc. It will point to a structure that depends
on the relocation type, so a module must only use it if it knows the
relocation type.
All standard relocations point to a type nreloc structure
with the following members:
size_t byteoffset;Offset in bytes, from the start of the current DATA atom, to the
beginning of the relocation field. This may also be the address which is
used as a basis for PC-relative relocations. Or a common basis for several
separated relocation fields, which will be translated into a single
relocation type by the output module.
size_t bitoffset;Offset in bits to the beginning of the relocation field, adds to
byteoffset*bitsperbyte. Bits are counted in a bit-stream from lower
to higher address bytes. But note, that inside a little-endian byte they
are counted from the LSB to the MSB, while they are counted from the MSB to
the LSB for big-endian targets.
int size;The size of the relocation field in bits.
taddr mask;The mask defines which portion of the relocated value is set by this relocation field.
taddr addend;Value to be added to the symbol value.
symbol *sym;The symbol referred to by this relocation.
To describe the meaning of these entries, we will define the steps that shall be executed when performing a relocation:
1. Extract size bits from the data atom, starting with bit number byteoffset*bitsperbyte+bitoffset. We start counting bits from the lowest to the highest numbered byte in memory. Inside a big-endian byte we count from the MSB to the LSB. Inside a little-endian byte we count from the LSB to the MSB.
2. Determine the relocation value of the symbol. For a simple absolute relocation, this is the value of the symbol sym plus the addend. For other relocation types, more complex calculations will be needed. For example, in a program-counter relative relocation, the value will be obtained by subtracting the address of the data atom plus byteoffset from the value of sym plus addend.
3. Calculate the bit-wise "and" of the value obtained in the step above and the mask value.
4. Normalize, i.e. shift the value above right by as many bit positions as there are low-order zero bits in mask.
5. Add this value to the value extracted in step 1.
6. Insert the low-order size bits of this value into the data atom, starting with bit byteoffset*bitsperbyte+bitoffset.
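A minimal C sketch of this procedure, restricted to a simple absolute relocation, may help to see how the nreloc members interact. It assumes 8 bits per byte and big-endian bit counting (MSB first). Shifting the masked value down by the low-order zero bits of mask and adding the previous contents of the field are this sketch's reading of the steps. All function names are hypothetical and not part of vasm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read 'size' bits starting at bit position 'pos', counted big-endian
   style: bit 0 is the MSB of byte 0. */
static uint64_t get_bits(const unsigned char *data, size_t pos, int size)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < size; i++, pos++)
        v = (v << 1) | ((data[pos / 8] >> (7 - pos % 8)) & 1);
    return v;
}

/* Write the low-order 'size' bits of v at bit position 'pos'. */
static void put_bits(unsigned char *data, size_t pos, int size, uint64_t v)
{
    int i;

    for (i = size - 1; i >= 0; i--, pos++) {
        unsigned char bit = (unsigned char)(0x80 >> (pos % 8));
        if ((v >> i) & 1)
            data[pos / 8] |= bit;
        else
            data[pos / 8] &= (unsigned char)~bit;
    }
}

/* Apply a simple absolute relocation to the bytes of a data atom. */
void apply_abs_reloc(unsigned char *data, size_t byteoffset,
                     size_t bitoffset, int size, uint64_t mask,
                     uint64_t symval, uint64_t addend)
{
    size_t pos = byteoffset * 8 + bitoffset;
    uint64_t field, value;

    assert(size > 0 && size <= 64 && mask != 0);
    field = get_bits(data, pos, size);   /* extract the field */
    value = symval + addend;             /* absolute relocation value */
    value &= mask;                       /* keep the selected portion */
    while (!(mask & 1)) {                /* normalize by the mask's */
        mask >>= 1;                      /* low-order zero bits */
        value >>= 1;
    }
    value += field;                      /* add the old field contents */
    put_bits(data, pos, size, value);    /* insert low-order size bits */
}
```

With mask 0xFFFF0000 and size 16, for instance, the sketch stores the upper half of a 32-bit symbol value in a 16-bit field.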
Each module can provide a list of possible error messages contained
e.g. in syntax_errors.h or cpu_errors.h. They are a
comma-separated list of a printf-format string and error flags. Allowed
flags are WARNING, ERROR, FATAL, MESSAGE and
NOLINE.
They can be combined using or (|). NOLINE has to be set for
error messages during initialization or while writing the output, when
no source text is available. Errors cause the assembler to return false.
FATAL causes the assembler to terminate
immediately.
The errors can be emitted using the function syntax_error(int n,...),
cpu_error(int n,...) or output_error(int n,...). The first
argument is the number of the error message (starting from zero). Additional
arguments must be passed according to the format string of the
corresponding error message.
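The following C sketch illustrates the idea of such a message table and its emitter. The flag values, the message texts and all names prefixed with sketch_ are made up for this illustration. The real definitions live in the vasm sources and in each module's error header.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Placeholder flag values for this sketch only. */
#define WARNING 1
#define ERROR   2
#define FATAL   4
#define MESSAGE 8
#define NOLINE  16

struct sketch_msg {
    const char *text;   /* printf-format string */
    int flags;          /* error flags, combined with | */
};

/* Hypothetical message table in the style of cpu_errors.h. */
const struct sketch_msg sketch_errors[] = {
    { "illegal operand type", ERROR },                  /* 0 */
    { "branch destination out of range: %ld", ERROR },  /* 1 */
    { "instruction was optimized to %s", MESSAGE },     /* 2 */
    { "could not write output file", FATAL | NOLINE },  /* 3 */
};

int sketch_error_count;

/* Hypothetical emitter, modelled on cpu_error(int n,...). Returns the
   flags of the message. A real assembler would terminate on FATAL. */
int sketch_error(int n, ...)
{
    va_list ap;

    assert(n >= 0 &&
           n < (int)(sizeof sketch_errors / sizeof sketch_errors[0]));
    va_start(ap, n);
    vfprintf(stderr, sketch_errors[n].text, ap);
    va_end(ap);
    fputc('\n', stderr);
    if (sketch_errors[n].flags & (ERROR | FATAL))
        sketch_error_count++;
    return sketch_errors[n].flags;
}
```

A call like sketch_error(1, 300L) selects message number 1 and passes the argument required by its format string.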
A new syntax module must have its own subdirectory under vasm/syntax. At least the files syntax.h, syntax.c and syntax_errors.h must be written.
#define ISIDSTART(x)/ISIDCHAR(x)These macros should return non-zero if and only if the argument is a
valid character to start an identifier or a valid character inside an
identifier, respectively.
ISIDCHAR must be a superset of ISIDSTART.
#define ISBADID(p,l)Even with ISIDSTART and ISIDCHAR checked, there may be
combinations of characters which do not form a valid identifier (for
example, a single character). This macro returns non-zero, when this is
the case. First argument is a pointer to the new identifier and second
is its length.
#define ISEOL(x)This macro returns true when the string pointing at x is either
a comment character or end-of-line.
#define CHKIDEND(s,e) chkidend((s),(e))Defines an optional function to be called at the end of the identifier
recognition process. It allows you to adjust the length of the identifier
by returning a modified e. Default is to return e. The
function is defined as char *chkidend(char *startpos,char *endpos).
#define BOOLEAN(x) -(x)Defines the result of boolean operations. Usually this is (x), as
in C, or -(x) to return -1 for True.
#define NARGSYM "NARG"Defines the name of an optional symbol which contains the number of arguments in a macro.
#define CARGSYM "CARG"Defines the name of an optional symbol which can be used to select a
specific macro argument with \., \+ and \-.
#define REPTNSYM "REPTN"Defines the name of an optional symbol containing the counter of the current repeat iteration.
#define EXPSKIP() s=exp_skip(s)Defines an optional replacement for skip() to be used in expr.c, to skip
blanks in an expression. Useful to forbid blanks in an expression and to
ignore the rest of the line (e.g. to treat the rest as comment). The
function is defined as char *exp_skip(char *stream).
#define IGNORE_FIRST_EXTRA_OP 1
Should be defined as non-zero (true) when the syntax module wants to ignore the operand field on instructions without an operand. Useful, when everything following an operand should be regarded as comment, without a comment character.
#define MAXMACPARAMS 35
Optionally defines the maximum number of macro arguments, if you need more than the default number of 9.
#define SKIP_MACRO_ARGNAME(p) skip_identifier(p)
An optional function to skip a named macro argument in the macro definition. Argument is the current source stream pointer. The default is to skip an identifier.
#define MACRO_ARG_OPTS(m,n,a,p) NULL
An optional function to parse and skip options, default values and
qualifiers for each macro argument. Returns NULL when no argument
options have been found.
Arguments are:
struct macro *m; Pointer to the macro structure being currently defined.
int n; Argument index, starting with zero.
char *a; Name of this argument.
char *p; Current source stream pointer. An updated pointer will be returned.
Defaults to unused.
#define MACRO_ARG_SEP(p) (*p==',' ? skip(p+1) : NULL)
An optional function to skip a separator between the macro argument names in the macro definition. Returns NULL when no valid separator is found. Argument is the current source stream pointer. Defaults to using comma as the only valid separator.
#define MACRO_PARAM_SEP(p) (*p==',' ? skip(p+1) : NULL)
An optional function to skip a separator between the macro parameters in a macro call. Returns NULL when no valid separator is found. Argument is the current source stream pointer. Defaults to using comma as the only valid separator.
#define EXEC_MACRO(s)
An optional function to be called just before a macro starts execution.
Parameters and qualifiers are already parsed.
Argument is the source pointer of the new macro.
Defaults to unused.
A syntax module has to provide the following elements (all other functions
should be static to prevent name clashes):
const char *syntax_copyright;
A string that will be emitted as part of the copyright message.
hashtable *dirhash;
A pointer to the hash table with all directives.
char commentchar;
A character used to introduce a comment until the end of the line.
int dotdirs;
Define dotdirs as non-zero, when the syntax module works with
directives starting with a dot (.).
int init_syntax(void);
Will be called during startup, after argument parsing. Must return zero if initializations failed, non-zero otherwise.
int syntax_args(char *);
This function will be called with the command line arguments (unless they were already recognized by other modules). If an argument was recognized, return non-zero.
int syntax_defsect(void);
Lets the syntax module define a default section, which is used when no
section was created by any section or org directive before
the first code or data is defined.
May set defsectname, defsecttype and defsectorg
accordingly and return with non-zero. Or return with zero and accept
the defaults, which are: defsectname=".text" and
defsecttype="acrx".
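A syntax_defsect() that overrides the default section could be sketched like this. The three globals are declared locally as stand-ins so that the fragment compiles on its own; in a real syntax module they are provided by the frontend, and their exact types as well as the section name ".code" are assumptions of this example.

```c
/* Stand-ins for the frontend globals named in the text.
   Their types are an assumption of this sketch. */
typedef long taddr;
const char *defsectname;
const char *defsecttype;
taddr defsectorg;

/* Hypothetical default: a code section called ".code". */
int syntax_defsect(void)
{
    defsectname = ".code";
    defsecttype = "acrx";
    defsectorg = 0;
    return 1;   /* non-zero: use the values set above */
}
```

Returning zero instead would leave the frontend defaults in place.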
char *skip(char *);
A function to skip whitespace etc.
void eol(char *);
This function should check that the argument points to the end of a line (only comments or whitespace following). If not, an error or warning message should be emitted.
char *const_prefix(char *,int *);
Check if the first argument points to the start of a constant. If yes, return a pointer to the real start of the number (i.e. skip a prefix that may indicate the base) and write the base of the number through the pointer passed as second argument. Return zero if it does not point to a number.
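The const_prefix() contract can be illustrated with the following sketch. The prefixes it accepts ('$' and "0x" for hexadecimal, '%' for binary, plain digits for decimal) are assumptions chosen for the example, not the rules of a particular syntax module.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical prefix rules: '$' or "0x" for hexadecimal, '%' for
   binary, a plain digit for decimal. */
char *const_prefix(char *s, int *base)
{
    if (s[0] == '$' && isxdigit((unsigned char)s[1])) {
        *base = 16;
        return s + 1;
    }
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')
        && isxdigit((unsigned char)s[2])) {
        *base = 16;
        return s + 2;
    }
    if (s[0] == '%' && (s[1] == '0' || s[1] == '1')) {
        *base = 2;
        return s + 1;
    }
    if (isdigit((unsigned char)s[0])) {
        *base = 10;
        return s;      /* no prefix to skip */
    }
    return NULL;       /* not a constant */
}
```

The returned pointer is where the digits begin, so the caller can convert them using the reported base.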
char *const_suffix(char *,char *);
First argument points to the start of the constant (including prefix) and the second argument to the first character after the constant (excluding suffix). Checks for a constant-suffix and skips it. Return pointer to the first character after that constant. Example: constants with a ’h’ suffix to indicate a hexadecimal base.
void parse(void);
This is the main parsing function. It has to read lines via
the read_next_line() function, parse them and create sections,
atoms and symbols. Pseudo directives are usually handled by the syntax
module. Instructions can be parsed by the cpu module using
parse_instruction().
char *parse_macro_arg(struct macro *,char *,struct namelen *,struct namelen *);
Called to parse a macro parameter by using the source stream pointer in
the second argument. The start pointer and length of a single passed
parameter is written to the first struct namelen, while the optionally
selected named macro argument is passed in the second struct namelen.
When the len field of the second namelen is zero, then the
argument is selected by position instead of by name. Returns the updated
source stream pointer after successful parsing.
int expand_macro(source *,char **,char *,int);
Expand parameters and special commands inside a macro source. The second
argument is a pointer to the current source stream pointer, which is
updated on any successful expansion. The function will return the
number of characters written to the destination buffer (third argument)
in this case. Returning -1 means: no expansion took place.
The last argument defines the space in characters which is left in the
destination buffer.
char *get_local_label(char **);
Gets a pointer to the current source pointer. Has to check if a valid local label is found at this point. If yes, return a pointer to the vasm-internal symbol name representing the local label and update the current source pointer to point behind the label.
Have a look at the support functions provided by the frontend to help.
A new cpu module must have its own subdirectory under vasm/cpus. At least the files cpu.h, cpu.c and cpu_errors.h must be written.
A cpu module has to provide the following elements (all other functions
should be static to prevent name clashes) in cpu.h:
#define MAX_OPERANDS 3
Maximum number of operands of one instruction.
#define MAX_QUALIFIERS 0
Maximum number of mnemonic-qualifiers per mnemonic.
#define NO_MACRO_QUALIFIERS
Define this, when qualifiers shouldn’t be allowed for macros. For some architectures, like ARM, macro qualifiers make no sense.
typedef int32_t taddr;
Data type to represent a target-address. Preferably use the types from
stdint.h. Does not necessarily have to match the cpu’s address
bus size (refer to bytespertaddr), but the largest data you will
be able to do calculations with. For example, you may want to allow 32-bit
data definitions for an 8-bit cpu.
typedef uint32_t utaddr;
Unsigned data type to represent a target-address.
#define LITTLEENDIAN 1
#define BIGENDIAN 0
Define these according to the target endianness. For CPUs which support big-
and little-endian, you may assign a global variable here. So be aware of
it, and never use #if BIGENDIAN, but always if(BIGENDIAN) in
your code.
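The reason for preferring if(BIGENDIAN) can be shown with a small sketch of a bi-endian cpu module. The variable big_endian_mode and the helper store16() are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* Hypothetical bi-endian cpu module: the endianness is a run-time
   setting, so both macros expand to expressions, not constants. */
int big_endian_mode = 1;
#define BIGENDIAN    (big_endian_mode)
#define LITTLEENDIAN (!big_endian_mode)

/* Hypothetical helper storing a 16-bit value in target byte order.
   A plain if() is used: #if BIGENDIAN would be evaluated by the
   preprocessor, which cannot see the variable's run-time value. */
void store16(uint8_t *p, uint16_t v)
{
    if (BIGENDIAN) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xff);
    } else {
        p[0] = (uint8_t)(v & 0xff);
        p[1] = (uint8_t)(v >> 8);
    }
}
```

Code written this way works unchanged for cpu modules that define the macros as constants.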
#define VASM_CPU_<cpu> 1
Defines a cpu-specific macro. May be used to perform special handling in syntax- or output-modules.
#define INST_ALIGN 2
Minimum instruction alignment.
#define DATA_ALIGN(n) ...
Default alignment for n-bit data. Can also be a function.
#define DATA_OPERAND(n) ...
Operand class for n-bit data definitions. Can also be a function. Negative values denote a floating point data definition of -n bits.
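For a simple target, the two data macros might be filled in as sketched below. The alignment policy, the assumption that alignment is counted in bytes, and the operand class names OP_DATA and OP_FLOAT are all hypothetical.

```c
/* Hypothetical 16-bit cpu: data of up to 8 bits needs no alignment,
   everything larger is aligned to two bytes (assuming the alignment
   is counted in bytes). */
#define DATA_ALIGN(n) ((n) <= 8 ? 1 : 2)

/* Hypothetical operand classes of this cpu module. */
#define OP_DATA  1
#define OP_FLOAT 2

/* A negative n denotes a floating point definition of -n bits. */
#define DATA_OPERAND(n) ((n) < 0 ? OP_FLOAT : OP_DATA)
```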
typedef ... operand;
Structure to store an operand for a machine instruction or a data constant.
typedef ... mnemonic_extension;
Mnemonic extension for the cpu’s instruction table.
Optional features, which can be enabled by defining the following macros:
#define FLOAT_PARSER 1
Enables the floating point parser and floating point evaluation in the
expression module. With this option the backend has to be prepared that
expressions may contain floating point constants, which can be checked
by testing the result of type_of_expr(expression) for FLT.
Then use eval_expr_float(expression,&float_val) to retrieve the
floating point value with type tfloat.
It is up to the backend to convert the host’s floating point format,
which should be IEEE, into the backend’s native format. The vasm frontend
only supports IEEE to IEEE conversion via conv2ieee32() and
conv2ieee64().
#define HAVE_INSTRUCTION_EXTENSION 1
If cpu-specific data should be added to all instruction atoms.
typedef ... instruction_ext;
Type for the above extension.
#define CLEAR_OPERANDS_ON_START 1
Backend requires zeroed operand structures when calling parse_operand()
for the first time. Might be useful to parse operands only once.
Defaults to undefined.
#define CLEAR_OPERANDS_ON_MNEMO 1
Backend requires zeroed operand structures when calling parse_operand()
for any new mnemonic. Useful to parse the same operand multiple times on
the current mnemonic, but reset everything for the next mnemonic.
Defaults to undefined.
START_PARENTH(x)
Valid opening parenthesis for instruction operands. Defaults to '('.
END_PARENTH(x)
Valid closing parenthesis for instruction operands. Defaults to ')'.
#define MNEMONIC_VALID(i)
An optional function with the arguments (int idx). Returns true
when the mnemonic with index idx is valid for the current state of
the backend (e.g. it is available for the selected cpu architecture).
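One way to implement MNEMONIC_VALID is a per-mnemonic availability mask, as in the sketch below. The flags, the table mnemo_avail and the variable selected_cpu are hypothetical; a real cpu module would more likely keep such flags in its mnemonic_extension.

```c
/* Hypothetical cpu feature flags. */
#define CPU_BASE 1
#define CPU_EXT  2

/* Hypothetical availability table, indexed like the mnemonic table. */
static const int mnemo_avail[] = {
    CPU_BASE | CPU_EXT,   /* index 0: available on every cpu */
    CPU_EXT               /* index 1: extended cpu only */
};

int selected_cpu = CPU_BASE;   /* would be set from a cpu option */

int mnemonic_valid(int idx)
{
    return (mnemo_avail[idx] & selected_cpu) != 0;
}

#define MNEMONIC_VALID(i) mnemonic_valid(i)
```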
#define MNEMOHTABSIZE 0x4000
You can optionally overwrite the default hash table size defined in vasm.h. May be necessary for larger mnemonic tables.
#define OPERAND_OPTIONAL(p,t)
When defined, this is a function with the arguments
(operand *op,int type), which returns true when the given operand
type (type) is optional. The function is only called for missing
operands and should also initialize op with default values (e.g. 0).
Implementing additional target-specific unary operations is done by defining the following optional macros:
#define EXT_UNARY_NAME(s)
Should return True when the string in s points to an operation name
we want to handle.
#define EXT_UNARY_TYPE(s)
Returns the operation type code for the string in s. Note that the
last valid standard operation is defined as LAST_EXP_TYPE, so the
target-specific types will start with LAST_EXP_TYPE+1.
#define EXT_UNARY_EVAL(t,v,r,c)
Defines a function with the arguments (int t, taddr v, taddr *r, int c)
to handle the operation type t returning an int to indicate
whether this type has been handled or not. Your operation will be applied to
the value v and the result is stored in *r. The flag c
is passed as 1 when the value is constant (no relocatable addresses involved).
#define EXT_FIND_BASE(b,e,s,p)
Defines a function with the arguments
(symbol **b, expr *e, section *s, taddr p)
to save a pointer to the base symbol of expression e into the
symbol pointer, pointed to by b. The type of this base is given
by an int return code. Furthermore, e->type has to be checked
to be one of the operations to handle.
The section pointer s and the current pc p are needed to call
the standard find_base() function.
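The first three macros could be combined as in the following sketch, which adds two unary operators for selecting the low and high byte of a value. The operator characters '<' and '>', the names LOBYTE and HIBYTE and the numeric value given to LAST_EXP_TYPE are assumptions; in a real cpu module LAST_EXP_TYPE comes from the frontend.

```c
#include <stdint.h>

typedef int32_t taddr;      /* as in the cpu.h description */
#define LAST_EXP_TYPE 100   /* stand-in value for this sketch only */

/* Hypothetical target-specific operations: <x and >x. */
#define LOBYTE (LAST_EXP_TYPE+1)
#define HIBYTE (LAST_EXP_TYPE+2)

#define EXT_UNARY_NAME(s) (*(s) == '<' || *(s) == '>')
#define EXT_UNARY_TYPE(s) (*(s) == '<' ? LOBYTE : HIBYTE)
#define EXT_UNARY_EVAL(t,v,r,c) ext_unary_eval(t,v,r,c)

/* Returns non-zero when the operation type t was handled. */
int ext_unary_eval(int t, taddr v, taddr *r, int c)
{
    (void)c;   /* this sketch treats constant and other values alike */
    switch (t) {
    case LOBYTE:
        *r = v & 0xff;
        return 1;
    case HIBYTE:
        *r = (v >> 8) & 0xff;
        return 1;
    default:
        return 0;   /* not one of our operations */
    }
}
```

A complete implementation would also define EXT_FIND_BASE, so that the operations can be applied to relocatable addresses.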
A cpu module has to provide the following elements (all other functions
and data should be static to prevent name clashes) in cpu.c:
int bitsperbyte;
The number of bits per byte of the target cpu. Usually 8.
int bytespertaddr;
The number of bytes per target address. Note, that this really
defines the size of a backend’s address pointer and might differ
from the actual size of taddr (see above).
mnemonic mnemonics[];
The mnemonic table keeps a list of mnemonic names and operand types the
assembler will match against using parse_operand(). It may also
include a target specific mnemonic_extension.
const char *cpu_copyright;
A string that will be emitted as part of the copyright message.
const char *cpuname;
A string describing the target cpu.
int init_cpu();
Will be called during startup, after argument parsing. Must return zero if initializations failed, non-zero otherwise.
int cpu_args(char *);
This function will be called with the command line arguments (unless they were already recognized by other modules). If an argument was recognized, return non-zero.
char *parse_cpu_special(char *);
This function will be called with a source line as argument and allows
the cpu module to handle cpu-specific directives etc. Functions like
eol() and skip() from the syntax-module should be used to
keep the syntax consistent.
operand *new_operand();
Allocate and initialize a new operand structure.
int parse_operand(char *text,int len,operand *op,int requires);
Parses the source at text with length len to fill the target
specific operand structure pointed to by op. Return with one
of the following codes:
PO_NOMATCH
The source did not match the operand type given in requires.
PO_CORRUPT
The source was definitely identified as garbage, making it useless to try matching it against any other operand types.
PO_MATCH
The parsed source matches the operand type in requires. As soon
as all the instruction’s operands have been matched, the instruction
is successfully recognized.
PO_SKIP
Works like PO_MATCH, but skips the next operand from the mnemonic
table. For example, because it was already handled together with the
current operand.
PO_COMB_OPT
Works like PO_MATCH, but requests parsing of the next argument from
the source text, if any, with a pointer to the same operand structure
as before. This makes it possible to merge multiple operands into a
single operand structure.
PO_COMB_REQ
Like PO_COMB_OPT, requests parsing of the next argument
with a pointer to the same operand structure. But this time the
additional argument is mandatory.
PO_NEXT
Source did not match the given operand type in requires. Request
parsing the same chunk of source text again, but using the following
operand type. Can be used to break a PO_COMB_OPT or PO_COMB_REQ
attempt and continue normally.
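The three basic return codes can be illustrated with a deliberately small parse_operand(). Everything target-specific here is hypothetical: the operand types OP_REG and OP_IMM, the operand structure and the numeric values of the PO_ codes, which in a real module are defined by the frontend. A real backend would also use the frontend's expression parser instead of reading digits by hand.

```c
#include <ctype.h>

/* Stand-in return codes; the real values are defined by vasm. */
enum { PO_CORRUPT = -1, PO_NOMATCH = 0, PO_MATCH = 1 };

/* Hypothetical operand types and operand structure. */
#define OP_REG 1
#define OP_IMM 2
typedef struct {
    int type;
    int value;
} operand;

int parse_operand(char *text, int len, operand *op, int requires)
{
    int i, v = 0;

    if (len <= 0)
        return PO_CORRUPT;            /* an empty operand is garbage */

    if (requires == OP_REG) {         /* registers r0..r7 */
        if (len == 2 && text[0] == 'r' && text[1] >= '0' && text[1] <= '7') {
            op->type = OP_REG;
            op->value = text[1] - '0';
            return PO_MATCH;
        }
        return PO_NOMATCH;            /* may still match another type */
    }

    if (requires == OP_IMM) {         /* '#' followed by decimal digits */
        if (text[0] != '#')
            return PO_NOMATCH;
        if (len < 2)
            return PO_CORRUPT;        /* lone '#' */
        for (i = 1; i < len; i++) {
            if (!isdigit((unsigned char)text[i]))
                return PO_CORRUPT;    /* '#' followed by garbage */
            v = v * 10 + (text[i] - '0');
        }
        op->type = OP_IMM;
        op->value = v;
        return PO_MATCH;
    }
    return PO_NOMATCH;
}
```

Note the difference between the two failure codes: "r3" checked against OP_IMM is merely no match, while "#4x" cannot be a valid operand of any type in this sketch.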
size_t instruction_size(instruction *ip, section *sec, taddr pc);
Returns the size of the instruction ip in bytes, which, in the
final pass, must be identical to the number of bytes written by
eval_instruction() (see below).
dblock *eval_instruction(instruction *ip, section *sec, taddr pc);
Converts the instruction ip into a DATA atom, including relocations,
if necessary.
dblock *eval_data(operand *op, taddr bitsize, section *sec, taddr pc);
Converts a data operand into a DATA atom, including relocations.
void init_instruction_ext(instruction_ext *);
(If HAVE_INSTRUCTION_EXTENSION is set.)
Initialize an instruction extension.
char *parse_instruction(char *,int *,char **,int *,int *);
(If MAX_QUALIFIERS is greater than 0.)
Parses instruction and saves extension locations.
int set_default_qualifiers(char **,int *);
(If MAX_QUALIFIERS is greater than 0.)
Saves pointers and lengths of default qualifiers for the selected CPU and
returns the number of default qualifiers. Example: for a M680x0 CPU this
would be a single qualifier, called "w". Used by execute_macro().
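For the single-qualifier case mentioned above, set_default_qualifiers() could be as short as the following sketch. The parameter names and the assumption that the caller provides arrays with room for MAX_QUALIFIERS entries are not taken from the text.

```c
#define MAX_QUALIFIERS 1

/* One default qualifier "w", as in the M680x0 example.
   q receives the qualifier pointers, q_len their lengths; both arrays
   are assumed to have room for MAX_QUALIFIERS entries. */
int set_default_qualifiers(char **q, int *q_len)
{
    q[0] = "w";
    q_len[0] = 1;
    return 1;   /* number of default qualifiers */
}
```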
cpu_opts_init(section *);
(If HAVE_CPU_OPTS is set.)
Gives the cpu module the chance to write out OPTS atoms with
initial settings before the first atom for a section is generated.
cpu_opts(void *);
(If HAVE_CPU_OPTS is set.)
Apply option modifications from an OPTS atom. For example:
change cpu type or optimization flags.
print_cpu_opts(FILE *,void *);
(If HAVE_CPU_OPTS is set.)
Called from print_atom() to print an OPTS atom’s contents.
Output modules can be chosen at runtime rather than compile time. Therefore, several output modules are linked into one vasm executable and their structure differs somewhat from syntax and cpu modules.
Usually, an output module for some object format fmt should be contained
in a file output/<fmt>.c (it may use/include other files if necessary).
To automatically include this format in the build process, the
OUTFMTS definition in make.rules has to be extended.
The module should be added to the OBJS variable
at the start of make.rules. Also, a dependency line should be added
(see the existing output modules).
An output module must only export a single function which will return pointers to necessary data/functions. This function should have the following prototype:
int init_output_<fmt>(
char **copyright,
void (**write_object)(FILE *,section *,symbol *),
int (**output_args)(char *)
);
In case of an error, zero must be returned. Otherwise, it should perform all necessary initializations, return non-zero and return the following output parameters via the pointers passed as arguments:
copyright
A pointer to the copyright string.
write_object
A pointer to a function emitting the output. It will be called after the assembler has completed and will receive pointers to the output file, to the first section of the section list and to the first symbol in the symbol list. See the section on general data structures for further details.
output_args
A pointer to a function checking arguments. It will be called with all command line arguments (unless already handled by other modules). If the output module recognizes an appropriate option, it has to handle it and return non-zero. If it is not an option relevant to this output module, zero must be returned.
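A skeleton for such an init function might look as follows. The format name myfmt, the copyright text and the two static helpers are hypothetical, and section and symbol are declared as opaque stand-ins so that the fragment compiles without the vasm headers.

```c
#include <stdio.h>
#include <stddef.h>

/* Opaque stand-ins for the frontend types; a real module gets them
   from the vasm headers. */
typedef struct section section;
typedef struct symbol symbol;

static char myfmt_copyright[] = "vasm myfmt output module (sketch)";

static void write_output(FILE *f, section *sec, symbol *sym)
{
    /* A real module would walk the section and symbol lists here. */
    (void)f; (void)sec; (void)sym;
}

static int output_args(char *arg)
{
    (void)arg;
    return 0;   /* this sketch recognizes no options */
}

int init_output_myfmt(char **cp,
                      void (**wo)(FILE *, section *, symbol *),
                      int (**oa)(char *))
{
    *cp = myfmt_copyright;
    *wo = write_output;
    *oa = output_args;
    return 1;   /* non-zero: initialization succeeded */
}
```

Only init_output_myfmt() is exported; the helpers stay static, which avoids name clashes between the output modules linked into one binary.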
Finally, a call to init_output_<fmt>() has to be added to the
init_output() function in vasm.c (this should be self-explanatory).
Besides assigning the above mentioned function pointers, this function
can be used to redefine the assembler’s behaviour.
For example, you may set the following global variables:
unnamed_sections = 1;
Set when the output module cannot handle section names. Usually such an output module only knows section types: text, data, bss.
secname_attr = 1;
Set when the section attributes are used to differentiate between two sections with the same name.
Some remarks:
CPU-specific code in an output module should be enclosed in #ifdef VASM_CPU_MYCPU ... #endif or similar.
Also, if the selected CPU is not supported, the init function should fail.
Error and warning messages should be emitted with the output_error function.
As all output modules are linked together, they have a common list of error
messages in the file output_errors.h. If a new message is needed, this
file has to be extended (see the section on general data structures for
details).
When the cause for an error relates to an atom you may also use
the output_atom_error function instead, which additionally prints
the atom’s source text line.
In output_errors.h use the NOLINE flag when no atom is
available.
vasm has a mechanism to specify rather complex relocations in a
standard way (see the section on general data structures). They can be
extended with CPU specific relocations, but usually CPU modules will
try to create standard relocations (sometimes several standard relocations
can be used to implement a CPU specific relocation). An output
module should try to find appropriate relocations supported by the
object format. The goal is to avoid special CPU specific
relocations as much as possible.
Volker Barthelmann vb@compilers.de