4. More Assembler Directives

In this section, we will describe some commonly used assembler directives, using two example programs.

  1. A program to sum an array
  2. A program to calculate the length of a string

4.1. Sum an Array

The following code sums an array of bytes and stores the result in r3.

Listing 2. Sum an Array

        .text
entry:  b start                 @ Skip over the data
arr:    .byte 10, 20, 25        @ Read-only array of bytes
eoa:                            @ Address of end of array + 1

        .align
start:
        ldr   r0, =eoa          @ r0 = &eoa
        ldr   r1, =arr          @ r1 = &arr
        mov   r3, #0            @ r3 = 0
loop:   ldrb  r2, [r1], #1      @ r2 = *r1++
        add   r3, r2, r3        @ r3 += r2
        cmp   r1, r0            @ if (r1 != r0)
        bne   loop              @    goto loop
stop:   b stop

The code introduces two new assembler directives, .byte and .align, which are described below.

4.1.1. .byte Directive

The byte-sized arguments of .byte are assembled into consecutive bytes in memory. The similar directives .2byte and .4byte store 16-bit and 32-bit values, respectively. The general syntax is given below.

.byte   exp1, exp2, ...
.2byte  exp1, exp2, ...
.4byte  exp1, exp2, ...

The arguments can be simple integer literals, written in binary (prefixed by 0b or 0B), octal (prefixed by 0), decimal, or hexadecimal (prefixed by 0x or 0X). An integer can also be written as a character constant (a character surrounded by single quotes), in which case the ASCII value of the character is used.
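
For illustration, each argument in the line below is a different spelling of the same value, 65. The label name same is made up for this example.

same:     .byte 0b01000001, 0101, 65, 0x41, 'A'   @ five bytes, each holding 65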

The arguments can also be C-style expressions built from literals and other symbols. Examples are shown below.

pattern:  .byte 0b01010101, 0b00110011, 0b00001111
npattern: .byte npattern - pattern
halpha:   .byte 'A', 'B', 'C', 'D', 'E', 'F'
dummy:    .4byte 0xDEADBEEF
nalpha:   .byte 'Z' - 'A' + 1

4.1.2. .align Directive

ARM requires that instructions be placed at 32-bit aligned memory locations: the address of the first of the 4 bytes of an instruction must be a multiple of 4. To ensure this, the .align directive inserts padding bytes until the next byte address is a multiple of 4. This is required only when data bytes or half words are inserted within code.
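
As a rough sketch of the padding, the fragment below assumes it is assembled starting at address 0. The label names flags and start are made up for this example, and the addresses in the comments follow from that assumption.

        .text
        b     start             @ address 0: one 4-byte instruction
flags:  .byte 1, 2, 3           @ addresses 4, 5, 6: three data bytes
        .align                  @ one padding byte at address 7
start:  mov   r0, #0            @ address 8: a multiple of 4

Without the .align directive, the mov instruction would be placed at address 7, which is not a multiple of 4.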

4.2. String Length

The following code calculates the length of a string and stores it in register r1.

Listing 3. String Length

        .text
        b start

str:    .asciz "Hello World"

        .equ   nul, 0

        .align
start:  ldr   r0, =str          @ r0 = &str
        mov   r1, #0

loop:   ldrb  r2, [r0], #1      @ r2 = *(r0++)
        add   r1, r1, #1        @ r1 += 1
        cmp   r2, #nul          @ if (r2 != nul)
        bne   loop              @    goto loop

        sub   r1, r1, #1        @ r1 -= 1
stop:   b stop

The code introduces two new assembler directives, .asciz and .equ, which are described below.

4.2.1. .asciz Directive

The .asciz directive accepts string literals as arguments. A string literal is a sequence of characters in double quotes. The string literals are assembled into consecutive memory locations. The assembler automatically inserts a nul character (\0) after each string.

The .ascii directive is the same as .asciz, except that the assembler does not insert a nul character after each string.
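
The difference can be sketched with a short string. The label names msg1, msg2 and msg3 are made up for this example.

msg1:   .asciz "Hi"             @ 3 bytes: 'H', 'i', 0
msg2:   .ascii "Hi"             @ 2 bytes: 'H', 'i'
msg3:   .ascii "Hi\0"           @ 3 bytes: same contents as msg1

A loop like the one in Listing 3, which stops at the nul character, would run past the end of msg2 because nothing terminates it.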

4.2.2. .equ Directive

The assembler maintains a symbol table, which maps label names to addresses. Whenever the assembler encounters a label definition, it makes an entry in the symbol table. Whenever the assembler encounters a label reference, it replaces the label with the corresponding address from the symbol table.

Using the assembler directive .equ, it is also possible to manually insert entries in the symbol table, to map names to values, which are not necessarily addresses. Whenever the assembler encounters these names, it replaces them by their corresponding values. These names and label names are together called symbol names.

The general syntax of the directive is given below.

.equ name, expression

The name is a symbol name and has the same restrictions as a label name. The expression can be a simple literal or an expression as explained for the .byte directive.

Note

Unlike the .byte directive, the .equ directive itself does not allocate any memory. It only creates an entry in the symbol table.
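
A minimal sketch of .equ with literal and expression values is shown below. The symbol names width, height and area are made up for this example.

        .equ  width, 4                  @ width = 4
        .equ  height, 3                 @ height = 3
        .equ  area, width * height      @ area = 12, computed by the assembler

        .text
        mov   r0, #width                @ same as mov r0, #4
        mov   r1, #area                 @ same as mov r1, #12

The three .equ lines contribute no bytes to the output. Only the two mov instructions are assembled into memory.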