C code cannot be executed directly when the processor comes out of reset because, unlike assembly language, C programs need some basic prerequisites to be satisfied. This section describes these prerequisites and how to meet them.
We will use a C program that calculates the sum of an array as an example. By the end of this section, we will be able to perform the necessary setup, transfer control to the C code, and execute it.
Listing 12. Sum of Array in C
static int arr[] = { 1, 10, 4, 5, 6, 7 };
static int sum;
static const int n = sizeof(arr) / sizeof(arr[0]);
int main()
{
    int i;
    for (i = 0; i < n; i++)
        sum += arr[i];
}
Before transferring control to C code, the following have to be set up correctly: the stack and the global variables (initialized, uninitialized, and read-only).
C uses the stack for storing local (auto) variables, passing function arguments, storing return addresses, etc. So it is essential that the stack be set up correctly before transferring control to C code.
Stacks are highly flexible in the ARM architecture, since the implementation is left completely to the software. For people not familiar with the ARM architecture, an overview is provided in Appendix C, ARM Stacks.
To make sure that code generated by different compilers is interoperable, ARM has created the Procedure Call Standard for the ARM Architecture (AAPCS). The register to be used as the stack pointer and the direction in which the stack grows are both dictated by the AAPCS. According to the AAPCS, register r13 is to be used as the stack pointer, and the stack should be full-descending.
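To make "full-descending" concrete, here is a minimal host-side sketch in C. It is only a toy model: the names `fd_stack`, `fd_init`, `fd_push`, and `fd_pop` are hypothetical, a small array stands in for RAM, and there is no overflow checking. "Full" means the stack pointer refers to the last item pushed, and "descending" means the stack grows towards lower addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a full-descending stack (hypothetical names, host-side only). */
#define STACK_WORDS 8

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stands in for a region of RAM */
    size_t sp;                 /* index into mem; STACK_WORDS is one past the end */
} fd_stack;

static void fd_init(fd_stack *s)
{
    s->sp = STACK_WORDS; /* empty stack: sp sits just past the highest word */
}

static void fd_push(fd_stack *s, uint32_t value)
{
    s->sp -= 1;            /* decrement first ... */
    s->mem[s->sp] = value; /* ... then store */
}

static uint32_t fd_pop(fd_stack *s)
{
    uint32_t value = s->mem[s->sp]; /* load first ... */
    s->sp += 1;                     /* ... then increment */
    return value;
}
```

Because a push decrements the stack pointer before storing, the initial value of the stack pointer is never written to. In this model, an empty stack can therefore start one position past the end of the array.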
One way of placing global variables and the stack is shown in the following diagram.
So all that has to be done in the startup code is to point r13 at the top of RAM, so that the stack can grow downwards (towards lower addresses). For the connex board this can be achieved using the following ARM instruction.
ldr sp, =0xA4000000
Note that the assembler provides the alias sp for the r13 register.
Note | |
---|---|
The address |
When C code is compiled, the compiler places initialized global variables in the .data section. So, just as with the assembly code, the .data section has to be copied from Flash to RAM.
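The copy step can be sketched in C as a plain byte-wise loop. This is an illustrative host-side version, not the startup code itself: the function name `copy_data` and its parameters are assumptions, and on a real target the source address, destination address, and size would come from symbols defined in the linker script.

```c
#include <stddef.h>

/* Byte-wise copy of an initialized-data image from its load address
 * (Flash) to its run address (RAM). The name and parameters are
 * hypothetical; they are plain arguments so the sketch runs on a host. */
static void copy_data(const unsigned char *src, unsigned char *dst, size_t size)
{
    while (size != 0) { /* a size of zero copies nothing */
        *dst++ = *src++;
        size--;
    }
}
```

Calling `copy_data(flash_image, ram_image, image_size)` with two ordinary arrays is enough to try it out. The explicit check for a zero size matters more in assembly, where a count-down loop that tests after decrementing would otherwise wrap around.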
The C language guarantees that all uninitialized global variables are initialized to zero. When C programs are compiled, a separate section called .bss is used for uninitialized variables. Since the values of these variables are all zero to start with, they do not have to be stored in Flash. Before transferring control to C code, the memory locations corresponding to these variables have to be initialized to zero.
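Zeroing the .bss region amounts to a fill loop over a start address and a size. The following is a minimal host-side sketch in C. The name `zero_region` is hypothetical, and an ordinary array is used in place of the real .bss region.

```c
#include <stddef.h>

/* Fill `size` bytes starting at `start` with zero. The name is
 * hypothetical; on a target, the bounds would come from the linker script. */
static void zero_region(unsigned char *start, size_t size)
{
    while (size != 0) { /* an empty region is left untouched */
        *start++ = 0;
        size--;
    }
}
```

Only the bytes inside the given range are touched, so anything placed before or after the region keeps its value.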
GCC places global variables marked as const in a separate section called .rodata. The .rodata section is also used for storing string constants. Since the contents of the .rodata section will not be modified, they can be placed in Flash. The linker script has to be modified to accommodate this.
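As a quick illustration of the three kinds of global variables, the sketch below annotates each one with the section GCC typically chooses for it. The variable and function names are made up for this example, and the exact placement can vary with compiler version and options, so treat the comments as the usual case rather than a guarantee.

```c
/* Typical section placement (hypothetical names; the usual case for GCC). */
static int counter_init = 5;        /* initialized          -> .data   */
static int counter_zero;            /* uninitialized        -> .bss    */
static const int limit = 3;         /* const                -> .rodata */
static const char *greeting = "hi"; /* the pointer lives in .data,
                                       the string literal is read-only data */

/* Increment counter_zero until it reaches limit, then report a total. */
static int bump(void)
{
    if (counter_zero < limit)
        counter_zero++;
    return counter_init + counter_zero;
}
```

The code relies on `counter_zero` starting at zero, which is exactly the guarantee the startup code must provide on a bare-metal target. In an nm listing, such symbols usually show up with the type letters d, b, and r respectively.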
Now that we know the prerequisites, we can create the linker script and the startup code. The linker script in Listing 10, “Linker Script with Section Copy Symbols” is modified to accommodate the following.
.bss section placement
vectors section placement
.rodata section placement
The .bss section is placed right after the .data section in RAM. Symbols to locate the start and end of .bss are also created in the linker script. The .rodata section is placed right after the .text section in Flash. The following diagram shows the placement of the various sections.
Listing 13. Linker Script for C code
SECTIONS {
    . = 0x00000000;
    .text : {
        * (vectors);
        * (.text);
    }
    .rodata : {
        * (.rodata);
    }
    flash_sdata = .;

    . = 0xA0000000;
    ram_sdata = .;
    .data : AT (flash_sdata) {
        * (.data);
    }
    ram_edata = .;
    data_size = ram_edata - ram_sdata;

    sbss = .;
    .bss : {
        * (.bss);
    }
    ebss = .;
    bss_size = ebss - sbss;
}
The startup code has the following parts:
exception vectors
code to copy .data from Flash to RAM
code to zero out .bss
code to set up the stack pointer
branch to main
Listing 14. C Startup Assembly
        .section "vectors"
reset:  b     start
undef:  b     undef
swi:    b     swi
pabt:   b     pabt
dabt:   b     dabt
        nop
irq:    b     irq
fiq:    b     fiq

        .text
start:
        @@ Copy data to RAM.
        ldr   r0, =flash_sdata
        ldr   r1, =ram_sdata
        ldr   r2, =data_size

        @@ Handle data_size == 0
        cmp   r2, #0
        beq   init_bss
copy:
        ldrb  r4, [r0], #1
        strb  r4, [r1], #1
        subs  r2, r2, #1
        bne   copy

init_bss:
        @@ Initialize .bss
        ldr   r0, =sbss
        ldr   r1, =ebss
        ldr   r2, =bss_size

        @@ Handle bss_size == 0
        cmp   r2, #0
        beq   init_stack

        mov   r4, #0
zero:
        strb  r4, [r0], #1
        subs  r2, r2, #1
        bne   zero

init_stack:
        @@ Initialize the stack pointer
        ldr   sp, =0xA4000000

        bl    main

stop:   b     stop
To compile the code, it is not necessary to invoke the assembler, compiler, and linker individually. gcc is intelligent enough to do that for us.
As promised before, we will compile and execute the C code shown in Listing 12, “Sum of Array in C”.
$ arm-none-eabi-gcc -nostdlib -o csum.elf -T csum.lds csum.c startup.s
The -nostdlib option is used to specify that the standard C library should not be linked in. A little extra care has to be taken when the C library is linked in. This is discussed in Section 11, “Using the C Library”.
A dump of the symbol table will give a better picture of how things have been placed in memory.
$ arm-none-eabi-nm -n csum.elf
00000000 t reset ❶
00000004 A bss_size
00000004 t undef
00000008 t swi
0000000c t pabt
00000010 t dabt
00000018 A data_size
00000018 t irq
0000001c t fiq
00000020 T main
00000090 t start ❷
000000a0 t copy
000000b0 t init_bss
000000c4 t zero
000000d0 t init_stack
000000d8 t stop
000000f4 r n ❸
000000f8 A flash_sdata
a0000000 d arr ❹
a0000000 A ram_sdata
a0000018 A ram_edata
a0000018 A sbss
a0000018 b sum ❺
a000001c A ebss
❶ reset and the rest of the exception vectors are placed starting from 0x0.
❷ The code is placed right after the 8 exception vectors (8 * 4 = 32 = 0x20). In this build, main comes first at 0x20, and the startup code at start follows at 0x90.
❸ The read-only data n is placed in Flash after the code.
❹ The initialized data arr, an array of 6 integers, is placed at the start of RAM, 0xA0000000.
❺ The uninitialized data sum is placed after the array of 6 integers (6 * 4 = 24 = 0x18).
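The RAM addresses in the symbol table follow directly from the symbol assignments in the linker script. The sketch below reproduces that arithmetic on a host. The `layout` struct and `compute_layout` function are hypothetical helpers written for this illustration, and the sizes passed in assume a 4-byte int, as on ARM.

```c
#include <stdint.h>

#define RAM_BASE 0xA0000000u /* start of RAM, as set in the linker script */

/* Hypothetical helper type mirroring the symbols in the linker script. */
typedef struct {
    uint32_t ram_sdata, ram_edata, data_size;
    uint32_t sbss, ebss, bss_size;
} layout;

/* Recompute the RAM layout from the sizes of .data and .bss in bytes. */
static layout compute_layout(uint32_t data_bytes, uint32_t bss_bytes)
{
    layout l;
    l.ram_sdata = RAM_BASE;
    l.ram_edata = l.ram_sdata + data_bytes;  /* .data starts at RAM base */
    l.data_size = l.ram_edata - l.ram_sdata;
    l.sbss      = l.ram_edata;               /* .bss follows .data       */
    l.ebss      = l.sbss + bss_bytes;
    l.bss_size  = l.ebss - l.sbss;
    return l;
}
```

With `compute_layout(6 * 4, 4)`, that is, six 4-byte integers in .data and one in .bss, the computed values for ram_edata, sbss, and ebss can be compared against the nm output above.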
To execute the program, convert it to .bin format, execute it in QEMU, and dump the sum variable located at 0xA0000018.
$ arm-none-eabi-objcopy -O binary csum.elf csum.bin
$ dd if=csum.bin of=flash.bin bs=4096 conv=notrunc
$ qemu-system-arm -M connex -pflash flash.bin -nographic -serial /dev/null
(qemu) xp /6dw 0xa0000000
a0000000: 1 10 4 5
a0000010: 6 7
(qemu) xp /1dw 0xa0000018
a0000018: 33
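As a sanity check, the same summation can be run on a host machine to confirm the value seen in the memory dump. This is a small sketch that reuses the array from the example; the function name `sum_array` is made up for this illustration.

```c
/* Same data as the array in the example program. */
static const int arr[] = { 1, 10, 4, 5, 6, 7 };

/* Sum the first n elements of a (hypothetical helper for a host-side check). */
static int sum_array(const int *a, int n)
{
    int i;
    int total = 0;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Calling `sum_array(arr, 6)` adds 1 + 10 + 4 + 5 + 6 + 7, which is 33, matching the value read back from 0xA0000018.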