execute function from RAM on XMC1100

User16189
Level 1
Hi,

My goal is to evaluate how fast I can toggle an I/O on the XMC1100.
For this, I bought the cute XMC 2Go kit and installed DAVE4.
Starting point was the XMC_2Go_Initial_Start_v1.3 example.
I can change the frequency of the blinking led. Toolchain works. Fine.


Then I added the line
P0_5_toggle();
In a while(1) loop.

Toggling works and time from rise to fall is about 3us.
Really slow.

According to the comments in the example, the CPU clock is running at 8 MHz.
I am not sure how the clock setup works, but I need 32 MHz.
Changed the configuration to :
SCU_CLK->CLKCR = 0x0FFC0100UL;

This resulted in toggle time of about 1.2 us.
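
To see why that value gives 32 MHz, here is a small host-side sketch that decodes the divider fields. It rests on assumptions that are not stated in this thread: that IDIV sits in bits 15:8 and FDIV in bits 7:0 of CLKCR, and that MCLK = 64 MHz / (2 * (IDIV + FDIV/256)). Please check both against the XMC1100 reference manual. The function name mclk_hz_from_clkcr is made up for illustration.

```c
#include <stdint.h>

/* Assumed CLKCR layout: FDIV in bits 7:0, IDIV in bits 15:8. */
#define CLKCR_FDIV(v) ((uint32_t)(v) & 0xFFu)
#define CLKCR_IDIV(v) (((uint32_t)(v) >> 8) & 0xFFu)

/* Assumed formula: MCLK = 64 MHz / (2 * (IDIV + FDIV/256)) for IDIV >= 1.
   Returns MCLK in Hz, or 0 when IDIV is 0 (that case is not modelled here). */
uint32_t mclk_hz_from_clkcr(uint32_t clkcr)
{
    const uint64_t dco_hz = 64000000u;   /* assumed 64 MHz oscillator */
    uint32_t idiv = CLKCR_IDIV(clkcr);
    uint32_t fdiv = CLKCR_FDIV(clkcr);

    if (idiv == 0u) {
        return 0u;
    }
    /* Scale numerator and denominator by 256 to stay in integer math. */
    return (uint32_t)((dco_hz * 256u) / (2u * (256u * idiv + fdiv)));
}
```

Under these assumptions, 0x0FFC0100UL decodes to IDIV = 1 and FDIV = 0, which gives 32 MHz, and IDIV = 4 would give the 8 MHz mentioned in the example comments.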

After this, I replaced the function call to P0_5_toggle() with its contents:
PORT0->OMR = 0x00200020UL;

This resulted in an improvement to about 530ns.
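
For readers wondering where 0x00200020 comes from: OMR has a "set" bit for each pin in the lower half-word and a "reset" bit in the upper half-word, and writing both bits for the same pin toggles it. The sketch below is a host-side model only, so it runs without the device header. The function names are hypothetical; the mask expression 0x10001U << pin is the one XMC_GPIO_ToggleOutput() uses in the listing further down the thread.

```c
#include <stdint.h>

/* Toggle mask for one pin (0..15): set bit (pin) and reset bit (pin + 16). */
uint32_t omr_toggle_mask(unsigned pin)
{
    return 0x10001u << pin;
}

/* Host-side model of the effect of an OMR write on the 16-bit output latch:
   set only -> 1, reset only -> 0, both -> toggle, neither -> unchanged. */
uint16_t omr_apply(uint16_t out, uint32_t omr)
{
    uint16_t ps = (uint16_t)(omr & 0xFFFFu);
    uint16_t pr = (uint16_t)(omr >> 16);
    uint16_t toggle = (uint16_t)(ps & pr);
    uint16_t set = (uint16_t)(ps & (uint16_t)~pr);
    uint16_t clr = (uint16_t)(pr & (uint16_t)~ps);

    return (uint16_t)(((out | set) & (uint16_t)~clr) ^ toggle);
}
```

Calling omr_toggle_mask(5) yields 0x00200020, and applying that value twice to the modelled latch returns it to its starting state.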


Oops! The tooling neglects the inline directive of the P0_5_toggle() function. No idea why.

Next step is to execute the toggle code from RAM.

Therefore, I moved the toggling code to a separate function in a separate file (header + c file).
In the function declaration in the header file, I added the famous __attribute__((section(".ram_code")))
However, the tooling also neglects this directive and the code is still executed from flash.
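
One thing worth trying, as a sketch and not a confirmed fix: put the attribute on the function definition in the .c file and also mark the function noinline, so it stays a real function in that section. RAM_FUNC and toggle_pin5 are made-up names, and the omr parameter stands in for &PORT0->OMR so that the snippet compiles without the device header. The macro is empty on a non-ARM host build.

```c
#include <stdint.h>

/* On the ARM target: place the function in .ram_code and forbid inlining.
   On any other build the macro expands to nothing. */
#if defined(__GNUC__) && defined(__arm__)
#define RAM_FUNC __attribute__((section(".ram_code"), noinline))
#else
#define RAM_FUNC
#endif

/* The attribute sits on the definition. 'omr' is a stand-in for &PORT0->OMR. */
RAM_FUNC void toggle_pin5(volatile uint32_t *omr)
{
    *omr = 0x00200020UL;   /* set and reset bits for pin 5 -> toggle */
}
```

On the target the call would be toggle_pin5(&PORT0->OMR). Whether the function really landed in RAM can be checked in the map file or disassembly: its address should lie in the SRAM range (0x2000xxxx) and not in flash.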

Does anybody know a solution?
It seems that it is a tooling issue.
I tried to understand the linker script, but I did not see strange things.

Thanks,

Lodewijk
--

An investigation of the 530 ns:
The P0_5_toggle() generates 3 assembly instructions (2 loads and 1 store):
LDR: 2 cycles
LDR: 2 cycles
STR: 2 cycles
+ a B(ranch) for the while loop, good for 3 cycles

So, this is 9 cycles. If we assume 2 wait cycles per instruction fetch from flash, the 4 instructions add 8 cycles.
In total 17 cycles. 17 cycles * 31 ns = 527 ns, which matches the measured 530 ns.
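
The same estimate as a small host-side calculation, in case someone wants to try other wait-state or clock assumptions. The function names are made up, and the 2 wait states per fetch are the assumption from the post, not a datasheet figure.

```c
#include <stdint.h>

/* Total cycles: execution cycles plus assumed wait states per instruction fetch. */
uint32_t loop_cycles(uint32_t exec_cycles, uint32_t instructions, uint32_t wait_states)
{
    return exec_cycles + instructions * wait_states;
}

/* Duration in nanoseconds of a number of cycles at a given core clock in Hz. */
double cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    return (double)cycles * 1e9 / (double)clock_hz;
}
```

With loop_cycles(9, 4, 2) the total is 17 cycles. At 32 MHz one cycle is 31.25 ns, so the unrounded estimate is 531.25 ns, slightly closer to the measurement than the 527 ns obtained with the rounded 31 ns.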
2 Replies
jferreira
Employee
Hi,

The inline will be ignored if the compiler optimization level is left at its default, -O0. You can either use __STATIC_FORCEINLINE or compile with at least -O1.
See below code snippet. As you can see I have placed the main also in RAM since we are inlining the P0_5_toggle() function.

#include <xmc_gpio.h>

/* Request inlining and placement in the .ram_code section. */
__STATIC_INLINE __attribute__ ((section (".ram_code"))) void P0_5_toggle(void);

void P0_5_toggle(void)
{
  XMC_GPIO_ToggleOutput(P0_5);
}

/* main() is also placed in RAM, because P0_5_toggle() is inlined into it. */
__attribute__ ((section (".ram_code"))) int main(void)
{
  XMC_GPIO_SetMode(P0_5, XMC_GPIO_MODE_OUTPUT_PUSH_PULL);

  /* Placeholder for user application code. The while loop below can be replaced with user application code. */
  while(1U)
  {
    P0_5_toggle();
  }
}


The assembler generated using -O1 is (you can also experiment with other compiler optimizations)
20000520 <main>:
{
XMC_GPIO_ToggleOutput(P0_5);
}

__attribute__ ((section (".ram_code"))) int main(void)
{
20000520: b508 push {r3, lr}

XMC_GPIO_SetMode(P0_5, XMC_GPIO_MODE_OUTPUT_PUSH_PULL);
20000522: 4804 ldr r0, [pc, #16] ; (20000534 <__data_end+0x14>)
20000524: 2105 movs r1, #5
20000526: 2280 movs r2, #128 ; 0x80
20000528: f000 f80a bl 20000540 <__XMC_GPIO_SetMode_veneer>

__STATIC_INLINE void XMC_GPIO_ToggleOutput(XMC_GPIO_PORT_t *const port, const uint8_t pin)
{
XMC_ASSERT("XMC_GPIO_ToggleOutput: Invalid port", XMC_GPIO_CHECK_OUTPUT_PORT(port));

port->OMR = 0x10001U << pin;
2000052c: 4a01 ldr r2, [pc, #4] ; (20000534 <__data_end+0x14>)
2000052e: 4b02 ldr r3, [pc, #8] ; (20000538 <__data_end+0x18>)
20000530: 6053 str r3, [r2, #4]
20000532: e7fd b.n 20000530

20000534: 40040000 .word 0x40040000
20000538: 00200020 .word 0x00200020
2000053c: 00000000 .word 0x00000000
User16189
Level 1
Thanks for the tip that compiler optimizations should be enabled to use compiler directives.

Consecutive toggling of the same pin requires the execution of one STR operation only, and is now possible with a pulse width of about 62 ns (2 clock cycles).