Apr 30, 2021
01:59 AM
8 Replies
Apr 30, 2021
05:23 AM
I asked the same question to Tasking a while ago and got this response:
- Details about the user stack usage: The call tree, which can be included in the map file, includes details about the user stack usage of each function and its callees. For more details, see chapter:
15.2. Linker Map File Format
of the TriCore tools v6.3r1 user guide.
Since v6.3r1, a new feature allows you to specify root functions for call stack calculations. For more details, see chapter
17.4.3. Defining Address Spaces
section -> Stacks and heaps
The application note:
STACKS AND STACK SIZE ESTIMATION IN THE TASKING VX-TOOLSET FOR TRICORE
https://resources.tasking.com/tasking-whitepapers/stacks-and-stack-size-estimation-in-the-tasking-vx...
also includes details about the stack usage and calculation.
May 01, 2021
07:53 AM
Wrangler, I do not have the option of making changes to the LSL file, and moreover I use the Lauterbach TRACE32 debugger and TriCore Toolset 6.2r2, so I feel the process mentioned in the above link is not possible. Can we not use the A10 general-purpose register? Or any other methodology?
Aim is to find the stack usage of a particular API in a full code stack.
May 01, 2021
09:12 AM
If you can't modify the LSL, and you're stuck with Tasking 6.2r2, then you're going to have to do it the old-fashioned way: set a breakpoint at the top level of the call tree, record A10, set a breakpoint in your deepest API, and record A10 again. The difference is the stack depth.
May 02, 2021
11:51 AM
So, for example, in the following call tree:
+-- E2E_P01Check [ustack_tc0:8,16]
| | | |
| | | +-- E2E_P01.src:E2E_P01CalculateCRC *
| | | |
| | | +-- E2E_P01.src:E2E_P01CheckStatus [ustack_tc0:0,0]
1. If I have to calculate for P01Check, as per your explanation, should the initial breakpoint be placed at P01Check and the final breakpoint at P01CheckStatus (considering that was the last API call inside P01Check), or should the final breakpoint be placed after coming out of P01Check?
2. Also, does changing the board from a TC297 to a TC375 make a difference in stack usage? [If the code and compiler flags are the same; register settings might be different.]
3. I wanted to automate the process of calculating stack usage. Is there any way to do it? I have done it for measuring timing but couldn't find any solution for stack usage.
May 02, 2021
06:08 PM
#1: Step 3 instructions into the function so that it reserves its local stack space.
#2: The instructions between TC2xx and TC3xx will not be significantly different.
#3: It depends on how adept you are with debugger scripts. Lauterbach, PLS, and iSYSTEM are quite flexible. My general recommendation is to fill the task stack with a known pattern, and then it's easy to spot the high water mark after letting your application run for a few seconds.
May 03, 2021
02:48 AM
UC_wrangler wrote:
#1: Step 3 instructions into the function so that it reserves its local stack space.
#2: The instructions between TC2xx and TC3xx will not be significantly different.
#3: It depends on how adept you are with debugger scripts. Lauterbach, PLS, and iSYSTEM are quite flexible. My general recommendation is to fill the task stack with a known pattern, and then it's easy to spot the high water mark after letting your application run for a few seconds.
Wrangler, I am still not clear with points #1 and #3.
May 03, 2021
10:04 AM
#1: If I have a function like this:
void something( void )
{
    int x[4096];
    int i;
    for( i=0; i<4096; i++ )
        x[i] = 0;
    something2(x);
}
Then the first line of code in the function allocates 16384 bytes on the stack with this instruction, decrementing A10 by 16384:
lea a10,[a10]-16384
If you simply set a breakpoint at the start of the function, you won't see that change in A10. So, step a couple instructions to be sure.
Note that the 16384 bytes (4096 * 4 bytes) only cover the variable x; i is not included, because the compiler has optimized it into a register instead of memory.
#3: If you fill the stack area (e.g., ustack_tc0) with a known value, let the application run for a few seconds, and then view the stack area and see how much of the original pattern is intact, that may give a good indication of maximum stack depth. It may not be accurate if your application hasn't executed all of its paths.
May 13, 2021
06:01 PM
Hi SRS_Sabat,
There is an alternate solution for measuring max stack usage; the method is:
1. At the startup phase, fill your entire stack with a special pattern such as 0xA5A5A5A5; this may take longer than your nominal startup time.
2. After a long run (having executed at least the fullest and most complex path of your SW), find the first data word that does not match the special pattern.
3. Calculate your max stack usage: address.first_non_special_pattern - address.stack_start
I'm not sure if this can help you, but it can be used as a rough method for your purpose.