Contention between CPU and SRI accesses to PSPR / DSPR

User13594
Level 1
Hi,

After searching the datasheet, the application notes, and the web more generally, I haven't been able to find any information about how the AURIX handles concurrent accesses to the scratchpads (PSPR or DSPR) when one access comes from the SRI and the other comes from the core (during an instruction fetch, for instance).

Is there any contention, i.e., timing interference between the two accesses?

In practice, my problem is to estimate the timing impact of DMA accesses to the scratchpads (PSPR or DSPR) while the core is accessing them.

Thanks for any help, pointers, etc.

Best regards,

Eric
4 Replies
NeMa_4793301
Level 6
Hi Eric. A write request from the SRI can indeed stall a CPU that is accessing its own memory.

Estimating the impact is exquisitely difficult, because you have to factor in the exact sequence of CPU instructions, the CPU pipelines, SRI arbitration, etc., etc.

As a crude example: if CPU1 is doing atomic test-and-set instructions on CPU0 DSPR (e.g., SWAPMSK), CPU0 can be slowed down by up to 90%. But that's a very artificial use case where both CPUs are doing nothing else. On the TC3xx, simply moving the semaphore from DSPR0 into the slightly decoupled dLMU0 reduces the impact to CPU0 from 90% to 10%.
User13594
Level 1
Dear "UC_wrangler",

Thanks a lot for your answer.

We were expecting some difficulty, but our context is much more restrictive than the general case: we only deal with the following concurrent accesses:
- fetch from PSPR (by the core)
vs.
- write to PSPR (by the DMA controller)

This excludes, e.g., test-and-set from other cores.

Our first experiment shows that, in this specific context, the interference is "quite small", but this empirical estimate is not a proof, and it may not be conservative (are we missing a pathological case?).

Taking the current state of the pipeline into account to estimate the contention is something we could do, but what is definitely missing is an idea of the policy used to arbitrate concurrent accesses to the PSPR. Even a conservative abstraction of this mechanism would be useful.

Arbitration between multiple accesses on the SRI is very well documented, but concurrent accesses to the PSPR/DSPR remain a real mystery.

Anyway,

Thanks again for your help.

Regards,
Eric
NeMa_4793301
Level 6
From previous experiments, I suspect that access to local memories can alternate between CPU access and remote (SRI) access every other cycle (presuming the CPU and SRI run at the same clock speed).
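
If that suspicion holds, it suggests one possible conservative abstraction of the kind asked about above. The sketch below is only a model under stated assumptions, not documented AURIX arbitration behaviour: it assumes that the local memory grants the CPU and the SRI on alternating cycles when both are requesting, and that each SRI data beat occupies the memory for a single cycle. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model, NOT documented AURIX behaviour.
 * Assumptions: when both sides request, the local memory alternates
 * between CPU and SRI every cycle, and one SRI beat occupies the
 * memory for exactly one cycle. */

/* Upper bound on CPU stall cycles in a code window.
 * Each SRI beat can steal at most one cycle, and each CPU access
 * can be delayed by at most one cycle, so the bound is the minimum. */
uint32_t worst_case_stall_cycles(uint32_t cpu_accesses, uint32_t sri_beats)
{
    return (cpu_accesses < sri_beats) ? cpu_accesses : sri_beats;
}

/* Upper bound on the execution time of the window, in cycles,
 * given its interference-free duration base_cycles. */
uint32_t worst_case_cycles(uint32_t base_cycles,
                           uint32_t cpu_accesses,
                           uint32_t sri_beats)
{
    return base_cycles + worst_case_stall_cycles(cpu_accesses, sri_beats);
}
```

For example, a window of 200 cycles containing 100 fetch accesses, overlapped with 16 DMA write beats, would be bounded by 216 cycles under this model. Measurements on the target are still needed to check whether the alternating-access assumption actually holds.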

Other things to keep in mind for your use case:
- Your DMA transfer width should be a multiple of 64 bits (e.g., CHCFGR.CHDW=3/4/5) to take advantage of the 64-bit SRI bus width, and perhaps of BTR2 / BTR4 transfers
- The PMI and Instruction Fetch Unit are 64 bits wide, so the exact sequence of instructions, CPU pipelines, and prefetch also need to be considered
- My guess is that BTR2 (128-bit, or CHDW=4) might be the optimal DMA solution for this case, but if your PSPR code is accessing anything over the SRI bus, that will fight with the DMA for SRI arbitration.
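
To compare DMA width settings on paper, the arithmetic behind the points above can be sketched as follows. The encoding used here (width in bits = 8 << CHDW) is an assumption inferred from the values quoted in this post (CHDW=3/4/5 being multiples of 64 bits, CHDW=4 being 128-bit); it should be checked against the DMA chapter of the user manual. The helper names are hypothetical, and the beat count only models the 64-bit data width of the SRI, not arbitration or address phases.

```c
#include <stdint.h>

/* ASSUMED encoding, inferred from the post: CHDW=3 -> 64 bit,
 * CHDW=4 -> 128 bit, CHDW=5 -> 256 bit, i.e. width = 8 << CHDW. */
uint32_t chdw_width_bits(uint32_t chdw)
{
    return 8u << chdw;
}

/* Number of DMA moves needed to copy block_bytes, rounded up. */
uint32_t dma_moves(uint32_t block_bytes, uint32_t chdw)
{
    uint32_t move_bytes = chdw_width_bits(chdw) / 8u;
    return (block_bytes + move_bytes - 1u) / move_bytes;
}

/* Number of 64-bit SRI data beats per move. A move narrower than
 * 64 bits still occupies one beat. */
uint32_t sri_beats_per_move(uint32_t chdw)
{
    uint32_t bits = chdw_width_bits(chdw);
    return (bits <= 64u) ? 1u : bits / 64u;
}
```

With these assumptions, copying 1024 bytes takes 256 single-beat moves at CHDW=2 (32-bit), but only 64 two-beat moves at CHDW=4 (128-bit), which is why the wider settings are expected to put less pressure on the SRI and on the PSPR.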
User13594
Level 1
Dear "UC_Wrangler",

Thanks a lot. Your answers were helpful.

We'll continue investigating this "issue" and will report our "conclusions" here.
Best regards,
Eric