Sep 06, 2019
01:33 AM
I tested the performance difference between int and float using a dot product ("dotp": res += a*b).
I found that the int version runs at about half the speed of the float version.
Why is that?
- Tags:
- IFX
6 Replies
Sep 06, 2019
03:11 AM
Hi Shaquille,
Can you list the sample code for the integer and float sample case? Just so that we get a better understanding of what you're doing.
Best regards,
Henk-Piet Glas
Principal Technical Specialist
Embedded Software
Sep 06, 2019
04:17 AM
float test:
IfxCpu_resetAndStartCounters(IfxCpu_CounterMode_normal);
for (i = 0; i < iTryCnt; i++)
    pDotpResF = Test_float_dotp(pAF, pBF, iDataCnt);
perfCounts = IfxCpu_stopCounters();
int test:
IfxCpu_resetAndStartCounters(IfxCpu_CounterMode_normal);
for (i = 0; i < iTryCnt; i++)
    pDotpResI = Test_int_divp(pAI, pBI, iDataCnt);
perfCounts = IfxCpu_stopCounters();
implementation of Test_float_dotp:
float Test_float_dotp(const float* pVecA, const float* pVecB, int iDataCnt)
{
    int i = 0;
    float fResult = 0.0f;
    /* accumulate the element-wise products */
    for (i = 0; i < iDataCnt; i++)
        fResult += pVecA[i] * pVecB[i];
    return fResult;
}
implementation of Test_int_dotp:
int Test_int_dotp(const int* pVecA, const int* pVecB, int iDataCnt)
{
    int i = 0;
    int iResult = 0;
    /* accumulate the element-wise products */
    for (i = 0; i < iDataCnt; i++)
        iResult += pVecA[i] * pVecB[i];
    return iResult;
}
The result is:
float dotp cost (64*64): 17689 ticks
int dotp cost (64*64): 34008 ticks
Is there something wrong with the code above?
Sep 08, 2019
02:06 AM
Hi Shaquille,
Code looks OK to me. I used the attached sample case, using parts of your code. To build it I used:
tricore-gcc -O2 -mcpu=tc39xx main.c dotint.c dotflt.c -o benchmark.elf
I then simulated it as below:
tsim16p_e -MConfig MConfig -disable-watchdog -H -g -s -x 0 -o benchmark.elf
Which then generates the following output:
Integers: 0x000000E9 cycles
Floats: 0x000000E8 cycles
So floats are still faster, as in your case, but only by 1 cycle. What command line options have you been using?
Best regards,
Henk-Piet Glas
Principal Technical Specialist
Embedded Software
Sep 13, 2019
06:26 PM
Sorry, I made a mistake again: the int test called divp instead of dotp :(
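A mix-up like this can be caught before any timing is done by comparing the results of the two kernels on small, integer-valued inputs. The following is only a minimal sketch of that idea: the names int_dotp, float_dotp, kernels_agree and check_before_timing are hypothetical and not part of the code posted in this thread, and the check assumes inputs small enough that the float result is exact.

```c
#include <assert.h>
#include <stddef.h>

/* Integer dot product, same shape as Test_int_dotp in this thread. */
int int_dotp(const int *a, const int *b, size_t n)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Float dot product, same shape as Test_float_dotp in this thread. */
float float_dotp(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Returns 1 when both kernels give the same value on small
 * integer-valued data (hypothetical test vectors), 0 otherwise. */
int kernels_agree(void)
{
    const int   ai[4] = {1, 2, 3, 4};
    const int   bi[4] = {5, 6, 7, 8};
    const float af[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float bf[4] = {5.0f, 6.0f, 7.0f, 8.0f};

    return (float)int_dotp(ai, bi, 4) == float_dotp(af, bf, 4);
}

/* Call once before starting the performance counters. */
void check_before_timing(void)
{
    assert(kernels_agree());
}
```

If the int benchmark had been wired to a different routine, a check of this kind would most likely have failed before any tick counts were compared.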
Sep 14, 2019
04:44 AM
Hi Shaquille,
No problem. If there's anything else I can help you with, just let me know.
Best regards,
Henk-Piet Glas
Principal Technical Specialist
Embedded Software
Dec 06, 2019
01:22 AM
Hello,
I just posted a new topic on the same subject.
Could you merge it with this thread?
Also, is there any drawback to using floats instead of ints for precision calculations?
Since the performance is almost the same, using floats directly instead of integers with an LSB and offset could probably save even more time.
Does the TC277 have a separate FPU that operates independently of the integer unit?
Can the two also operate in parallel?
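One precision aspect of the float-versus-int question can be illustrated independently of the target. The sketch below assumes IEEE 754 single precision, where a float has a 24-bit significand, so not every 32-bit integer survives a conversion to float. The scaling constants SIG_LSB and SIG_OFFSET and both function names are hypothetical and chosen only for illustration; nothing here is specific to the TC277.

```c
#include <stdint.h>

/* Hypothetical signal scaling: physical = raw * LSB + offset. */
#define SIG_LSB    0.5f
#define SIG_OFFSET (-40.0f)

/* Convert a raw integer value to its physical float value. */
float raw_to_physical(int32_t raw)
{
    return (float)raw * SIG_LSB + SIG_OFFSET;
}

/* Returns 1 if v is exactly representable as a float, 0 otherwise.
 * The volatile store forces the value to real single precision, and
 * the comparison is done in 64 bits so the conversion back cannot
 * overflow. */
int int_survives_float(int32_t v)
{
    volatile float f = (float)v;
    return (int64_t)f == (int64_t)v;
}
```

With these definitions, raw_to_physical(100) gives 10.0f, int_survives_float(16777216) returns 1, and int_survives_float(16777217) returns 0, because 2^24 + 1 needs 25 significant bits. An accumulator that grows past 2^24 therefore starts losing low-order bits in float, while a 32-bit integer stays exact until it overflows.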