Keil C compiler calculation efficiency for XC2000

Caar
Employee
Hello,

I have a question about the efficiency of the Keil C compiler (V7.00e) when compiling for maximum speed of multiplication, division and addition operations. I need to ensure the compiled code for these mathematical operations is efficient and fast.

Does anyone know if there is documentation available that explains how to optimize the Keil compiler for fast multiply and divide operations? Are there any examples of how Keil compiles different mathematical operations for the XC2000, and of the correct syntax to use?

An example of the type of calculation required is given below:

(16bit result) = ((16bit input) * 123) / 4567

Can Keil take advantage of the MAC unit in the XC2000 core to perform the 16bit x 16bit multiplication above?


Thanks,
Chris
2 Replies
Markus_Kroh
Employee
Dear Chris66,

Please note that it may take some time until you get a final answer, because the person responsible is not available today.

I will give you an update by Monday at the latest.

Thanks for your understanding.

Markus
Markus_Kroh
Employee
Dear Chris,

The C166 compiler uses the complete XC2000/XE16x instruction set to achieve the best code density and the fastest execution time. You can see this in the attached example project, which shows the following:
All basic 16 bit (integer) operations such as multiplication, division, addition and subtraction are done with inline MUL, DIV, ADD and SUB instructions without library calls. This provides the best performance.
32 bit (long) multiplications and divisions are done with library calls because the CPU does not provide suitable instructions.
32 bit (long) additions and subtractions are done with inline ADD/SUB instructions without library calls.

However, some expressions that would normally require library calls are compiled to inline code, thanks to special optimizations for the C166 instruction set. These expressions are:

(32 bit result) = (16 bit input1) * (16 bit input2): generates a single MULU instruction

(16 bit result) = (32 bit input1) / (16 bit input2): generates a single DIVLU instruction
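
In C source, these two patterns are usually written by widening one operand before the multiply and narrowing the quotient after the divide. The following is a minimal sketch in portable C, not taken from the attached project. The function names mul16x16 and div32by16 are hypothetical, the stdint.h types stand in for the 16 bit and 32 bit types of the target, and whether a particular compiler version emits a single MULU or DIVLU for exactly this form is an assumption that should be checked in the listing file.

```c
#include <stdint.h>

/* 16 bit x 16 bit -> 32 bit: widen ONE operand before multiplying so
   the product is computed in 32 bits and is not truncated to 16 bits. */
uint32_t mul16x16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* 32 bit / 16 bit -> 16 bit: divide first, then narrow the quotient.
   The caller must make sure d is not zero and the quotient fits in 16 bits. */
uint16_t div32by16(uint32_t n, uint16_t d)
{
    return (uint16_t)(n / d);
}
```

Casting the finished product instead, as in (uint32_t)(a * b), would not help on a target with a 16 bit int, because the multiplication would already have been carried out in 16 bits.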

The example expression is translated to this code:
(16bit result) = ((16bit input) * 123) / 4567

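0000 F2FB1C00 R   MOV   R11,uiX       ; R11 = input
0004 E6F4EA00     MOV   R4,#07BH      ; R4 = 07BH (123)
0008 1BB4         MULU  R11,R4        ; MD = R11 * R4
000A E6F4D711     MOV   R4,#011D7H    ; R4 = 011D7H (4567)
000E 5B44         DIVU  R4            ; MDL = MDL / R4 (16 bit unsigned divide)
0010 F6071000 R   MOV   uiZ,MDL       ; result = MDL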

This generates 20 bytes of code.
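
One point worth checking against this listing: DIVU divides only the 16 bit MDL register, so the high word of the product is dropped before the division. That matches C semantics when int is 16 bits wide, but it means the result is exact only while input * 123 fits in 16 bits (input up to 532). The sketch below contrasts the two behaviours in portable C. The names scale16 and scale32 are hypothetical, the stdint.h types stand in for the target's unsigned int and unsigned long, and it is an assumption (not confirmed in this thread) that the widened form is compiled to the MULU plus DIVLU pair described above.

```c
#include <stdint.h>

/* Mirrors the listing: the product is truncated to 16 bits before the
   division, as happens when the whole expression is evaluated in a
   16 bit unsigned int. */
uint16_t scale16(uint16_t x)
{
    uint16_t product = (uint16_t)(x * 123u);   /* keep the low 16 bits only */
    return (uint16_t)(product / 4567u);
}

/* Widened form: the product is kept in 32 bits, so no bits are lost.
   The quotient always fits in 16 bits because 65535 * 123 / 4567 = 1765. */
uint16_t scale32(uint16_t x)
{
    uint32_t product = (uint32_t)x * 123u;     /* full 32 bit product */
    return (uint16_t)(product / 4567u);
}
```

For an input of 1000 the two functions differ: the 16 bit version works on 123000 mod 65536 = 57464 and returns 12, while the widened version returns 26.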

The MAC unit can also be enabled with #pragma MAC. However, the MAC instructions offer only limited benefits for standard C expressions. You get the best performance from the MAC unit by using inline assembler or intrinsic functions. Please see the following application notes and knowledgebase articles for more details.
http://www.keil.com/support/docs/687.htm
http://www.keil.com/appnotes/docs/apnt_140.asp
http://www.keil.com/appnotes/docs/apnt_172.asp
http://www.keil.com/support/man/docs/c166/c166_mac.htm
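
For orientation, the kind of code a multiply-accumulate unit is aimed at is a sum of products, as in the sketch below. This is plain portable C written for illustration only: the function name dot16 is hypothetical, no Keil intrinsic is used, and whether #pragma MAC causes the compiler to map such a loop onto MAC instructions is an assumption that the linked documents, not this sketch, would have to confirm.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum of products of two 16 bit vectors, accumulated in 32 bits.
   Each step is one multiply followed by one add into the accumulator.
   The caller must keep the running sum within the int32_t range. */
int32_t dot16(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        acc += (int32_t)a[i] * b[i];   /* widen before multiplying */
    }
    return acc;
}
```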

Best regards

Markus