EtherCAT + ModbusTCP

Not applicable
Hi,

I'm working on an XMC4800 project that requires EtherCAT comms and ModbusTCP comms.
I've got both working individually, based on the respective example DAVE projects. The project
specification is to perform reads with each at 1kHz, or as near to that as can be achieved.

For each, comms is running nice and fast. The EtherCAT project is reading at 1kHz. For the ModbusTCP project,
I can use the following modpoll command line, with the minimum specifiable poll interval of 0 ms
(option -l 0), to read 4 registers at addresses 1-4:

modpoll -m tcp -r 1 -c 4 -t 3 -l 0 192.168.0.10

I achieve a poll rate, measured using an incrementing counter and stopwatch, of around 990Hz.
If I raise the specified poll interval to 1 ms (option -l 1), then I achieve a measured poll rate of around 710Hz.
I can do anything I like on my PC, and there are no upsets to the ModbusTCP comms.
So far, so good.

Now, I try 'merging' the two projects, so I have EtherCAT and ModbusTCP, within the same project.
The EtherCAT still works fast. But unfortunately, the ModbusTCP becomes flaky:

After a few seconds running, ModPoll stops with a 'TCP/IP connection was closed by remote peer!' error.
Or it starts reporting continual 'Invalid MPAB indentifer!' errors; ModPoll can then be stopped with Ctrl-C as usual.
Modpoll can be restarted, but only runs for a few seconds, before another error.

In fact, I could only get ModbusTCP to run for any length of time by specifying a ModPoll poll interval of 100 ms
(option -l 100). Although if I wait long enough, it also stops.

I have three LAN adaptors on my PC, one each for EtherCAT, ModbusTCP, and corporate LAN. So each has
a dedicated LAN adaptor. One adaptor is onboard, the other two are USB-to-LAN devices.

After many hours of comparing all elements of my 'merged' project, with my ModbusTCP-only project,
including comparing the APP properties, source code using WinMerge, etc, etc, and finding no
differences at all in the ModbusTCP implementation, I had a thought:

Try stripping out from the 'merged' project, the ECAT_SSC APP, the subsidiary clock-sync APPs,
and the EtherCAT callback code. And lo-and-behold, ModbusTCP robustness and performance, was restored.
So, seems something about having EtherCAT infrastructure present, is causing ModbusTCP flakiness.

Within the pre-strip copy of the 'merged' project, I notice that the interrupt handlers for the EtherCAT and ModbusTCP
APPS, namely ECAT_SSC_0 and ETH_LWIP_0, both have the maximum DAVE interrupt priority, of 63. I thought,
why not try lowering the former, to 62. But that had no effect, on the ModbusTCP flakiness.

I also have other priority 63 interrupt handlers in the project, so I tried lowering those also to 62. But likewise,
no effect on the ModbusTCP flakiness.

In fact, I electrically disconnected all external sources of interrupts other than ModbusTCP. Even the EtherCAT LAN cable.
But still, the ModbusTCP flakiness persisted. So, seems just the presence of EtherCAT infrastructure within the project,
is resulting in the ModbusTCP flakiness.

Lastly, I tried commenting within the main loop, everything except for ModbusTCP polling. I'm running RTOS-less,
per project specification. So I have a sys_check_timeouts call, in the main loop, as well as the ModbusTCP poll call.
Also an LED toggle call, that shows me that nothing is hanging within the main loop, causing the flakiness. So I'm at
a bit of a loss, if it's not main loop hanging, or interrupts, that are causing the ModbusTCP flakiness.

eMBPoll();
sys_check_timeouts();
XMC_GPIO_ToggleOutput(P_LED1);

Any ideas!?

Best regards,

David
Not applicable
Hi,

An update: I've now tried going with RTOS. I started with the example DAVE project for EtherCAT comms, namely ETHCAT_SSC_XMC48.
And again, got that working fine, serving up data fast, to the TwinCAT EtherCAT client.

Then, I manually 'layered' into the ETHCAT_SSC_XMC48 project, elements from the example DAVE project for ModbusTCP comms, namely MODBUS_TCP_MODE_XMC47.

By layered, I mean manually implementing the ETH_LWIP_0 APP, checking all the Properties are set the same, including the checked 'Enable RTOS' box. This caused the
CMSIS_RTOS_0 and CMSIS_RTOS_RTX_0 APPs to be created. I checked their Properties were set the same, too. For all three APPs, and those they connect to, I checked the
connections were visually the same. I also checked HW Signal Connections, Manual Pin Allocator, Manual Resource Assignment, were set the same.

I then copied over the Libraries\freemodbus-v1.5.0 folder, and set the include directories (ARM-GCC C Compiler > Directories) the same, in the same order and positions:

"${workspace_loc:/${ProjName}/Libraries/freemodbus-v1.5.0/modbus/tcp}"
"${workspace_loc:/${ProjName}/Libraries/freemodbus-v1.5.0/modbus/rtu}"
"${workspace_loc:/${ProjName}/Libraries/freemodbus-v1.5.0/port}"
"${workspace_loc:/${ProjName}/Dave/Generated/ETH_LWIP}"
"${workspace_loc:/${ProjName}/Libraries/freemodbus-v1.5.0/modbus/include}"

I found that ModbusTCP, now running on a thread, was just as flaky as running RTOS-less. Also, EtherCAT stopped working, which I found was due to its polling function, MainLoop, no longer being called from within the main.c loop, as below. I found this is due to the osKernelStart() call never returning (is this 'normal'?), despite the code below suggesting that execution continues after the call.

/*Thread for Modbus Polling*/
mbtcp_thread_id = osThreadCreate(osThread(mbtcp_thread), NULL);
/* Starting the Kernel for Scheduling*/
if(osKernelStart() == osOK)
{
  /* Placeholder for error handler code.*/
  XMC_DEBUG(("OS initialization failed\n"));
  while(1U)
  {
    /* do nothing */
  }
}

/* Placeholder for user application code. The while loop below can be replaced with user application code. */
while(1U)
{
  MainLoop();
}

Moving the MainLoop() call to the mbtcp_thread continuous loop made EtherCAT work again, but of course this is rather pointless, as I now have an RTOS running a single thread, so it might as well be RTOS-less.

Next, I tried creating a dedicated thread for EtherCAT. Via LED toggling, and incrementing counters added to each thread and transferred to TwinCAT so I could observe, I can see both threads are executing. However, adding this EtherCAT thread, causes ModbusTCP to stop working completely (!). By which I mean, ping 192.168.0.10 no longer worked.

On a scope, I checked that for each thread, I see 50ms of LED toggling, then 50ms without, in accordance with the following default CMSIS_RTOS_RTX_0 settings: RTX Kernel Timer Tick Settings > RTX timer tick interval value [us] of 10000, and System Settings > Round-Robin timeout [ticks] of 5, i.e. 10000us x 5 = 50000us = 50ms. I also tried reducing the ticks setting from 5 to 1, and indeed I could see on the scope 5x more rapid switching between the two tasks. But still no ping response. I also tried adjusting the priority for each thread, from the osPriorityBelowNormal default to various settings above and below, and could see that this rebalanced the CPU time accorded to each thread. But even with the ModbusTCP thread getting the majority of the time, still no ping.
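
As a sanity check on that arithmetic, here is a minimal sketch. rr_slice_us is a hypothetical helper name, not part of RTX or DAVE; it only multiplies the two settings quoted above and assumes the slice is simply tick interval times round-robin timeout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an RTX or DAVE function):
 * round-robin time slice [us] = timer tick interval [us] x round-robin timeout [ticks]. */
uint32_t rr_slice_us(uint32_t tick_interval_us, uint32_t rr_timeout_ticks)
{
    return tick_interval_us * rr_timeout_ticks;
}
```

With the defaults (10000, 5) this gives 50000us, matching the 50ms seen on the scope; with the timeout reduced to 1 tick it gives 10000us.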

I also tried lowering the ECAT_SSC_0 interrupt priority, from maximum 63, to 62, but still no ping.

Any ideas?

Best regards,

David King
Not applicable
Hi. OK, after a lot of experimenting, I've narrowed down the problem to the INTERRUPT_0 APP of the example DAVE project for EtherCAT comms, namely ETHCAT_SSC_XMC48. Within the APP properties, unticking the 'Enable interrupt at initialisation' checkbox resolves the flaky ModbusTCP comms. EtherCAT continues to work fast. So first success!

Within the APP properties, the interrupt handler shows as 'ecat_ssc_timer_handler', greyed. Searching indicates the following, within xmc_eschw.c:

/* ECAT slave timer interrupt handler function */
void ecat_ssc_timer_handler(void)
{
  ticks++;
  ECAT_CheckTimer();
}

Searching ECAT_CheckTimer() finds that function within ecatappl.c. The comment text indicates it must be called every 1ms. Running for say 10s, then adding breakpoint on above ticks increment, indicates 'ecat_ssc_timer_handler()' is indeed being called every 1ms.

Rechecking the mentioned checkbox, but commenting the two statements as above within 'ecat_ssc_timer_handler()', re-instates flaky ModbusTCP. So the mere act of an interrupt occurring every 1ms, is sufficient to cause flaky ModbusTCP comms. Reducing the INTERRUPT_0 preemption priority, from default 30, to 29/1/0, has no effect.

The question then, is what within the example DAVE project for ModbusTCP comms, MODBUS_TCP_MODE_XMC47, is unable to tolerate interruption, every 1ms? And what can be changed within the project, to tolerate? (I've tried critical-section protecting tcp_write() and/or tcp_output(), within xMBTCPPortSendResponse() in porttcp.c, but that didn't work).

Meantime, I saw that ECAT_CheckTimer() comment text says 'If the switch ECAT_TIMER_INT is 0, the watchdog control is implemented without using interrupts. In this case a local timer register is checked every ECAT_Main cycle and the function is triggered if 1 ms is elapsed'. Unfortunately, ECAT_TIMER_INT is not acted on anywhere.

However, this comment text would suggest ECAT_CheckTimer() could be called instead within my main loop, say after the MainLoop() call for servicing EtherCAT. So, I tried this, together with unchecking the mentioned checkbox. And indeed I obtained fast ModbusTCP+EtherCAT, as was the case for the mentioned first success. ticks is not used anywhere, so I dropped the ticks++.

A counter shows my main loop period, with both ModbusTCP and EtherCAT running, is around 12us. So, much quicker than 1ms period which the ECAT_CheckTimer() comment text indicates is required. Indeed, the ECAT_CheckTimer() code would suggest it needs to be run with that period. So, I changed to call ECAT_CheckTimer() on SYSTIMER_GetTickCount() change (SysTick period is set to default 1000us), and I continue to obtain fast ModbusTCP+EtherCAT.
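
In outline, the tick-gated call looks roughly like the sketch below. It is a host-side sketch under stated assumptions, not the actual project code: stub_get_tick_count() stands in for SYSTIMER_GetTickCount(), stub_ecat_check_timer() stands in for ECAT_CheckTimer(), and ecat_timer_service(), fake_tick_count and check_timer_calls are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins so the sketch compiles off-target (hypothetical names).
 * On the XMC4800 these would be SYSTIMER_GetTickCount() and ECAT_CheckTimer(). */
uint32_t fake_tick_count;     /* advanced by the caller to simulate SysTick */
uint32_t check_timer_calls;   /* counts how often the 1 ms service ran */

uint32_t stub_get_tick_count(void) { return fake_tick_count; }
void stub_ecat_check_timer(void)   { check_timer_calls++; }

/* Called on every pass of the main loop. The 1 ms service runs only when
 * the tick count has changed since the previous pass. */
void ecat_timer_service(void)
{
    static uint32_t last_tick;
    uint32_t now = stub_get_tick_count();

    if (now != last_tick)
    {
        last_tick = now;
        stub_ecat_check_timer();
    }
}
```

With a main loop period of around 12us, ecat_timer_service() runs many times per tick but only triggers the service once per tick change. If the loop ever took longer than one tick, this simple form would skip calls rather than catch up.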

DAVE Forum readers: Is this sort of battle with integrating DAVE APPS and DAVE example projects the norm in your experience? I've spent approximately 40 hours on this one, trying many things. If I can expect similar battles ahead, I'm hoping the time needed will reduce as I build experience with DAVE.

If the author of the example DAVE project for ModbusTCP comms, MODBUS_TCP_MODE_XMC47, is reading, and has any insight on the inability to tolerate a 1ms period interrupt, I would appreciate it. Or indeed, the authors of the FreeModbus and LwIP components of that project, if the issue is within those components.

An ongoing concern is whether less frequent interrupts from other sources, could likewise cause ModbusTCP comms flakiness.

Best regards,

David
Not applicable
Hi

OK, I now have EtherCAT and ModbusTCP working concurrently, for several hours running at least. I left it running overnight, but this morning, I found ModbusTCP had stopped responding, and modpoll running on the PC wouldn't restart comms. Debugging showed that I am seeing continual calls to IRQ_Hdlr_108() (ETH0_0_IRQHandler) in ethernetif.c. And this applies whether or not modpoll is running. In fact, my main loop only single-steps for about 10 statements before another call to IRQ_Hdlr_108() occurs. My question is whether this can be a 'normal' operation case for the DAVE LWIP APP? Any insights?

Looking in a little detail, the handler is as below. This is within Dave \ Generated \ ETH_LWIP \ port \ netif \ ethernetif.c. As indeed, is all the code below.

void IRQ_Hdlr_108(void)
{
  XMC_ETH_MAC_ClearEventStatus(&eth_mac, XMC_ETH_MAC_EVENT_RECEIVE);
  ethernetif_input(&xnetif);
}

The called ethernetif_input() function, is as below. I see that low_level_input() is always returning 0. And so the remaining ethernetif_input() content, is skipped (I've replaced with : to minimise this post length):

/**
 * This function should be called when a packet is ready to be read
 * from the interface. It uses the function low_level_input() that
 * should handle the actual reception of bytes from the network
 * interface. Then the type of the received packet is determined and
 * the appropriate input function is called.
 *
 * @param netif the lwip network interface structure for this ethernetif
 */
static void ethernetif_input(void *arg)
{
  struct pbuf *p = NULL;
  struct eth_hdr *ethhdr;
  struct netif *netif = (struct netif *)arg;

  p = low_level_input();

  while (p != NULL)
  {
    ethhdr = p->payload;
    :
    p = low_level_input();
  }
}

low_level_input() is as below. I see that XMC_ETH_MAC_IsRxDescriptorOwnedByDma(&eth_mac), is always returning true. And so the remaining low_level_input() content, is likewise skipped (I've likewise replaced with : to minimise this post length):

/**
 * Should allocate a pbuf and transfer the bytes of the incoming
 * packet from the interface into the pbuf.
 *
 * @param netif the lwip network interface structure for this ethernetif
 * @return a pbuf filled with the received packet (including MAC header)
 * NULL on memory error
 */
static struct pbuf * low_level_input(void)
{
  struct pbuf *p = NULL;
  struct pbuf *q;
  uint32_t len;
  uint8_t *buffer;

  if (XMC_ETH_MAC_IsRxDescriptorOwnedByDma(&eth_mac) == false)
  {
    len = XMC_ETH_MAC_GetRxFrameSize(&eth_mac);
    :
  }
  return p;
}
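
To make the control flow above easier to follow, here is a small host-side model of it. It is only a sketch under assumptions: rx_desc_t, rx_ring, model_low_level_input() and the other model_ names are hypothetical, the descriptor is reduced to an ownership flag and a length, and none of it is the XMC library or the generated ethernetif.c code. It shows that when the current descriptor is still owned by the DMA, the input function returns nothing and the drain loop in the handler exits at once, which is the case observed in the debugger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_RING_LEN 4u

/* Hypothetical, simplified RX descriptor: ownership flag and frame length only. */
typedef struct
{
    bool     owned_by_dma;  /* true: DMA may still fill it; false: frame ready for the CPU */
    uint32_t frame_len;
} rx_desc_t;

rx_desc_t rx_ring[RX_RING_LEN];
size_t    rx_index;

/* Put the model into the idle state: every descriptor handed to the DMA. */
void model_ring_reset(void)
{
    for (size_t i = 0u; i < RX_RING_LEN; i++)
    {
        rx_ring[i].owned_by_dma = true;
        rx_ring[i].frame_len = 0u;
    }
    rx_index = 0u;
}

/* Same shape as low_level_input(): returns the frame length, or 0 when the
 * current descriptor is still owned by the DMA (nothing to read). */
uint32_t model_low_level_input(void)
{
    rx_desc_t *d = &rx_ring[rx_index];
    uint32_t len;

    if (d->owned_by_dma)
    {
        return 0u;
    }
    len = d->frame_len;
    d->owned_by_dma = true;                     /* hand the descriptor back to the DMA */
    rx_index = (rx_index + 1u) % RX_RING_LEN;   /* advance to the next descriptor */
    return len;
}

/* Same shape as ethernetif_input(): drain frames until none is ready.
 * Returns the number of frames consumed in this call. */
uint32_t model_ethernetif_input(void)
{
    uint32_t frames = 0u;

    while (model_low_level_input() != 0u)
    {
        frames++;
    }
    return frames;
}
```

In this model, a handler call with every descriptor DMA-owned consumes zero frames, so repeated interrupts in that state do no useful work. The model does not explain why the interrupt keeps being raised.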

Restarting my software allows modpoll to restart comms. EtherCAT continued to work throughout.

Perhaps the hardware signal that causes IRQ_Hdlr_108() (ETH0_0_IRQHandler) got stuck on permanently. Some intermittent problem with the XMC4800 Relax Ethernet PHY (KSZ8081RNA)? To mention, presently all LWIP APP settings are at defaults, i.e. per the DAVE example project for Modbus TCP comms, with one exception: I have autonegotiation disabled, per suggestion in other posts. This is so there isn't a 2s outage in ModbusTCP service, 1s after software start.

Best regards,

David King
User13086
Hi David,

with my LWIP/TCP project (raw API, no RTOS) I had similar problems.

After some hours of working fine, sporadic TCP errors occurred.

It turned out that if the Ethernet interrupt preempts LWIP code running in the main loop,
then LWIP data can be corrupted.

The solution was to protect all LWIP related code in the main loop by
disabling the Ethernet IRQ.

e.g. the call to sys_check_timeouts():

NVIC_DisableIRQ(ETH0_0_IRQn);
sys_check_timeouts(); // --- LWIP call ---
NVIC_EnableIRQ(ETH0_0_IRQn);
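
The same wrap can be put into one small helper so every LWIP-related call in the main loop goes through it. This is a host-side sketch under assumptions: run_with_eth_irq_masked() and the stub_ names are hypothetical, and the two stubs stand in for NVIC_DisableIRQ(ETH0_0_IRQn) and NVIC_EnableIRQ(ETH0_0_IRQn) so that it compiles off-target.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side stand-ins (hypothetical). On target, replace the two stubs with
 * NVIC_DisableIRQ(ETH0_0_IRQn) and NVIC_EnableIRQ(ETH0_0_IRQn). */
bool eth_irq_enabled = true;
void stub_eth_irq_disable(void) { eth_irq_enabled = false; }
void stub_eth_irq_enable(void)  { eth_irq_enabled = true; }

typedef void (*lwip_call_t)(void);

/* Run one LWIP-related call with the Ethernet IRQ masked, then unmask it. */
void run_with_eth_irq_masked(lwip_call_t call)
{
    stub_eth_irq_disable();
    call();                 /* e.g. sys_check_timeouts on target */
    stub_eth_irq_enable();
}

/* Hypothetical stand-in for an LWIP call: records whether the IRQ was masked
 * while it ran. */
bool irq_was_masked_during_call;
void stub_lwip_call(void) { irq_was_masked_during_call = !eth_irq_enabled; }
```

On target the main loop line would then read run_with_eth_irq_masked(sys_check_timeouts). The helper unconditionally re-enables the IRQ, so it assumes the IRQ was enabled on entry and that calls are not nested.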

Best regards,
Hans
Not applicable
Hi Hans,

Ah, thanks for that, interesting to hear you had the same or a similar issue, and that you resolved it.

During my overnight run last night, the same occurred. I single-stepped the code this morning, and it's in the same state as described in my previous post. Restarting resolved in the same way too.

What I don't know is how many hours the code ran for before the incident happened. For tonight, I'll add a file write line to my modpoll TCP batch file, after modpoll exit; the file timestamp will show the incident time.

About your Ethernet disable-interrupt-wrap solution, did you also place it around the eMBPoll() call? That of course can call off to various LWIP functions, comprising tcp_recved/sndbuf/write/output and pbuf_free. If so, I'd perhaps be a bit concerned about disabling Ethernet interrupts for too long, and it would be good to just wrap the specific offending call or calls.

Meantime, if the author of the example DAVE project for ModbusTCP comms, MODBUS_TCP_MODE_XMC47, is reading, and has any insight on this LWIP data corruption, I would appreciate it. Or indeed, the authors of the FreeModbus and LwIP components of that project, if the issue is within those components.

Best regards,

David
User13086
Hi David,

in my LWIP/TCP project (raw API, no RTOS) I don't use "FreeMODBUS".

But I also have functions similar to eMBPoll().
These functions are wrapped with disable/enable of the Ethernet IRQ.


But you can also work without Ethernet interrupts:

Then you have to check "Poll for received data" in APP Dependency : ETH_LWIP_0 : Network Interface

Your main loop then looks like:

ETH_LWIP_Poll(); // poll for ETH packet reception
eMBPoll();
sys_check_timeouts();
XMC_GPIO_ToggleOutput(P_LED1);


Best regards,
Hans