I used the UART example to send 64 bytes, once by the direct method (polling) and once through the FIFO. I wondered why both took practically the same time (149413 vs. 149368 cycles @ 144 MHz on the XMC4800), although the FIFO variant should return much sooner (using a FIFO buffer of 64 elements). The send method basically performs:

Code:
void
UART_lStartTransmitPolling(
		uint8_t		* data_ptr,
		uint32_t	count) {
		
	/* flush FIFO contents */
	XMC_UART0_CH0->TRBSCR = USIC_CH_TRBSCR_FLUSHTB_Msk;
	do {
		/* wait for TX FIFO to become partially empty */
		while (XMC_UART0_CH0->TRBSR & (1<<USIC_CH_TRBSR_TFULL_Pos)) {
		}
		/* put next byte into FIFO */
		XMC_UART0_CH0->IN[0] = *data_ptr++;
	} while (--count);
	/* wait for FIFO to become empty (= wait for full data sent) */
	while (!(XMC_UART0_CH0->TRBSR & USIC_CH_TRBSR_TEMPTY_Msk)) {
	}
}
The first statement (FIFO flush) and the last one (wait for FIFO empty) cancel out the benefit of the FIFO. Why would you busy-wait until the FIFO has transferred all of its data? And if you do, why flush the FIFO before the next transfer starts? Because of the final wait, there cannot be any data left in the FIFO when UART_lStartTransmitPolling() is entered. Used that way, it makes no difference whether you use the FIFO or not. Better:

Code:
void
UART_lStartTransmitPolling(
		uint8_t		* data_ptr,
		uint32_t	count) {
		
	/* note: count must be > 0, the do/while always writes at least one byte */
	do {
		/* wait for TX FIFO to become partially empty */
		while (XMC_UART0_CH0->TRBSR & (1<<USIC_CH_TRBSR_TFULL_Pos)) {
		}
		/* put next byte into FIFO */
		XMC_UART0_CH0->IN[0] = *data_ptr++;
	} while (--count);
}
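To illustrate why the final empty-wait dominates the measurement, here is a minimal host-side sketch. It is a toy model, not the USIC hardware: tx_model_t, tick() and model_send() are hypothetical names, and the 2300 cycles per byte used below is only an illustrative figure (roughly the measured 149413 cycles divided by 64 bytes). It assumes a 64-entry TX FIFO that drains one entry per cycles_per_byte CPU cycles.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 64u  /* assumption: TX FIFO size of 64 entries, as in the post */

/* Hypothetical model of a TX FIFO plus shifter (not the real USIC). */
typedef struct {
	uint32_t fill;            /* entries currently queued */
	uint32_t cycles_per_byte; /* CPU cycles needed to shift out one frame */
	uint32_t phase;           /* cycles already spent on the frame in flight */
	uint64_t now;             /* elapsed CPU cycles */
} tx_model_t;

/* Advance the model by one CPU cycle. */
static void tick(tx_model_t *m) {
	m->now++;
	if (m->fill == 0u) {
		m->phase = 0u;
		return;
	}
	if (++m->phase >= m->cycles_per_byte) {
		m->phase = 0u;
		m->fill--;            /* one frame has left the FIFO */
	}
}

/* Return the cycles spent inside the send routine.
 * wait_for_empty != 0 models the original code (final TEMPTY wait),
 * wait_for_empty == 0 models the shortened version. */
uint64_t model_send(uint32_t count, uint32_t cycles_per_byte, int wait_for_empty) {
	tx_model_t m = {0u, cycles_per_byte, 0u, 0u};

	assert(cycles_per_byte > 0u);
	while (count-- > 0u) {
		while (m.fill == FIFO_DEPTH) {   /* corresponds to the TFULL wait */
			tick(&m);
		}
		m.fill++;                        /* corresponds to the write to IN[0] */
		tick(&m);                        /* charge one cycle for the store */
	}
	if (wait_for_empty) {
		while (m.fill != 0u) {           /* corresponds to the TEMPTY wait */
			tick(&m);
		}
	}
	return m.now;
}
```

In this model, model_send(64, 2300, 1) takes the full wire time of all 64 frames, while model_send(64, 2300, 0) returns after 64 cycles because every byte fits into the FIFO. With more than 64 bytes the shortened version also has to wait in the TFULL loop, so the gain shrinks accordingly.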
Best regards,
Ernie T.