Loop unrolling doesn't improve the processing time of radix-4 butterfly
I am running an emulation of an FFT using aocl 19.1.0.240 with the s10_gh1e1_4Dx2 board. When I compare the radix-4 butterfly with loop unrolling (#pragma unroll placed before the for loop) and without it, there is no difference in processing time. Can anyone help me with this? Thank you so much.
Replies:
Re: Loop unrolling doesn't improve the processing time of radix-4 butterfly
Hi, I am unsure what the cause might be. We do have a tutorial on using the unroll pragma below:
https://software.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top/fpga-optimization-flags-attributes-pragmas-and-extensions/loop-directives/unroll-pragma.html
https://github.com/intel/BaseKit-code-samples/blob/master/FPGATutorials/FPGAExtensions/LoopAttributes/loop_unroll/README.md
You could try the example, and if the problem persists, let me know.
- 2020-06-19
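For reference, here is a minimal sketch of the pattern being discussed: a radix-4 butterfly called from a loop with a fixed trip count and an unroll hint placed directly before the loop. It is written as plain C so it can be compiled on a host. The names cfloat, radix4_butterfly, fft4_batch, UNROLL_HINT and BATCH are hypothetical and are not taken from the original kernel, which was not posted. In an OpenCL kernel you would write #pragma unroll directly before the for loop instead of using the macro. Keep in mind that the emulator runs the kernel on the host CPU and is meant for functional checks, so timings measured in emulation may not show what unrolling does on the FPGA.

```c
#include <stddef.h>

/* Hypothetical complex type; the original kernel's types are unknown. */
typedef struct { float re, im; } cfloat;

/* Assumption: only clang is given the hint here. In an OpenCL kernel,
 * "#pragma unroll" would be written directly before the loop. */
#if defined(__clang__)
#define UNROLL_HINT _Pragma("unroll")
#else
#define UNROLL_HINT
#endif

#define BATCH 4 /* fixed trip count, so the loop can be fully unrolled */

/* One radix-4 butterfly without twiddle factors (a 4-point forward DFT):
 *   y0 = x0 +  x1 + x2 +  x3
 *   y1 = x0 - jx1 - x2 + jx3
 *   y2 = x0 -  x1 + x2 -  x3
 *   y3 = x0 + jx1 - x2 - jx3
 */
void radix4_butterfly(const cfloat x[4], cfloat y[4])
{
    cfloat s02 = { x[0].re + x[2].re, x[0].im + x[2].im };
    cfloat d02 = { x[0].re - x[2].re, x[0].im - x[2].im };
    cfloat s13 = { x[1].re + x[3].re, x[1].im + x[3].im };
    cfloat d13 = { x[1].re - x[3].re, x[1].im - x[3].im };

    y[0] = (cfloat){ s02.re + s13.re, s02.im + s13.im };
    y[2] = (cfloat){ s02.re - s13.re, s02.im - s13.im };
    /* -j * (a + jb) = b - ja */
    y[1] = (cfloat){ d02.re + d13.im, d02.im - d13.re };
    /* +j * (a + jb) = -b + ja */
    y[3] = (cfloat){ d02.re - d13.im, d02.im + d13.re };
}

/* Applies BATCH independent butterflies. The iterations have no
 * dependencies on each other, which is the case where unrolling can
 * replicate the butterfly logic. */
void fft4_batch(const cfloat in[BATCH * 4], cfloat out[BATCH * 4])
{
    UNROLL_HINT
    for (int b = 0; b < BATCH; ++b) {
        radix4_butterfly(&in[4 * b], &out[4 * b]);
    }
}
```

To compare the two variants on hardware, it is usually more informative to look at the compiler's loop and area reports for the unrolled and non-unrolled builds than at emulator run times.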