Why do I get bad performance when compiling vector add example design with Intel® FPGA SDK for OpenCL™?
Description

Due to a problem in the Intel® FPGA SDK for OpenCL™ version 18.1 and later, the vector_add example design may show poor performance, even though the same code performs well when compiled with earlier versions. The performance by version is as follows.

Intel® FPGA SDK for OpenCL™ version    Performance
V16.1                                  ~3 ms
V18.0                                  ~3 ms
V18.1                                  ~170 ms
V19.1                                  ~170 ms

Resolution

To work around this problem, add an attribute to the kernel in vector_add.cl that sets the required work-group size to (1, 1, 1).

    __attribute__((reqd_work_group_size(1, 1, 1)))
    __kernel void vector_add(__global const float *x,
                             __global const float *y,
                             __global float *restrict z)
    {
        // get index of the work item
        int index = get_global_id(0);

        // add the vector elements
        z[index] = x[index] + y[index];
    }

The problem is scheduled to be fixed in a future release of the Intel® FPGA SDK for OpenCL™.
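After applying the workaround, it can be useful to confirm that the kernel still produces correct results. The following is a minimal sketch in plain C of a CPU-side reference check. It is not part of the Intel example design. The function names (vector_add_work_item, run_vector_add_reference, count_mismatches) are hypothetical and chosen only for illustration. The sketch mirrors a one-dimensional launch with a work-group size of 1, so the number of work-groups equals the global size.

```c
#include <stddef.h>

/* Hypothetical CPU stand-in for one work item of the vector_add kernel.
 * `index` plays the role of get_global_id(0). */
static void vector_add_work_item(const float *x, const float *y,
                                 float *z, size_t index)
{
    z[index] = x[index] + y[index];
}

/* Emulates a 1-D launch with a work-group size of 1 (assumption: this
 * mirrors reqd_work_group_size(1, 1, 1)). Each work-group holds exactly
 * one work item. Returns the number of work-groups that were "launched". */
static size_t run_vector_add_reference(const float *x, const float *y,
                                       float *z, size_t global_size)
{
    const size_t local_size = 1;
    const size_t num_groups = global_size / local_size;

    for (size_t group = 0; group < num_groups; ++group) {
        for (size_t local = 0; local < local_size; ++local) {
            vector_add_work_item(x, y, z, group * local_size + local);
        }
    }
    return num_groups;
}

/* Counts elements where the device output differs from the reference
 * by more than `tol`. A result of 0 means the outputs agree. */
static size_t count_mismatches(const float *device_z, const float *ref_z,
                               size_t n, float tol)
{
    size_t mismatches = 0;

    for (size_t i = 0; i < n; ++i) {
        float diff = device_z[i] - ref_z[i];
        if (diff < 0.0f) {
            diff = -diff;
        }
        if (diff > tol) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

In a host program, run_vector_add_reference would fill a reference buffer from the same inputs passed to the device, and count_mismatches would compare that buffer with the output read back from the FPGA.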
Custom Fields values:
Troubleshooting
1507240683, 1807446135
FPGA Dev Tools Quartus® Prime Software Pro
18.1
Arria® 10 FPGAs and SoCs, Stratix® 10 FPGAs and SoCs
HLD Tools OpenCL
2021-08-25