how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi support team,

I modified the code from the oneAPI samples (mul), and I want to use it on FPGA hardware. I am not sure how to change these three parallel_for calls to single_task. Could you give me some suggestions? The code is below:

#if FPGA_EMULATOR
  // DPC++ extension: FPGA emulator selector on systems without FPGA card.
  ext::intel::fpga_emulator_selector d_selector;
#elif FPGA
  // DPC++ extension: FPGA selector on systems with FPGA card.
  ext::intel::fpga_selector d_selector;
#else
  // The default device selector will select the most performant device.
  default_selector d_selector;
#endif

  try {
    queue q(d_selector, dpc_common::exception_handler);

    cout << "Device: " << q.get_device().get_info<info::device::name>() << "\n";

    // Create 2D buffers for matrices, buffer c is bound with host memory c_back
    buffer<float, 2> a_buf(range(M, N));
    buffer<float, 2> b_buf(range(N, P));
    buffer c_buf(reinterpret_cast<float*>(c_back), range(M, P));

    cout << "Problem size: c(" << M << "," << P << ") = a(" << M << "," << N
         << ") * b(" << N << "," << P << ")\n";

    // Using three command groups to illustrate execution order. The use of
    // first two command groups for initializing matrices is not the most
    // efficient way. It just demonstrates the implicit multiple command group
    // execution ordering.

    // Submit command group to queue to initialize matrix a
    //start the clock
    // dpc_common::TimeInterval kernel_runtime;
    dpc_common::TimeInterval kernel_e_a_runtime;
    auto e_a = q.submit([&](auto& h) {
      // Get write only access to the buffer on a device.
      accessor a(a_buf, h, write_only);

      // Execute kernel.
      h.parallel_for(range(M, N), [=](auto index) {
        // Each element of matrix a is 1.
        a[index] = 1.0f;
      });
    });
    double elapsed_e_a_time = kernel_e_a_runtime.Elapsed();

    dpc_common::TimeInterval kernel_e_b_runtime;
    // Submit command group to queue to initialize matrix b
    auto e_b = q.submit([&](auto& h) {
      // Get write only access to the buffer on a device
      accessor b(b_buf, h, write_only);

      // Execute kernel.
      h.parallel_for(range(N, P), [=](auto index) {
        // Each column of b is the sequence 1,2,...,N
        b[index] = index[0] + 1.0f;
      });
    });
    double elapsed_e_b_time = kernel_e_b_runtime.Elapsed();

    dpc_common::TimeInterval kernel_e_c_runtime;
    // Submit command group to queue to multiply matrices: c = a * b
    auto e_c = q.submit([&](auto& h) {
      // Read from a and b, write to c
      accessor a(a_buf, h, read_only);
      accessor b(b_buf, h, read_only);
      accessor c(c_buf, h, write_only);

      int width_a = a_buf.get_range()[1];

      // Execute kernel.
      h.parallel_for(range(M, P), [=](auto index) {
      // h.single_task<c_calc>([=]() [[intel::kernel_args_restrict]] {
      //  for (int i = 0; i < M; i++) {
      //#pragma unroll 1
      //    for (int j = 0; j < P; j++) {
        // Get global position in Y direction.
        int row = index[0];
        // int row = j;
        // Get global position in X direction.
        int col = index[1];
        // int col = i;
        float sum = 0.0f;

        // Compute the result of one element of c
        //#pragma unroll 1
        for (int i = 0; i < width_a; i++) {
          sum += a[row][i] * b[i][col];
        }

        c[index] = sum;
        //c[i][j] = sum;
      //    }
      //  }
      });
    });

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

We have not received a response from you to my previous reply. This thread will be transitioned to community support. If you have a new question, feel free to open a new thread to get support from Intel experts. Otherwise, community users will continue to help you on this thread.

Thank you.
Regards,
Aik Eu

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

I will close this thread if there are no further questions.

Thanks.
Regards,
Aik Eu

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

May I know if there is any follow-up to the previous comment?

Thanks.
Regards,
Aik Eu

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

Sorry for the late reply. I may not fully understand the code you are working with. Can you provide more description of the operation you are trying to implement?

I think you can split your task using a normal for loop instead of parallel_for. A brief example is below:

// Computes the product of two square matrices.
void matrix_multiply(double** m1, double** m2, double** result, size_t size)
{
    for (size_t i = 0; i < size; i++)
    {
        for (size_t j = 0; j < size; j++)
        {
            double temp = 0;
            for (size_t k = 0; k < size; k++)
            {
                temp += m1[i][k] * m2[k][j];
            }
            result[i][j] = temp;
        }
    }
}

// Computes the product of two square matrices in parallel.
void parallel_matrix_multiply(double** m1, double** m2, double** result, size_t size)
{
    parallel_for (size_t(0), size, [&](size_t i)
    {
        for (size_t j = 0; j < size; j++)
        {
            double temp = 0;
            for (size_t k = 0; k < size; k++)
            {
                temp += m1[i][k] * m2[k][j];
            }
            result[i][j] = temp;
        }
    });
}

This document might help you decide how to write your code:
https://www.colfax-intl.com/downloads/oneAPI_module04_DPCplusplusFundamentals2of2.pdf

Thanks.
Regards,
Aik Eu

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Aik Eu,

When I follow the linked tutorial to modify the code, I get the wrong result.
Could you show me how to set up the for loop in the single_task function in this case (h.parallel_for(range(M, P), [=](auto index) {)?

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

May I know if the link from the previous comment helped answer your question?

Thanks.
Regards,
Aik Eu

Replies:
Re: how to change h.parallel_for(range(M, P), [=](auto index) to single_task function

Hi Wei-Chih,

Please refer to the link below:
https://www.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top/optimize-your-design/throughput-1/single-work-item-kernels/single-work-item-kernel-design-guidelines.html

Thanks.
Regards,
Aik Eu

- 2022-08-02
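
The thread never shows the converted loop nest, so here is a minimal host-only sketch of what the body of a single_task version might look like. It is plain C++ rather than SYCL so that it can be compiled and checked anywhere; the sizes M, N, P, the flat row-major std::vector storage, and the function names init_matrices and multiply_single_task are assumptions made for illustration and do not come from the original sample. The key point is that the outer loop runs over rows (0..M) and stands in for index[0], while the second loop runs over columns (0..P) and stands in for index[1]. The commented-out attempt in the question appears to assign row = j and col = i, which swaps the two and may explain the wrong result; it also reuses i as the inner loop variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration only; the thread does not give M, N, P.
constexpr std::size_t M = 4;  // rows of a and c
constexpr std::size_t N = 3;  // columns of a, rows of b
constexpr std::size_t P = 5;  // columns of b and c

// Mirrors the first two kernels in the question:
// every element of a is 1, and each column of b is the sequence 1,2,...,N.
void init_matrices(std::vector<float>& a, std::vector<float>& b) {
  a.assign(M * N, 1.0f);
  b.resize(N * P);
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < P; ++j) {
      b[i * P + j] = static_cast<float>(i) + 1.0f;
    }
  }
}

// The loop nest that would sit inside h.single_task<c_calc>([=]() { ... }).
// row takes the place of index[0] and col takes the place of index[1].
// With SYCL accessors, a[row * N + k] would be written a[row][k], and so on.
void multiply_single_task(const std::vector<float>& a,
                          const std::vector<float>& b,
                          std::vector<float>& c) {
  assert(a.size() == M * N && b.size() == N * P);
  c.assign(M * P, 0.0f);
  for (std::size_t row = 0; row < M; ++row) {
    for (std::size_t col = 0; col < P; ++col) {
      float sum = 0.0f;
      // Inner product of one row of a with one column of b.
      // Uses its own variable k so it does not shadow the outer loops.
      for (std::size_t k = 0; k < N; ++k) {
        sum += a[row * N + k] * b[k * P + col];
      }
      c[row * P + col] = sum;
    }
  }
}
```

With this initialization every element of c should equal 1 + 2 + ... + N, which gives a quick way to check a converted kernel against the parallel_for version.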
