科技回声

5 comments

phkahler, 6 months ago

OpenMP is one of the easiest ways to make existing code run across CPU cores. In the simplest cases you add a single #pragma to C code and it runs about N times faster. This works when you're running a function in a loop with no side effects. Some examples I've done:

1) Ray tracing. Looping over all the pixels in an image, using ray tracing to determine the color of each pixel. The algorithm and data structures are complex but don't change during rendering. N cores is about N times as fast.

2) In Solvespace we had a small loop that calls a tessellation function on a bunch of NURBS surfaces. The function was appending triangles to a list, so I made a thread-local list for each call and combined them afterwards to avoid writes to a shared data structure. Again N times faster with very little effort.

The code also builds fine single-threaded, without changes, if you don't have OpenMP. Your compiler will just ignore the #pragmas.
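The two patterns described above can be sketched roughly as follows. This is not Solvespace's code or any real ray tracer: `shade`, `IntList`, `list_push`, `render` and `tessellate_all` are hypothetical names made up for illustration, and ints stand in for triangles.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expensive per-pixel function with no side effects. */
static float shade(int x, int y) {
    return (float)((x * 31 + y * 17) % 256) / 255.0f;
}

/* Simplest case: each iteration writes only its own pixels, so one pragma suffices. */
void render(float *image, int width, int height) {
    #pragma omp parallel for
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            image[y * width + x] = shade(x, y);
        }
    }
}

/* Hypothetical growable list; ints stand in for triangles. */
typedef struct { int *data; size_t len, cap; } IntList;

static void list_push(IntList *l, int v) {
    if (l->len == l->cap) {
        size_t cap = l->cap ? l->cap * 2 : 16;
        int *p = realloc(l->data, cap * sizeof *p);
        assert(p != NULL);
        l->data = p;
        l->cap = cap;
    }
    l->data[l->len++] = v;
}

/* Each thread appends to a private list, then merges it into the shared
   list inside a critical section, so no two threads write to `out` at once. */
void tessellate_all(int n_surfaces, IntList *out) {
    #pragma omp parallel
    {
        IntList local = {0};

        #pragma omp for nowait
        for (int s = 0; s < n_surfaces; s++) {
            /* Pretend every surface produces two "triangles". */
            list_push(&local, 2 * s);
            list_push(&local, 2 * s + 1);
        }

        #pragma omp critical
        {
            for (size_t i = 0; i < local.len; i++) {
                list_push(out, local.data[i]);
            }
        }
        free(local.data);
    }
}
```

Built without OpenMP, the pragmas are ignored and both functions run serially. With OpenMP enabled, the order of elements in `out` depends on which thread merges first, so callers should not rely on it.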

fxj, 6 months ago

You can now (already in OpenMP 5) use it to write GPU programs. Intel's oneAPI uses OpenMP 5.x to write programs for the Intel Ponte Vecchio GPUs, which are on par with the Nvidia A100.

https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2023-0/compiling-and-running-an-openmp-application.html

GCC also provides support for Nvidia and AMD GPUs:

https://gcc.gnu.org/wiki/Offloading

Here is an example of how you can use OpenMP to run a kernel on an Nvidia A100:

https://people.montefiore.uliege.be/geuzaine/INFO0939/notes/gpu/compileandrun/

<pre><code>
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>

void saxpy(int n, float a, float *x, float *y) {
    double elapsed = -1.0 * omp_get_wtime();

    // We don't need to map the variable a as scalars are firstprivate by default
    #pragma omp target teams distribute parallel for map(to:x[0:n]) map(tofrom:y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }

    elapsed += omp_get_wtime();
    printf("saxpy done in %6.3lf seconds.\n", elapsed);
}

int main() {
    int n = 2000000;
    float *x = (float*) malloc(n*sizeof(float));
    float *y = (float*) malloc(n*sizeof(float));
    float alpha = 2.0;

    // Initialize the input arrays on the host, in parallel
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        x[i] = 1;
        y[i] = i;
    }

    saxpy(n, alpha, x, y);

    free(x);
    free(y);
    return 0;
}
</code></pre>
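One practical wrinkle with target regions is that they silently fall back to the host when the compiler was built without offload support or no device is present. A minimal sketch for checking where a target region actually ran, using the standard `omp_is_initial_device()` and `omp_get_num_devices()` routines. The function names `target_runs_on_device` and `available_devices` are made up here, and the version guard assumes a compiler that reports OpenMP 4.0 or later through `_OPENMP`.

```c
#include <assert.h>
#if defined(_OPENMP) && _OPENMP >= 201307
#include <omp.h>
#define HAVE_OMP_TARGET 1
#else
#define HAVE_OMP_TARGET 0
#endif

/* Number of offload devices the runtime reports (0 when built without OpenMP 4.0+). */
int available_devices(void) {
#if HAVE_OMP_TARGET
    int n = omp_get_num_devices();
    assert(n >= 0);
    return n;
#else
    return 0;
#endif
}

/* Returns 1 if a target region executed on an accelerator,
   0 if it ran on the host (fallback, or no OpenMP target support). */
int target_runs_on_device(void) {
    int on_host = 1;
#if HAVE_OMP_TARGET
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return !on_host;
}
```

Calling `target_runs_on_device()` once at startup and logging the result is a cheap way to notice that a "GPU" build is really running on the CPU.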

Conscat, 6 months ago

OpenMP was pivotal to my last workplace, but because some customers required MSVC, we barely had support for OpenMP 2.0.
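To give a feel for what staying within OpenMP 2.0 means in practice: as far as I know, that version requires a signed integer loop variable and has no `min`/`max` reduction operators for C, so a parallel maximum has to be written by hand. A rough sketch under those assumptions (the function name `array_max` is made up, and this is not the commenter's code):

```c
#include <assert.h>
#include <stddef.h>

/* Maximum of a non-empty array using only constructs available in OpenMP 2.0:
   parallel, for, and critical. */
float array_max(const float *v, int n) {
    float best;
    int i; /* signed int loop variable, private inside the "omp for" */

    assert(v != NULL && n > 0);
    best = v[0];

    #pragma omp parallel
    {
        float local = v[0]; /* per-thread running maximum */

        #pragma omp for nowait
        for (i = 0; i < n; i++) {
            if (v[i] > local) local = v[i];
        }

        /* Merge the per-thread results one thread at a time. */
        #pragma omp critical
        {
            if (local > best) best = local;
        }
    }
    return best;
}
```

With a newer OpenMP the whole body collapses to a single `parallel for` with `reduction(max:best)`, which is the kind of convenience an OpenMP 2.0 baseline rules out.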

pornel, 6 months ago

I used it a while ago, but got burned by very uneven support across compilers: MSVC required special tweaks, and old GCC would create crashy code without warning.

It was okay for basic embarrassingly parallel for loops. I ended up not using any of the more advanced features because, apart from even worse compiler support, non-trivial multi-threading in C without any safeguards is just too easy to mess up.
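A small illustration of how easy it is to get this wrong, and of one of the few safeguards OpenMP does offer. Summing into a shared variable from a parallel loop is a data race unless the variable is declared in a `reduction` clause, and `default(none)` makes the compiler reject any variable whose sharing was not stated explicitly. This is only a sketch, and `sum_array` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Sums an array in parallel.
   reduction(+:sum) gives every thread a private partial sum and adds them up
   at the end; a plain shared "sum += v[i]" would be a data race.
   default(none) forces every variable used in the region to be listed. */
long sum_array(const int *v, int n) {
    long sum = 0;
    assert(v != NULL || n == 0);

    #pragma omp parallel for default(none) shared(v, n) reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += v[i];
    }
    return sum;
}
```

Dropping the `reduction` clause while keeping `default(none)` makes the build fail instead of quietly producing wrong totals, since `sum` would then have no data-sharing attribute.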

dsp_person, 6 months ago

I was just googling to see if there's any Emscripten/WASM implementation of OpenMP. The Emscripten GitHub issue [1] has a link to this "simpleomp" [2][3] where:

> In ncnn project, we implement a minimal openmp runtime for webassembly target

> It only works for #pragma omp parallel for num_threads(N)

[1] https://github.com/emscripten-core/emscripten/issues/13892
[2] https://github.com/Tencent/ncnn/blob/master/src/simpleomp.h
[3] https://github.com/Tencent/ncnn/blob/master/src/simpleomp.cpp
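For reference, a loop written in exactly that one supported form looks like the sketch below. The function `scale_in_place` is hypothetical, and I have not checked whether simpleomp accepts a runtime variable for N or only a constant, so treat that as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Scales an array in place using only the directive form quoted above:
   "#pragma omp parallel for num_threads(N)". No reductions, no critical
   sections, no other clauses. */
void scale_in_place(float *data, int n, float factor, int threads) {
    assert(data != NULL && n >= 0 && threads > 0);

    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < n; i++) {
        data[i] *= factor; /* each iteration touches only its own element */
    }
}
```

Anything that needs shared accumulation would have to be restructured so each iteration writes to its own slot, with the combining step done serially afterwards.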

OpenMP 6.0
