
Supercharging TensorFlow.js with SIMD and multi-threading

63 points, by Marat_Dukhan, over 4 years ago

7 comments

wffurr, over 4 years ago
Unfortunately, this feature is (still) stuck behind an origin trial and requires serving three different WebAssembly binaries to get correct fallback behavior across different browsers.

Feature detection for WebAssembly [0] is stuck in spec discussions, and SIMD general availability is blocked on either that or its own mechanism for backwards compatibility [1].

The issue is that a WebAssembly binary that contains instructions unknown to the engine (e.g. SIMD instructions not supported by a particular engine) won't validate, even if the functions aren't used at runtime. The only way to work around this is to compile your binary N×M×... times and detect which feature set is supported before loading a binary. It's a real pain in the tail when trying to support new WebAssembly features.

E.g. check out this snippet from canvas.apps.chrome, which supports WebAssembly threads on Chrome with a non-thread fallback for e.g. mobile / Firefox:

```javascript
var X;
try {
    X = (new WebAssembly.Memory({ initial: 1, maximum: 1, shared: !0 })).buffer
        instanceof SharedArrayBuffer ? !0 : !1
} catch (a) {
    X = !1
}
var ua = r(X ? ["js/threads/ink.js", "defines_threads.js"]
             : ["js/nothreads/ink.js", "defines.js"])
  , va = ua.next().value
  , wa = ua.next().value;
```

[0]: https://github.com/WebAssembly/conditional-sections

[1]: https://github.com/WebAssembly/simd/issues/356
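The "detect which feature set is supported before loading a binary" step is typically done with small probe modules: since `WebAssembly.validate` rejects bytecode containing instructions the engine doesn't know, validating a tiny module that uses one SIMD instruction tells you whether a SIMD build will load. A minimal sketch of both probes — the SIMD probe's byte sequence follows the widely used wasm-feature-detect library and is an assumption here, not something taken from the comment above:

```javascript
// Threads probe: can we create shared WebAssembly memory backed by a
// SharedArrayBuffer? (Same check as the canvas.apps.chrome snippet.)
function hasThreads() {
  try {
    const mem = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });
    return mem.buffer instanceof SharedArrayBuffer;
  } catch (e) {
    return false;
  }
}

// SIMD probe: validate a minimal module containing SIMD instructions.
// Engines that don't support SIMD fail validation -- the very problem
// described above, exploited here for detection. Bytes are assumed to
// match the wasm-feature-detect library's SIMD probe.
function hasSimd() {
  return WebAssembly.validate(new Uint8Array([
    0, 97, 115, 109, 1, 0, 0, 0,   // "\0asm" magic + version 1
    1, 5, 1, 96, 0, 1, 123,        // type section: () -> v128
    3, 2, 1, 0,                    // function section: one func of that type
    10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11, // body with SIMD opcodes
  ]));
}

// Pick one of the NxM precompiled binaries based on the probes.
const variant = [
  hasThreads() ? 'threads' : 'nothreads',
  hasSimd() ? 'simd' : 'nosimd',
].join('-');
console.log(variant);
```

The loader would then fetch e.g. `ink-${variant}.wasm` (hypothetical naming), keeping the combinatorial set of binaries on the server rather than in the client.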
etaioinshrdlu, over 4 years ago
If I read this right, this is much faster than the WebGL backend on the devices tested.

If the CPU is really faster than the GPU, that demonstrates how inefficient the WebGL backend really is, compared to something like CUDA.
drej, over 4 years ago
As for traditional TensorFlow, the easiest way we found to improve performance (easily 2x) was to find/create builds tailored to our machines. Using Python, we had prebuilt wheels, which have (understandably) low feature requirements. If you find/build your own (e.g. if you have AVX-512), you can easily get pretty decent performance gains.

(Yes, there are unofficial wheels for various CPUs, but I'm not sure if that passes your security requirements.)
tpetry, over 4 years ago
Looks a lot like https://github.com/microsoft/onnxjs, but onnx.js adds multithreading via web workers, which will take a long time to be available on wasm.
dzhiurgis, over 4 years ago
28ms on 2018 iPhone without threads or SIMD, 24ms on Chrome MBP 2019 with threads and no SIMD, 11ms with SIMD.
ajtulloch, over 4 years ago
Awesome work Marat.
The_rationalist, over 4 years ago
Couldn't tensorflow leverage WebGL / WebGPU? Also, it's really sad that there's no WebCL adoption yet.