科技回声

11 comments

exDM69, over 8 years ago

There is no explanation of how it works. Does it work on top of existing APIs in user space? Or is there a custom kernel driver bypassing user space?

I've done some high-throughput streaming from HD/SSD to GPU before, and it's pretty easy to beat the naive solution, but getting the most out of it would require kernel-space code.

I was doing random-access streaming of textures, using memory-mapped files for input and copying on the CPU with memcpy, on background threads, into persistent/coherent mapped pixel buffers. This was intended to take advantage of the buffer cache (works great when a page is reused) and was designed for random access. If I had been working on a sequential/full-file upload, my solution would have been entirely different.

Edit: here's the source: https://github.com/kaigai/ssd2gpu

It has a custom kernel module.
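For readers who want to picture the user-space approach described in this comment, here is a minimal sketch, not exDM69's actual code. It memory-maps a file, then uses background threads to copy randomly chosen pages into a staging buffer. Everything named here is an assumption for illustration: `stream_pages`, `demo`, the page-sized copy granularity, and above all the plain `bytearray`, which merely stands in for a persistently mapped GPU pixel buffer (no graphics API is involved).

```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

PAGE = mmap.PAGESIZE  # copy granularity (assumption; a texture streamer would use tile sizes)


def stream_pages(path, page_indices, staging, workers=4):
    """Copy the requested pages of `path` into `staging`, one slot per request.

    `staging` is a hypothetical stand-in for a persistently mapped GPU buffer.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as src:

        def copy_one(slot, page):
            start = page * PAGE
            # Reading the mapping faults the page in; a warm OS page cache
            # serves it without touching the disk.
            chunk = src[start:start + PAGE]
            staging[slot * PAGE:slot * PAGE + len(chunk)] = chunk

        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() waits for all copies and re-raises any worker exception.
            list(pool.map(copy_one, range(len(page_indices)), page_indices))


def demo():
    """Write 8 pages, each filled with its own index, then fetch 3 out of order."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "textures.bin")
        with open(path, "wb") as f:
            for i in range(8):
                f.write(bytes([i]) * PAGE)
        wanted = [5, 0, 3]
        staging = bytearray(len(wanted) * PAGE)
        stream_pages(path, wanted, staging)
        return [staging[k * PAGE] for k in range(len(wanted))]


if __name__ == "__main__":
    print(demo())
```

Each worker writes to its own slot, so the copies never overlap. Repeated requests for the same page are cheap because the operating system keeps recently touched pages cached, which is the reuse benefit the comment mentions.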

zokier, over 8 years ago

This is very interesting in light of AMD's recent announcement of their "Solid State Graphics", i.e. a GPU with an SSD duct-taped on: http://www.anandtech.com/show/10518/amd-announces-radeon-pro-ssg-fiji-with-m2-ssds-onboard

foobar2020, over 8 years ago

This would be incredibly useful for distributed machine learning - imagine a TensorFlow implementation that almost entirely bypasses the CPU.

witty_username, over 8 years ago

So, if I understand correctly, data is being loaded directly from the SSD to the GPU and then filtered by the GPU before the CPU handles the more difficult queries. Neat.
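The division of labor described in this comment can be pictured as a two-stage pipeline. The sketch below is only a CPU-side illustration of that shape, under the comment's own reading of the design, and is not how the project is implemented. The names `prefilter` and `run_query` are hypothetical, and an ordinary Python generator stands in for the GPU stage.

```python
def prefilter(blocks, predicate):
    """Stage 1 (stands in for the GPU): keep only rows passing a cheap predicate."""
    for block in blocks:
        survivors = [row for row in block if predicate(row)]
        if survivors:  # fully filtered blocks never reach the next stage
            yield survivors


def run_query(blocks, predicate, refine):
    """Stage 2 (stands in for the CPU): do the costlier work on surviving rows only."""
    return [refine(row)
            for batch in prefilter(blocks, predicate)
            for row in batch]


if __name__ == "__main__":
    # Three storage "blocks" of rows; keep even values, then square them.
    blocks = [[1, 2, 3], [4, 5, 6], [7, 8]]
    print(run_query(blocks, lambda x: x % 2 == 0, lambda x: x * x))
```

The point of the split is that the second stage only ever sees rows that survived the first, so the more selective the early filter is, the less data the later stage has to process.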

justinclift, over 8 years ago

This is very awesome. If further developed + made into a feasible option for PostgreSQL, this has potential to do interesting things to TPC benchmarks. :)

nl, over 8 years ago

See also https://developer.nvidia.com/gpudirect and, to some extent, https://en.wikipedia.org/wiki/NVLink

NVLink is in the POWER9 servers Google is using.

carbocation, over 8 years ago

I'm really hoping that Optane delivers on the hype, in which case our durable storage could be just 10x slower than RAM. At least, I imagine that it would be really helpful for speeding up even this approach.

Razengan, over 8 years ago

I hope this brings us closer to widespread external GPUs, where you could use a slower-than-PCIe bus like Thunderbolt 3 or USB 3.1 to upload all assets to the eGPU's SSD during a one-time loading screen.

foobarbecue, over 8 years ago

Direct Direct Memory Access? That's pretty direct.

musha68k, over 8 years ago

Amazing results! We need more of that kind of thinking - GPU/SSD accelerate all the things!

MrBuddyCasino, over 8 years ago

Who is providing the DMA engine in this case? Has the GPU access to PCIe device memory?

GpuScan and SSD-To-GPU Direct DMA