
RTX3090 TensorFlow, NAMD and HPCG Performance on Linux

70 points by polymorph1sm over 4 years ago

9 comments

SloopJon over 4 years ago
Enroot sounds interesting: https://github.com/NVIDIA/enroot

It basically returns containers to their chroot origins, promising "no performance overhead." I'm looking forward to more posts on that.
kernelsanderz over 4 years ago
I was originally appalled at the software limiting. But according to Tim Dettmers, who has a solid record of predicting and comparing NVIDIA cards for deep learning performance, it's not really a big deal.

You can read his analysis here: https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/

and his tweet about this here: https://twitter.com/Tim_Dettmers/status/1311354118514982912

Essentially, from my understanding, memory bandwidth is the real critical path for performance in most cases. The previous generation of Turing cards had more compute than was necessary, so it was an underutilized resource.

Also, this Puget benchmark is using an older version of the CUDA drivers. I believe performance is much better in CUDA 11.1.

This new benchmark, which is running on the latest CUDA, seems to confirm Tim's numbers: https://www.evolution.ai/post/benchmarking-deep-learning-workloads-with-tensorflow-on-the-nvidia-geforce-rtx-3090
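The bandwidth-vs-compute point can be made concrete with a back-of-the-envelope roofline model. This is a minimal sketch, not a benchmark: the peak figures are rough RTX 3090 spec-sheet assumptions (~35.6 FP32 TFLOPS, ~936 GB/s), and the matrix sizes are purely illustrative.

```python
# Roofline sketch: classify a kernel as bandwidth- or compute-bound by its
# arithmetic intensity (FLOPs per byte moved). Peak numbers are assumptions.
PEAK_FLOPS = 35.6e12  # FP32 FLOP/s (assumed RTX 3090 spec)
PEAK_BW = 936e9       # bytes/s of memory bandwidth (assumed spec)
MACHINE_BALANCE = PEAK_FLOPS / PEAK_BW  # FLOPs per byte needed to saturate compute

def attainable_tflops(arithmetic_intensity):
    """Roofline model: the lower of the compute roof and the bandwidth roof."""
    return min(PEAK_FLOPS, arithmetic_intensity * PEAK_BW) / 1e12

def gemm_intensity(n, bytes_per_elem=4):
    """Ideal-caching intensity of an FP32 GEMM C = A @ B with n x n matrices."""
    flops = 2 * n**3                         # n^3 fused multiply-adds
    bytes_moved = 3 * n**2 * bytes_per_elem  # read A and B, write C, once each
    return flops / bytes_moved

for n in (64, 4096):
    ai = gemm_intensity(n)
    bound = "compute" if ai >= MACHINE_BALANCE else "bandwidth"
    print(f"n={n}: {ai:.1f} FLOP/B -> {bound}-bound, "
          f"attainable ~{attainable_tflops(ai):.1f} TFLOPS")
```

Small matrices land under the bandwidth roof while large GEMMs hit the compute roof, which is consistent with the claim that extra Turing compute often sat idle.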
bitL over 4 years ago
When I first read the review, I couldn't understand why the author kept saying that FP16 would surely be improved by new drivers, while not realizing that FP16 TFLOPS are exactly the same as FP32 here: the tensor cores were nerfed, with FP32 accumulate set to 0.5x speed instead of the Titan RTX's 1x. I'd say the results are as good as it gets; if you want better performance, wait for an Ampere Titan or Quadro.
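For concreteness, the accumulate nerf described above can be put into rough numbers. This is toy arithmetic, not a measurement; the tensor-core peak figure and the 0.5x/1x rates are assumptions taken from public spec sheets.

```python
# Toy arithmetic for the GeForce FP32-accumulate limit (all figures assumed).
FP16_IN_FP16_ACC = 142.0     # RTX 3090 tensor-core peak with FP16 accumulate (dense TFLOPS, assumed)
GEFORCE_FP32_ACC_RATE = 0.5  # GeForce cards: FP32 accumulate at half rate
TITAN_FP32_ACC_RATE = 1.0    # Titan RTX class: full-rate FP32 accumulate

rtx3090_fp32_acc = FP16_IN_FP16_ACC * GEFORCE_FP32_ACC_RATE
titan_style_fp32_acc = FP16_IN_FP16_ACC * TITAN_FP32_ACC_RATE

print(f"FP16-in / FP32-accumulate peak on the 3090: ~{rtx3090_fp32_acc:.0f} TFLOPS")
print(f"With Titan-style full-rate accumulate it would be ~{titan_style_fp32_acc:.0f} TFLOPS")
```

Under these assumptions, mixed-precision training that accumulates in FP32 sees half the tensor-core peak it would on a Titan-class part, which is the gap driver updates cannot close.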
ilzmastr over 4 years ago
I'm looking forward to multi-GPU tests!

Would be good to see if it is worth upgrading x4 and x8 setups.

A single-GPU upgrade being worth it is a no-brainer: the launch price of the 30xx cards is lower than the purchasable price of the two comparison cards!

If only you could buy them, though. The only Microcenter in PA got 15 of each 30xx card on their respective launch days.

If anyone knows how many of these cards are being produced, please do share.
komuher over 4 years ago
It's without XLA, and without cuDNN and CUDA 11.1 support for GA102. Let's wait for proper drivers to see the full results :P
optimalsolver over 4 years ago
Is there any reason NVIDIA aren't selling A100s as individual cards?
rkwasny over 4 years ago
They just have the wrong ptxas - it needs CUDA 11.1 and a properly set $PATH.
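A minimal shell sketch of that fix, assuming the toolkit lives under /usr/local/cuda-11.1 (the path is an assumption; adjust it to your install):

```shell
# Prepend the CUDA 11.1 toolkit bin dir so its ptxas shadows any older copy.
CUDA_HOME=/usr/local/cuda-11.1
export PATH="$CUDA_HOME/bin:$PATH"

# Check which ptxas is now first on PATH (prints a fallback if none is installed).
command -v ptxas || echo "ptxas not on PATH"
```

Frameworks that JIT-compile kernels pick up whichever ptxas is first on $PATH, so an older toolkit earlier in the path silently wins otherwise.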
ngcc_hk over 4 years ago
I guess this is the maximum many of us can afford, so the features missing relative to the A100 are a bit moot. But we're still waiting for the update. Based on what we've seen, the 3090 really isn't worth the money. Then again, 24GB is 24GB.
unstatusthequo over 4 years ago
Neat, would love to try it but can't buy the card anywhere. Another shitty Nvidia launch.