Accelerated PyTorch Training on M1 Mac

443 points by tgymnich, about 3 years ago

23 comments

lekevicius, about 3 years ago

Curiously, neither PyTorch nor TensorFlow currently uses the M1's Neural Engine. Is it too limited? Too hard to interact with? Not worth the effort?

alexfromapex, about 3 years ago

Since it's tangentially relevant: if you have an M1 Mac, I've created some boilerplate for working with the latest TensorFlow with GPU acceleration as well (https://github.com/alexfromapex/tensorexperiments). I'm thinking of adding a branch for PyTorch now.

mkaic, about 3 years ago

This is really cool for a number of reasons:

1.) Apple Silicon currently can't compete with Nvidia GPUs in terms of raw compute power, but they're already way ahead on energy efficiency. Training a small deep learning model on battery power on a laptop could actually be a thing now.

Edit: I've been informed that for matrix math, Apple Silicon isn't actually ahead in efficiency.

2.) Apple Silicon will probably compete directly with Nvidia GPUs on raw compute power in future generations of products like the Mac Studio and Mac Pro, which is very exciting. Competition in this space is incredibly good for consumers.

3.) At $4800, an M1 Ultra Mac Studio appears to be far and away the cheapest machine you can buy with 128GB of GPU memory. With proper PyTorch support, we'll actually be able to use this memory for training big models or using big batch sizes. For the kind of DL work I do, where data loading is much more of a bottleneck than raw compute power, the Mac Studio is now looking very enticing.

ekelsen, about 3 years ago

Nice results! But why are people still reporting benchmark results on VGG? Does anybody actually use this network anymore?

Better would be MobileNets or EfficientNets or NFNets or vision transformers or almost anything that's come out in the 8 years since VGG was published (great work it was at the time!).

singularity2001, about 3 years ago

The installation command generated on https://pytorch.org/get-started/locally/ didn't install the latest version for me. What did it was:

    pip3 install --pre torch==1.12.0.dev20220518 --extra-index-url https://download.pytorch.org/whl/nightly/cpu
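
After an install like the one above, it can help to confirm which build actually ended up in the environment. The sketch below is only illustrative: `parse_version` and `may_support_mps` are hypothetical helper names, and the `(1, 12)` threshold is an assumption based on the 1.12 nightly mentioned in this comment.

```python
import importlib.util
import re


def parse_version(version: str) -> tuple:
    """Return (major, minor) from a string such as '1.12.0.dev20220518'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def may_support_mps(version: str) -> bool:
    """Assumption: the MPS backend needs torch 1.12 or newer."""
    return parse_version(version) >= (1, 12)


if __name__ == "__main__":
    # Only touch torch if it is actually installed in this environment.
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed in this environment")
    else:
        import torch

        print("torch version:", torch.__version__)
        print("new enough for MPS:", may_support_mps(torch.__version__))
```

If the printed version is older than expected, the resolver most likely picked a stable wheel instead of the nightly.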

nafizh, about 3 years ago

Exciting!! But I don't see a comparison with any laptop Nvidia GPUs in terms of performance. That would be insightful.

buildbot, about 3 years ago

This is very interesting since the M1 studio supports 128GB of unified memory - training a large memory heavy model slowly on a single device could be interesting, or inferencing a very large model.
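
For a rough sense of what 128GB could hold during training, here is a back-of-the-envelope sketch. It assumes fp32 values and an Adam-style optimizer that keeps two extra tensors per parameter, and it ignores activations, framework overhead, and the fact that the OS and other processes share the same unified memory pool. The function names are hypothetical.

```python
def training_bytes_per_param(bytes_per_value: int = 4, optimizer_states: int = 2) -> int:
    """Bytes per parameter: weights + gradients + optimizer state tensors.

    Defaults assume fp32 (4 bytes) and two optimizer states (Adam-style moments).
    """
    return bytes_per_value * (2 + optimizer_states)


def max_params(memory_gb: float, bytes_per_param: int) -> float:
    """Upper bound on parameter count, taking 1 GB as 1e9 bytes."""
    return memory_gb * 1e9 / bytes_per_param


if __name__ == "__main__":
    per_param = training_bytes_per_param()
    print(per_param)  # 16 bytes per parameter
    print(f"{max_params(128, per_param) / 1e9:.0f}B parameters")  # 8B parameters
```

Under these assumptions the ceiling is about 8 billion parameters. Real limits will be lower once activations and batch size are accounted for.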

ivstitia, about 3 years ago

There was a report from a few months ago comparing the M1 Pro with several Nvidia GPUs: https://wandb.ai/tcapelle/apple_m1_pro/reports/Deep-Learning-on-the-M1-Pro-with-Apple-Silicon---VmlldzoxMjQ0NjY3

I'm curious how the benchmarks change with this new release!

almostdigital, about 3 years ago

Anyone actually got this to run on an M1 Mac?

    $ conda install pytorch torchvision torchaudio -c pytorch-nightly
    Collecting package metadata (current_repodata.json): done
    Solving environment: failed with initial frozen solve. Retrying with flexible solve.
    Collecting package metadata (repodata.json): done
    Solving environment: failed with initial frozen solve. Retrying with flexible solve.

    PackagesNotFoundError: The following packages are not available from current channels:

      - torchaudio

And the pip install variant installs an old version of torchaudio that is broken:

    OSError: dlopen(/opt/homebrew/Caskroom/miniforge/base/envs/test123/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio.so, 0x0006): Symbol not found: __ZN2at14RecordFunctionC1ENS_11RecordScopeEb

Scene_Cast2, about 3 years ago

I'm curious about the performance compared to something like, say, the RTX 3070.

MasterScrat, about 3 years ago

Small code example in the PyTorch doc: https://pytorch.org/docs/master/notes/mps.html
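
For readers who don't want to click through, here is a minimal sketch of the same idea. It is not copied from the linked page. `choose_device` and `detect_device` are hypothetical helpers, while `torch.backends.mps.is_available()` and `torch.device("mps")` are the calls the MPS backend exposes. The code falls back to CPU when torch or MPS is missing.

```python
import importlib.util


def choose_device(mps_available: bool, cuda_available: bool) -> str:
    """Prefer Apple's MPS backend, then CUDA, then plain CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Ask the installed torch (if any) which device name to use."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    mps_backend = getattr(torch.backends, "mps", None)  # absent in older builds
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return choose_device(mps_ok, torch.cuda.is_available())


if __name__ == "__main__":
    name = detect_device()
    print("using device:", name)
    if importlib.util.find_spec("torch") is not None:
        import torch

        device = torch.device(name)
        x = torch.ones(5, device=device)          # tensor created on the device
        model = torch.nn.Linear(5, 2).to(device)  # move model weights there too
        print(model(x))
```

The only change from ordinary CPU or CUDA code is the device string, so existing training loops should need little modification.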

singularity2001, about 3 years ago

Anyone else getting "illegal hardware instruction"?

    (pytorch_env) ~/dev/ai/ python -c "import torch"
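
The thread doesn't confirm a cause. One thing worth ruling out (an assumption, not a diagnosis) is an architecture mismatch, for example an x86_64 Python running under Rosetta 2. The sketch below uses the standard-library `platform` module, and `diagnose` is a hypothetical helper name.

```python
import platform


def diagnose(system: str, machine: str) -> str:
    """Describe the interpreter's architecture as reported by `platform`."""
    if system != "Darwin":
        return "not macOS; this check does not apply"
    if machine == "arm64":
        return "native arm64 Python"
    if machine == "x86_64":
        # On an M1 this means the interpreter runs under Rosetta 2.
        return "x86_64 Python (Rosetta 2 on Apple Silicon); consider an arm64 build"
    return f"unexpected architecture: {machine}"


if __name__ == "__main__":
    print(diagnose(platform.system(), platform.machine()))
```

If this reports x86_64 on an M1, recreating the environment with an arm64 Python distribution is a reasonable first thing to try.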

in3d, about 3 years ago

It’s surprising to see PyTorch developers working on things like that when common operations like group convolutions are still completely unoptimized on Nvidia GPUs, despite many requests.

arecurrence, about 3 years ago

This is much nicer ergonomics than what I had to do for TensorFlow. It’s ostensibly out-of-the-box support as a different torch device.

dilielloneluca, about 3 years ago

I started collecting benchmarks of the M1 Max on PyTorch here: https://github.com/lucadiliello/pytorch-apple-silicon-benchmarks

munro, about 3 years ago

Yess! This is important for me, because I don't have any $$$ to rent GPUs for personal projects. Now we just need M1 support for JAX.

Since there are no hard benchmarks against other GPUs, here's a Geekbench comparison against an RTX 3080 Mobile laptop I have [1]. Looks like it's about 2x slower. The RTX laptop absolutely rips for gaming; I love it.

[1] https://browser.geekbench.com/v5/compute/compare/4140651?baseline=4529092

kristianp, about 3 years ago

A tangential thought: will we see animation studios buy Mac Studios for their rendering farms? What do they use these days, AWS EC2?

Kalanos, about 3 years ago

Anyone care to comment on how this is better than Metal's TensorFlow support?

macshome, about 3 years ago

Does this work on any Metal hardware or just the M1 GPU?

cj8989, about 3 years ago

really hope to see some comparisons with nvidia gpus!

toppy, about 3 years ago

Does "speedup" refer to an absolute value or a percentage?
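
The thread doesn't say which convention the chart uses, but the difference is easy to show. This sketch uses made-up timings (60 s baseline, 10 s accelerated), and the function names are hypothetical.

```python
def speedup_factor(baseline_s: float, accelerated_s: float) -> float:
    """Ratio convention: how many times faster the accelerated run is."""
    return baseline_s / accelerated_s


def percent_faster(baseline_s: float, accelerated_s: float) -> float:
    """Percentage convention: throughput gain relative to the baseline."""
    return (baseline_s / accelerated_s - 1.0) * 100.0


def percent_time_saved(baseline_s: float, accelerated_s: float) -> float:
    """Percentage of wall-clock time removed by the accelerated run."""
    return (1.0 - accelerated_s / baseline_s) * 100.0


if __name__ == "__main__":
    base, fast = 60.0, 10.0  # hypothetical seconds per epoch
    print(speedup_factor(base, fast))                # 6.0
    print(percent_faster(base, fast))                # 500.0
    print(round(percent_time_saved(base, fast), 1))  # 83.3
```

The same pair of timings reads as "6x", "500% faster", or "83% less time", so it matters which one a chart's axis means.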

sbeckeriv, about 3 years ago

What is the * in the chart referencing?

amelius, about 3 years ago

> Accelerated GPU training is enabled using Apple’s Metal Performance Shaders (MPS) as a backend for PyTorch.

What do shaders have to do with it? Deep learning is a mature field now; it shouldn't need to borrow compute architecture from the gaming/entertainment field. Anyone else find this disconcerting?
