
PyTorch Native Architecture Optimization: Torchao

169 points | by jonbaer | 8 months ago

8 comments

Atheb · 8 months ago
You've got to give it to the PyTorch team; they're really great at bringing complex optimization schemes (mixed precision, torch.compile, etc.) down to a simple-to-use API. I'm glad I moved from TF/Keras to PyTorch around 2018-2019 and never looked back. I'm eager to try this as well.
formalsystem · 8 months ago
Hi! I'm Mark from the PyTorch team at Meta and work on torchao. If you have any questions about the library or really anything at all about performance, don't hesitate to ask!
tomrod · 8 months ago
This is a cool project! Understanding low-bit formats is still on my to-do list; perhaps I'll spin this up and give it a go.
majke · 8 months ago
Pardon my ignorance, but how do matrix operations on quantized data work? Is hardware support needed?

AFAIU int4 matrix multiplication is supported by CUDA, but I'm not sure about other operations. The blog post mentions fp6, and I don't think that is supported by CUDA. Or maybe the data are upcast to something common like fp16 before doing the math?
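A minimal sketch of the fallback path the question guesses at (this is illustrative, not torchao's actual implementation): weights are stored as int8 with a scale, then dequantized back to the activation dtype so an ordinary floating-point matmul can run on any hardware. Native low-bit kernels avoid the upcast; the helper names here are hypothetical.

```python
import torch

torch.manual_seed(0)

def quantize_per_tensor(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q. (Hypothetical helper.)"""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def quantized_linear(x: torch.Tensor, q: torch.Tensor, scale: torch.Tensor):
    # Without a native low-bit kernel, dequantize (upcast) the weights to
    # the activation dtype and fall back to a standard fp matmul.
    return x @ (q.to(x.dtype) * scale).t()

w = torch.randn(4, 8)          # full-precision weights
x = torch.randn(2, 8)          # activations
q, scale = quantize_per_tensor(w)
approx = quantized_linear(x, q, scale)
exact = x @ w.t()
err = (approx - exact).abs().max().item()  # small quantization error
```

The memory win comes from storing `q` (1 byte/element) instead of `w` (4 bytes/element); a speed win additionally requires hardware or kernels that consume the low-bit values directly.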
Evidlo · 8 months ago
> We're happy to officially launch torchao, a PyTorch native library that makes models faster and smaller by leveraging low bit dtypes

Will this let me use uint8 arrays as indexing arrays? A problem I have is that PyTorch forces me to use int64 for fancy indexing.
OutOfHere · 8 months ago
Any thoughts on integrating XLA into PyTorch proper instead of keeping it in the separate torch-xla package?
lnyan · 8 months ago
Slightly off-topic: Is there a library in JAX that supports post-training quantization, similar to the one mentioned?
CalChris · 8 months ago
Is this what *Mojo* is supposed to be?