ResNet-50 with DawnBench settings is a very poor choice for illustrating this trend. The main technique driving this reduction in cost-to-train has been finding arcane, fast training schedules. This sounds good until you realize it's a kind of sleight of hand: finding that schedule takes tens of thousands of dollars (usually more) that isn't counted in the cost-to-train figure, but is a real-world cost you would incur if you wanted to train models.

However, I think the overall trend this article talks about is accurate. There has been an increased focus on cost-to-train, and you can see that with models like EfficientNet, where NAS is used to optimize accuracy and model size jointly.
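A back-of-the-envelope sketch of the hidden-cost point, with every number made up purely to show the accounting:

    # All figures hypothetical: the point is that the one-time schedule search
    # disappears from the headline per-run number but not from your bill.
    reported_cost_per_run = 50.0      # USD, the headline benchmark-style figure
    schedule_search_runs = 500        # trial runs burned finding the fast schedule
    cost_per_search_run = 100.0       # USD per trial at slower, unoptimized settings

    search_cost = schedule_search_runs * cost_per_search_run   # 50,000 USD, paid once
    models_actually_trained = 10      # how many models you amortize the search over

    effective_cost = reported_cost_per_run + search_cost / models_actually_trained
    print(f"headline:  ${reported_cost_per_run:,.0f} per run")
    print(f"effective: ${effective_cost:,.0f} per run once the search is amortized")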
This is an odd framing.

Training has become much more accessible, due to a variety of things (ASICs, offerings from public clouds, innovations on the data science side). Comparing it to Moore's Law doesn't make any sense to me, though.

Moore's Law is an observation on the pace of increase of a tightly scoped thing: the number of transistors.

The cost of training a model is not a single "thing," it's the cumulative effect of many things, including things as fluid as cloud pricing.

Completely possible that I'm missing something obvious, though.
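One rough way to see the "many things" point is that the dollar figure factors into several independently moving pieces. A sketch, with every input hypothetical:

    # Training cost as a product of factors that move for unrelated reasons.
    total_flops = 1e18            # compute the model needs (algorithmic side, hypothetical)
    peak_flops_per_sec = 100e12   # what the accelerator advertises (hardware side, hypothetical)
    utilization = 0.4             # fraction of peak actually sustained (software side, hypothetical)
    price_per_hour = 3.0          # USD per accelerator-hour (cloud pricing, hypothetical)

    hours = total_flops / (peak_flops_per_sec * utilization) / 3600
    cost = hours * price_per_hour
    print(f"{hours:.1f} accelerator-hours, roughly ${cost:.2f}")

Moore's Law bears directly on only one of those four inputs; the others move on their own schedules.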
What are some domains where a solo developer could build something commercially compelling to capture some of this $37 trillion? Are there any workflows, tools, or efficiencies that could easily be realized as a commercial offering without requiring massive man-hours to implement?
Ark Invest are the creators of the ARKK [1] and ARKW ETFs that have become retail darlings, mainly because they're heavily invested in TSLA.

They pride themselves on this type of fundamental, bottom-up analysis of the market.

It's fine. I just don't know if I agree with applying Moore's law, which is fundamentally about hardware, to the cost of running a "system" that is a combination of customized hardware and new software techniques.

[1] https://pages.etflogic.io/?ticker=ARKK
I remember this article from 2018: https://medium.com/the-mission/why-building-your-own-deep-learning-computer-is-10x-cheaper-than-aws-b1c91b55ce8c

Hacker News discussion of the article: https://news.ycombinator.com/item?id=18063893

It really is interesting how this is changing the dynamics of neural network training. Now it is affordable to train a useful network on the cloud, whereas two years ago that was reserved for companies with either bigger investments or an already consolidated product.
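The rent-versus-buy arithmetic behind that article's headline boils down to a break-even calculation; here's a sketch with made-up prices, just to show the structure of the comparison:

    # All prices hypothetical; the shape of the comparison is the point.
    cloud_gpu_hour = 3.0     # USD per GPU-hour rented from a public cloud
    box_cost = 3000.0        # USD to build a comparable single-GPU workstation
    power_per_hour = 0.05    # USD of electricity per hour under load

    breakeven_hours = box_cost / (cloud_gpu_hour - power_per_hour)
    print(f"the workstation pays for itself after ~{breakeven_hours:.0f} GPU-hours")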
I would really like a thorough analysis of how expensive it is to multiply large matrices, which, according to the profiler, is the most expensive part of training a transformer, for example. Is there some Moore's law or similar trend for that?
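For matrix multiplication the standard operation count is 2*m*n*k floating-point operations for an (m x k) by (k x n) product, and you can measure what a given machine actually sustains. A minimal sketch with NumPy (real transformer training would run this on an accelerator, usually in lower precision):

    import time
    import numpy as np

    # Square matrices, roughly the size of a large transformer layer.
    m = n = k = 4096
    a = np.random.rand(m, k).astype(np.float32)
    b = np.random.rand(k, n).astype(np.float32)

    a @ b                                  # warm-up run
    start = time.perf_counter()
    a @ b
    elapsed = time.perf_counter() - start

    flops = 2 * m * n * k                  # count each multiply-add as two operations
    print(f"{flops / elapsed / 1e9:.1f} GFLOP/s sustained")

Tracking that sustained number per dollar across hardware generations would be one way to put the question on a Moore's-law-style chart.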
It would be regrettable if an equivalent of the self-fulfilling prophecy of Moore's "Law" (originally an astute observation and forecast, but not remotely a law) became a driver or limiter in this field as well, even more so if it's a straight transplant made for soundbite reasons rather than through any impartial and thoughtful analysis.
That's despite Nvidia vaguely prohibiting users from running their desktop cards for machine learning in any sort of data-center-like or server-like capacity. Hopefully AMD's ML support and OpenCL will continue improving.
Does this mean that the cost to train something like OpenAI's GPT-3 will fall from 12 million dollars to less next year? If so, how much will it fall to?
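Mechanically, if you knew the annual rate of decline you could just extrapolate. The sketch below uses the $12M figure from the question and a halving rate that is made up purely for illustration, not taken from the article:

    # Both inputs are rough: treat this as arithmetic, not a forecast.
    cost_now = 12e6          # USD, the widely cited GPT-3 training estimate
    annual_decline = 0.5     # assumed rate of decline per year (made up)

    cost_next_year = cost_now * (1 - annual_decline)
    print(f"${cost_next_year / 1e6:.0f}M next year under that assumption")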