These ML compilers are being overhyped. It's the same trade-off as with a traditional compiler: you get far more throughput than hiring a specialist performance programmer, but the specialist will typically outperform the compiler, possibly by orders of magnitude.

These things are inferior on several levels:
- Algorithmic: these compilers aren't feeding back to their human masters tips and tricks on how to modify the network to go faster, beyond some very basic signals.
- Loss of intent: ML network designers specify architecture in Python, and by the time it's gone through many layers of lowering, you can get complete garbage. Highly efficient garbage, but still garbage. (Recent example: we caught one of these compilers implementing a slice-update operation by first forming the range of all possible indices into the array, slicing that to get the indices to update, and then doing a scatter; we replaced it with a single memcpy call. See the sketch after this list.)
- Inefficient kernels: every time we see the output of these compilers go head-to-head with an expert assembly programmer, the compiler loses, often by 30% or more. This always seems like the sort of thing that should be easy to solve, but given that no one has cracked it in the past 50 years, it's clearly not as simple as it sounds.
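To make the slice-update example concrete, here's a minimal NumPy sketch (our own reconstruction, not the compiler's actual IR; names and shapes are illustrative) of the lowering we caught, next to the direct copy that replaced it:

    import numpy as np

    def slice_update_via_scatter(dst, src, lo):
        # What the compiler emitted, conceptually:
        #   1. materialize every index into dst (an iota over the whole array),
        #   2. slice out just the indices that need updating,
        #   3. scatter src into dst at those indices.
        all_indices = np.arange(dst.shape[0])
        update_indices = all_indices[lo:lo + src.shape[0]]
        dst[update_indices] = src  # scatter through an index array
        return dst

    def slice_update_direct(dst, src, lo):
        # The hand-written replacement: one contiguous block copy (a memcpy).
        dst[lo:lo + src.shape[0]] = src
        return dst

Both produce the same result; the first just materializes an index array and does a gather/scatter where a single contiguous copy suffices.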
Can anyone bring this down to earth for me?

What's the actual state of these "ML compilers" currently, and what is the near-term promise?
Summary: they improve prediction of the run-time performance of a computation graph using a GNN. They use an embedding dictionary for each node's opcode along with some other node features (e.g. shape, bits, window size; see [1]). They released a big dataset of these graphs in [2], with varying XLA compilation configurations and their resulting performance on TPUs. In [3] they improved prediction on bigger graphs than before by partitioning the graph (METIS graph partitioning, new to me) along with other training tricks.

This is only about predicting the performance of a given graph, not about improving/suggesting/editing a new equivalent graph. As in FunSearch, models with decent predictive power could be used with evolutionary search.

[1] https://github.com/google-research-datasets/tpu_graphs#features

[2] TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs, https://arxiv.org/abs/2308.13490

[3] Learning Large Graph Property Prediction via Graph Segment Training, https://arxiv.org/abs/2305.12322
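To make the model shape in the summary concrete, here's a rough PyTorch sketch of a GNN cost model along these lines (the dimensions, dense adjacency, and mean-pool readout are our simplifications, not the TpuGraphs code):

    import torch
    import torch.nn as nn

    NUM_OPCODES = 128  # assumed opcode vocabulary size
    FEAT_DIM = 16      # assumed per-node features (shape, bits, window size, ...)
    HIDDEN = 64

    class GraphCostModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.opcode_emb = nn.Embedding(NUM_OPCODES, HIDDEN)
            self.feat_proj = nn.Linear(FEAT_DIM, HIDDEN)
            self.msg = nn.Linear(HIDDEN, HIDDEN)
            self.readout = nn.Linear(HIDDEN, 1)

        def forward(self, opcodes, feats, adj):
            # opcodes: [n] int64, feats: [n, FEAT_DIM], adj: [n, n] dense adjacency
            h = self.opcode_emb(opcodes) + self.feat_proj(feats)
            for _ in range(3):  # a few rounds of message passing
                h = torch.relu(h + adj @ self.msg(h))
            return self.readout(h.mean(dim=0))  # graph-level runtime estimate

A model like this only scores a given graph; pairing such a predictor with evolutionary search over equivalent graphs (as in FunSearch) is where the suggesting/editing would come in.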
Can anyone explain how conv works in that graph? You have a tensor of shape [2,4,16] and you convolve it with a kernel of shape [4,16,8], and that gives you a [2,8] tensor? How is that possible?
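One reading that makes the shapes work (an assumption, since the graph's dimension labels aren't given): treat [2,4,16] as [batch, width, channels] and [4,16,8] as [window, in_channels, out_channels]. A VALID 1-D convolution whose window spans the whole spatial axis yields output width 1, i.e. [2,1,8], which squeezes to [2,8]. In JAX:

    import jax.numpy as jnp
    from jax import lax

    x = jnp.zeros((2, 4, 16))  # [batch, width, channels]
    k = jnp.zeros((4, 16, 8))  # [window, in_channels, out_channels]

    # VALID padding with window size == input width collapses the spatial axis.
    y = lax.conv_general_dilated(
        x, k,
        window_strides=(1,),
        padding='VALID',
        dimension_numbers=('NWC', 'WIO', 'NWC'),
    )
    print(y.shape)  # (2, 1, 8) -- squeeze the middle axis to get (2, 8)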
The pace at which ML is advancing right now is amazing. I don't believe in the singularity, but it's changing software, and in turn society, in ways no one can predict.