科技回声

It’s not new and only superior in a very narrow set of categories.

Needs a (2023) tag. But the release of ARC2 and image outputs from 4o definitely got me thinking about the JEPA family too.

I don't know if it's right (and I'm sure JEPA has lots of performance issues), but it seems good to have a fully latent-space representation, ideally across all modalities, so that turning the concept "an apple a day keeps the doctor away" into image/audio/text is a choice of decoder, rather than dedicated token ranges being chosen before the actual creation process in the model even begins.

GPTs are in the “exploit” phase of the “explore-exploit” trade-off.

JEPA is still in the explore phase. It’s good to read the paper and understand the architecture to gain an alternative perspective.

Not new, not notable right now, not sure why it's getting upvoted (just kidding, it's because people see YLC and upvote based on names)

JEPA is presumably superior to Transformers. Can any expert enlighten us on the implications of this paper?

Self-Supervised Learning from Images with JEPA (2023)

5 comments
