TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

© 2025 TechEcho. All rights reserved.

Are AlphaFold's new results a miracle?

38 points by trott | 11 months ago

6 comments

nuz | 11 months ago
I kind of respect DeepMind for simply keeping their nose to the grindstone and doing good work like this without overhyping it too much. More of the under-promise, over-deliver engineering style than some of their competitors tend toward.
dekhn | 11 months ago
No, it's not a miracle; everything it does works because the information to make those predictions is a collection of latent variables, and DM found good ways to convert from sequence space into an embedding that approximates those latent variables.

From what I can tell it still depends heavily on having a good sequence and structure template (or templates). It tells us little to nothing about the specific details of the folding process. To me the only part that seems miraculous is that it seems like we can predict novel structures (previously unknown conformations) using small fragments of templates rather than entire protein domains.
flobosg | 11 months ago
> Different proteins can also be related to each other. Even when the sequence similarity between two proteins is low, because of evolutionary pressures, this similarity tends to be concentrated where it matters, which is the binding site.

It's a small nitpick, but I think the author actually meant "sequence identity" here, because his statement would make much more sense then. Sequence similarity is physicochemical in nature, and tends to be concentrated, in addition to functionally relevant sites (usually ligand-binding residues, as he mentions), at key structural regions of the protein such as the hydrophobic core (where a high frequency of similarly hydrophobic residues is expected).

This is one of the reasons why proteins from the same family can share highly similar structures while having very low sequence identity, with highly conserved motifs (where the sequence identity is concentrated) taking care of the functionality.
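The identity-versus-similarity distinction above can be sketched with a toy comparison of two pre-aligned sequences. The residue groups below are a simplified illustration of physicochemical similarity, not a real substitution matrix such as BLOSUM62, and the example sequences are invented:

```python
# Simplified physicochemical residue groups (illustrative, not BLOSUM)
SIMILAR_GROUPS = [
    set("ILVMA"),  # hydrophobic / aliphatic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar
    set("GP"),     # special
    set("C"),      # cysteine
]

def similar(a, b):
    """Two residues are 'similar' if they share a physicochemical group."""
    return any(a in g and b in g for g in SIMILAR_GROUPS)

def identity_and_similarity(seq1, seq2):
    """Fraction of identical vs. similar positions in two aligned sequences."""
    assert len(seq1) == len(seq2)
    ident = sum(a == b for a, b in zip(seq1, seq2))
    simil = sum(a == b or similar(a, b) for a, b in zip(seq1, seq2))
    n = len(seq1)
    return ident / n, simil / n

# Two aligned fragments: only half the positions are identical,
# but every position is physicochemically similar.
s1 = "ILKDEFWS"
s2 = "VLRDEYWT"
ident, simil = identity_and_similarity(s1, s2)
```

Here `ident` is 0.5 while `simil` is 1.0, which is exactly the situation the comment describes: structures can stay conserved even as identity drops, because substitutions tend to stay within a physicochemical class.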
epups | 11 months ago
The author is making the point that AlphaFold 3 is not so impressive: it is simply regurgitating its training set, and it's not so good for inference.

I think his central point is fair and interesting. The train/test split is apparently legit, as they used structures released before 2021 for training and the rest for testing. However, there was no real check for duplicates, and the success rate might be inflated by a bunch of "me too", low-hanging-fruit structures that are very slight variations on what we already know.

However, I'm not sure I agree with his skepticism. LLMs suffer from the exact same problems (getting one to write a Snake game in any language is trivial, but it is almost certainly regurgitating), yet they can be useful as well. I mean, if for various reasons people are publishing very similar structures out there, there's certainly value in speeding up or reducing that work considerably.
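The temporal split described above (train on structures released before the cutoff, test on the rest) can be sketched as a simple date filter. The entry records and the `release_date` field below are illustrative assumptions, not the actual AlphaFold 3 data pipeline; note that, as the comment points out, a date cutoff alone does nothing to catch near-duplicate structures across the split:

```python
from datetime import date

# Hypothetical PDB-style entries with release dates (invented for illustration)
entries = [
    {"pdb_id": "1ABC", "release_date": date(2019, 5, 1)},
    {"pdb_id": "2DEF", "release_date": date(2020, 11, 30)},
    {"pdb_id": "3GHI", "release_date": date(2021, 6, 15)},
    {"pdb_id": "4JKL", "release_date": date(2022, 1, 2)},
]

CUTOFF = date(2021, 1, 1)

# Everything released before the cutoff trains; everything after tests.
train = [e for e in entries if e["release_date"] < CUTOFF]
test = [e for e in entries if e["release_date"] >= CUTOFF]
```

A duplicate-aware split would additionally cluster entries by sequence or structure similarity and keep whole clusters on one side of the cutoff, which is the missing check the comment is referring to.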
seeknotfind | 11 months ago
Why overlapping molecules would indicate memorizing or overfitting is beyond me. Imagine a mechanic designing linkages: they may collide, but if they could pass through each other, they could work, and then the mechanic might reconfigure them. Similarly, overlapping molecules could be a step along the way to understanding, if the algorithm is focused on binding structures rather than global physical structures.
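The "overlapping molecules" being debated can be made concrete with a minimal steric-clash check: two atoms overlap when their distance falls well below the sum of their van der Waals radii. The radii, the 0.4 Å tolerance, and the atom tuples below are common illustrative values, not AlphaFold's actual validation criteria:

```python
import math

# Typical textbook van der Waals radii, in angstroms (illustrative)
VDW_RADII = {"C": 1.7, "N": 1.55, "O": 1.52, "H": 1.2}

def clashes(atom1, atom2, tolerance=0.4):
    """Atoms clash when closer than the sum of their vdW radii, minus a tolerance."""
    (e1, *p1), (e2, *p2) = atom1, atom2
    return math.dist(p1, p2) < VDW_RADII[e1] + VDW_RADII[e2] - tolerance

a = ("C", 0.0, 0.0, 0.0)
b = ("O", 1.0, 0.0, 0.0)  # 1.0 Å apart: far inside the 1.7 + 1.52 - 0.4 limit
c = ("O", 4.0, 0.0, 0.0)  # 4.0 Å apart: no overlap
```

A predicted structure where `clashes` fires between non-bonded atoms is physically impossible, which is why critics read clashes as a failure mode; the comment's counterpoint is that such a structure can still encode useful binding geometry.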
maremmano | 11 months ago
I'm totally clueless about the topic (protein folding), but this stuff is very interesting. From the article, it seems that AlphaFold 3 is just a biochem version of GPT, or what? From what I've heard, the older AlphaFolds had some special tricks for protein prediction. Am I missing something?