科技回声

3 comments

cs702, over 7 years ago

> The basic idea is as follows: there is evidence that today's proteins emerged out of an ancient peptidic soup, one that may have left its mark on the evolutionary record. I.e., the proteins we see today may in some sense be formed out of primordial peptides. As proteins grew in size and complexity, it would have been advantageous to reuse existing components, to build bigger proteins from existing protein parts. We already know this is true on the level of protein domains, in that larger proteins are often comprised from chaining together smaller globular domains. But the phenomenon of reuse may go further, where even smaller protein fragments (handful of residues to dozens) may reflect an underlying evolutionary pressure to reuse working parts, fragments that fold in tried-and-tested ways (from the perspective of evolution.) If this is the case, then the space of naturally occurring proteins may occupy a very special "manifold", one that exhibits a hierarchical organization spanning small fragments to entire domains. Other evolutionary pressures could further drive the reuse phenomenon. For example, once a protein-protein or protein-DNA interface is established, presumably through some sort of structural motif, reusing that motif would present an efficient way for the cell to rewire its cellular circuitry. The end result of all this would be the emergence of something resembling a linguistic structure, a grammar that defines the reusable parts and how these parts can be combined to form larger assemblies. Given that this is biology, it’s unlikely to be rigid or minimal. It would be messy and hacky, with many exceptions and ad hoc evolutionary optimizations. But the manifold would be there, potentially discoverable and learnable.

Instead of characters -> 'byte-pair-encoding'-like sequences -> words -> sentences, think primordial peptides -> simple protein parts -> more complicated protein components -> proteins.

If this "protein linguistic hypothesis" is correct, I see no reason why the manifold wouldn't be discoverable and learnable with modern SGD techniques.
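To make the byte-pair-encoding analogy concrete, here is a minimal sketch of BPE-style merging applied to amino-acid strings. It is only an illustration of the analogy, not anything from the linked article: the function names (`most_frequent_pair`, `merge_pair`, `learn_fragments`) and the toy sequences are made up, and real fragment discovery would presumably need structural information, not just residue frequencies.

```python
from collections import Counter


def most_frequent_pair(seqs):
    """Return the most common adjacent token pair across all sequences, or None."""
    counts = Counter()
    for toks in seqs:
        counts.update(zip(toks, toks[1:]))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def merge_pair(toks, pair):
    """Replace each non-overlapping occurrence of `pair` (left to right) with one token."""
    merged = []
    i = 0
    while i < len(toks):
        if i + 1 < len(toks) and (toks[i], toks[i + 1]) == pair:
            merged.append(toks[i] + toks[i + 1])
            i += 2
        else:
            merged.append(toks[i])
            i += 1
    return merged


def learn_fragments(sequences, num_merges):
    """Greedy BPE-style loop: start from single residues, merge the top pair each round.

    Returns the learned fragments (in merge order) and the re-tokenized sequences.
    """
    seqs = [list(s) for s in sequences]  # one token per residue letter
    vocab = []
    for _ in range(num_merges):
        pair = most_frequent_pair(seqs)
        if pair is None:  # nothing left to merge
            break
        vocab.append(pair[0] + pair[1])
        seqs = [merge_pair(toks, pair) for toks in seqs]
    return vocab, seqs
```

Calling `learn_fragments(["LKLKLK", "LKEE"], 2)` on these toy one-letter sequences first merges the frequent `LK` pair and then builds `LKLK` from two `LK` tokens, which is the "small parts -> larger parts" hierarchy in miniature.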


tritium, over 7 years ago

I'm sure there's a predictable set of interactions, with a minimum, finite set of required loops to support cellular life as we know it. Above the minimum set of operations and repeatable cycles, there are almost certainly specialty routines, and perhaps no fixed limits on diversity of optional interactions, at the cellular/chemical level.

But for sure, there is also a boundary layer, for interactions between cells. This would have to represent an almost entirely different set of chemical interaction rules for signaling, with its own constraints, minimum requirements, and optional expressions.

So, it's useful to conceptualize in terms like this, but problems solved within the context of intracellular operations will only offer clues about tissue organization, and indeed, tissue requirements may drive the optional intracellular interactions more often than not, rather than the reverse. In cases where intracellular interactions drive extracellular organization, it's essentially leaky abstractions dictating the details of higher level implementation.

gilleain, over 7 years ago

> the proteins we see today may in some sense be formed out of primordial peptides.

This seems reasonable, but another possibility is that modern proteins (domains) are 'carved' from larger proteins that had a looser structure.

In other words, primordial proteins could have been badly folded and mutations gradually improved them to smaller, better folded structures.

Protein Linguistics
