The top answer at the link explains it best:<p>Good observation! The 3' poly(A) tail is actually a very common feature of positive-strand RNA viruses, including coronaviruses and picornaviruses.<p>For coronaviruses in particular, we know that the poly(A) tail is required for replication, functioning in conjunction with the 3' untranslated region (UTR) as a cis-acting signal for negative strand synthesis and attachment to the ribosome during translation. Mutants lacking the poly(A) tail are severely compromised in replication.
If this sort of question fascinates you, you might like "Reverse Engineering the source code of the BioNTech/Pfizer SARS-CoV-2 Vaccine"[0], an article written with a tone that I've found to resonate with engineers and like-minded folk.<p>[0] <a href="https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/" rel="nofollow">https://berthub.eu/articles/posts/reverse-engineering-source...</a>
It's like a NOP slide for viruses: <a href="https://en.wikipedia.org/wiki/NOP_slide" rel="nofollow">https://en.wikipedia.org/wiki/NOP_slide</a><p>Just kidding...sort of!
I don't think this is nearly as true for viral genomes, but larger species have lots of protective sections of DNA that buffer against mutations. If you lose a non-protein-coding section of DNA to mutation, usually no harm to the species occurs. In humans, only about 1.5% of our DNA codes for protein that is actually generated. Viruses are physically extremely tiny and must be very efficient in storing their genome (RNA, in the case of coronaviruses), so far more of it actually codes, but no doubt there are similar factors at play.
It amazes me that the genome is only 29k long. If you were to write a computer virus now, it probably wouldn't be that short, let alone something that can infect and kill millions of people.
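To put "29k long" in file-size terms, here is a rough back-of-the-envelope sketch. It assumes a length of 29,903 nucleotides (the commonly cited reference sequence length, not a number from this thread) and a naive 2-bits-per-base packing; `packed_size_bytes` is a made-up helper name for illustration.

```python
import math

# Assumption: length of the commonly cited SARS-CoV-2 reference sequence.
GENOME_LENGTH_NT = 29_903

# Four possible bases (A, C, G, U) fit in 2 bits each.
BITS_PER_BASE = 2


def packed_size_bytes(n_bases: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return math.ceil(n_bases * bits_per_base / 8)


print(packed_size_bytes(GENOME_LENGTH_NT))  # 7476
```

So under those assumptions the whole genome fits in about 7.5 KB, before any compression.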
Very interesting. I first wondered how nature can randomly generate such a pattern, and then realized we are just falling for our "built in" pattern recognition: it would feel much more "natural" for the stop sequence to be the encoding of some specific protein without any clearly recognizable pattern... But it would actually be more unlikely to appear/survive mutation than "any long-enough sequence of A".<p>I also like how it is established that this has an effect on replication, but that as far as I understand we do not understand the underlying process. Humbling.
File formats are really easy to figure out and are a big advantage for moving data around. Even without an academic theory, pretty much everyone in software starts to figure out the same tricks as soon as reliable transmission becomes a goal. I assume that at least one reason for the poly(A) tail is that genomes are data, data likes to live in structured formats, and file terminators are more reliable for biology to process than encoding the length of the genome (although, biology being messy, I wouldn't be shocked if both were done). Evolution has a good grasp of engineering principles.<p>Are there probably desirable chemical properties too? Yes. Is nature overloading each part of a genome with uses? More than likely. Has it figured out how to terminate a sequence? Obviously.
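For anyone who hasn't run into the terminator-versus-length trade-off mentioned above, here is a minimal sketch of the two framing styles in software. The function names and the choice of a zero byte as the sentinel are made up for illustration; this is only the software side of the analogy, not a model of how the poly(A) tail actually works.

```python
import struct

# Hypothetical sentinel byte standing in for an end-of-message marker.
TERMINATOR = b"\x00"


def frame_with_terminator(payload: bytes) -> bytes:
    """Append a sentinel; the payload itself must not contain it."""
    if TERMINATOR in payload:
        raise ValueError("payload contains the terminator byte")
    return payload + TERMINATOR


def read_terminated(stream: bytes) -> bytes:
    """Return everything up to (not including) the first sentinel."""
    end = stream.find(TERMINATOR)
    if end == -1:
        raise ValueError("no terminator found; message is truncated")
    return stream[:end]


def frame_with_length(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def read_length_prefixed(stream: bytes) -> bytes:
    """Read the 4-byte length header, then exactly that many payload bytes."""
    if len(stream) < 4:
        raise ValueError("length header is truncated")
    (length,) = struct.unpack(">I", stream[:4])
    body = stream[4:4 + length]
    if len(body) != length:
        raise ValueError("message is truncated")
    return body


message = b"AUGGCU"
assert read_terminated(frame_with_terminator(message)) == message
assert read_length_prefixed(frame_with_length(message)) == message
```

The terminator version can be read in a single pass without knowing the size up front, but the payload can never contain the sentinel. The length-prefixed version allows any payload, but a corrupted header throws off everything after it.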
I am entirely unqualified to answer, but I choose to believe it’s the equivalent of scratch (reserved stack) memory in an executable image. If I’m wrong, well at least I’m enjoying it.