So these guys come along, casually expand a well-established code and test it under precisely one small condition. And when this one test gives them some nice data, they say: "Hey, our stuff is better than everything all of nature has ever done!"<p>A very intellectually stimulating endeavour no doubt, but I expect some more tests before I would call this good science. Claiming that "the new additions appear to improve the alphabet" is simply extrapolation to the nth degree. [1]<p>Oh and by the way, when the article claims that<p>> "the three-biopolymer system may have drawbacks, since information flows only one way, from DNA to RNA to proteins"<p>that is not correct either. For more information, read up on epigenetics.<p>[1] Note that this quote comes from the article, not the original paper. The original paper is not quite as cocky (at least not in the abstract, but I don't have full access).
There are actually 84 possible combinations with 4 bases if you accept sequences of length <= 3 (4 + 16 + 64).<p>However, if you assume all sequences are length 3, you still get 64 combinations.<p>We only use 20 out of that space. And if you look at how codons map to amino acids, for half of them, only the first two bases even matter - since it's prefix-free you can guess the amino acid if you see those two and even ignore the third.<p>Given how underutilized this space is, I'm not convinced that increasing the domain to 216 will lead to much more than the ability to express our current amino acid space with only two bases per codon.
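To make the counting above concrete, here is a minimal sketch that enumerates the sequences by brute force. The letter choices ("ACGT" plus "P" and "Z") and the helper name `count_sequences` are just illustrative assumptions.

```python
from itertools import product

STANDARD_BASES = "ACGT"                 # the four natural DNA letters
EXPANDED_BASES = STANDARD_BASES + "PZ"  # assumed labels for the two added letters


def count_sequences(alphabet: str, length: int) -> int:
    """Count every distinct sequence of exactly `length` letters by enumeration."""
    return sum(1 for _ in product(alphabet, repeat=length))


# Sequences of length 1, 2 and 3 over four letters: 4 + 16 + 64
up_to_three = sum(count_sequences(STANDARD_BASES, n) for n in (1, 2, 3))
# Fixed-length codons over four and six letters
four_letter_codons = count_sequences(STANDARD_BASES, 3)
six_letter_codons = count_sequences(EXPANDED_BASES, 3)

print(up_to_three, four_letter_codons, six_letter_codons)  # 84 64 216
```

Enumerating instead of just computing 4**3 is deliberate: it counts actual strings, so the numbers don't depend on trusting the formula.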
It would have been nice if the author had at least acknowledged that in reality they are nucleobases and not tiny, tiny letters curled up in our cell nuclei. Sure, 6-amino-5-nitro-2(1H)-pyridone and 2-amino-imidazo[1,2-a]-1,3,5-triazin-4(8H)one don't say much to us laymen, but just saying letters and not mentioning once what they stand for is really poor reporting.
Nitpick: it wouldn't be a potential 216. Some three-"letter" sequences code for the same amino acids, so instead of 4^3 (64) possible amino acids, only 20 are generated. Adding new letters doesn't change what these <i>old</i> words create, so I think there would only be a possible maximum of 172.<p>(I think I did my math right, but maybe not.)<p>(edit: thanks duaneb, had my basic bio facts wrong - codons code for amino acids, not proteins.)
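The arithmetic behind that 172 can be spelled out as a short sketch. It assumes, as the comment does, that the 64 existing codons keep their current meanings (20 amino acids, ignoring stop codons) and that every newly available codon could get an amino acid of its own. The function name and parameters are hypothetical.

```python
def max_amino_acids(old_bases: int, new_bases: int,
                    codon_length: int, existing_amino_acids: int) -> int:
    """Upper bound on codable amino acids if old codons keep their meaning."""
    old_codons = old_bases ** codon_length    # 4**3 = 64
    all_codons = new_bases ** codon_length    # 6**3 = 216
    new_codons = all_codons - old_codons      # codons containing a new letter
    # Best case: each new codon is assigned its own, previously unused amino acid.
    return existing_amino_acids + new_codons


upper_bound = max_amino_acids(old_bases=4, new_bases=6,
                              codon_length=3, existing_amino_acids=20)
print(upper_bound)  # 172
```

This is only a ceiling on the bookkeeping. It says nothing about whether tRNAs or ribosomes could actually read the new codons.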
I'm not convinced that this is necessarily a good idea biologically, especially after talking to a couple of my friends who are researchers in this space. However, this seems quite interesting for non-biological applications. Take cold storage, for example--with a third base pair, we can obviously develop an even denser data storage format than with regular DNA.
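How much denser? A rough sketch, assuming an idealized encoding where every position can hold any letter independently (no error correction, no sequence constraints, which real DNA storage schemes do need). The helper name `bits_per_base` is made up for illustration.

```python
import math


def bits_per_base(alphabet_size: int) -> float:
    """Idealized information per position: log2 of the number of letters."""
    return math.log2(alphabet_size)


four_letter = bits_per_base(4)            # 2.0 bits per position
six_letter = bits_per_base(6)             # about 2.585 bits per position
density_gain = six_letter / four_letter   # about 1.29x

print(f"{four_letter:.3f} {six_letter:.3f} {density_gain:.3f}")
```

So under these assumptions the gain is roughly 29% per position, not 50%, because capacity grows with the logarithm of the alphabet size.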
Neat, but extending amino acids would be even cooler. DNA is mostly "just" a string encoding for information, like binary or hexadecimal. Proteins on the other hand are the actual machines whose blueprints are written in DNA, and they're built out of amino acids. Extending the set of amino acids could extend the set of basic building blocks available to create biomolecular machines.<p>Of course, teaching ribosomes to handle them and so on will take a lot of additional work, but identifying promising new amino acids would be a nice and major first step.
This sounds a lot like a story from 20 years ago, that was probably in Discover Magazine or Scientific American. The new nucleotides at that time were labeled kappa and chi.<p>And as a point of fact, three-base segments of DNA do not have a one-to-one mapping to amino acids. I also believe that a non-standard reading of one of the three stop codons (UGA) can insert selenocysteine instead of terminating, with similar special cases for other proteins using rare amino acids.<p>Furthermore, 6^3=216, but that doesn't mean that adding a new base pair can code for that many amino acids. The original set of 4, with 64 possible codons, usually encodes 20 amino acids (excepting special cases as with selenocysteine). mRNA also employs uracil and tRNA adds hypoxanthine. These lead to "<i>wobble pairs</i>" which in turn allow a single tRNA to match several different-but-synonymous codons.<p>As it stands now, every codon without a matching tRNA would be a different variety of stop codon.<p>Now, what would be interesting to me is if the P-Z pairs could match some tRNA anticodons that translate stereoisomers of the standard 20 amino acids (or actually just the 19 that are chiral). That way, the D-(KLAKLAK)2 apoptosis promoter sequence could be synthesized directly by the ordinary transcription-translation mechanics of a cell.
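The many-to-one mapping is easy to check against the standard genetic code. The sketch below builds the table from the usual compact 64-character string (bases ordered T, C, A, G, third position varying fastest, "*" for stop) and counts synonymous codons. It covers only the standard code, none of the special cases mentioned above, and the variable names are my own.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"  # ordering used by the compact standard-code string
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 DNA codons to a one-letter amino acid ("*" = stop).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons encode each amino acid (degeneracy of the code).
degeneracy = Counter(CODON_TABLE.values())
stop_codons = degeneracy.pop("*")

# Two-base prefixes where the third base makes no difference at all.
fourfold_prefixes = [
    first + second
    for first, second in product(BASES, repeat=2)
    if len({CODON_TABLE[first + second + third] for third in BASES}) == 1
]

print(len(degeneracy), stop_codons, len(fourfold_prefixes))  # 20 3 8
```

Eight of the sixteen two-base prefixes fully determine the amino acid, which is the redundancy wobble pairing takes advantage of.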
This article is ignorant.<p>>Why nature stuck with four letters is one of biology’s fundamental questions. Computers, after all, use a binary system with just two “letters” — 0s and 1s. Yet two letters probably aren’t enough to create the array of biological molecules that make up life. “If you have a two-letter code, you limit the number of combinations you get,” said Ramanarayanan Krishnamurthy, a chemist at the Scripps Research Institute in La Jolla, Calif.<p>This simply isn't true. Even with regular DNA, the word size is 3 nucleotides long... giving you 64 instructions. If I remember my high school biology, only some of these are even used, the rest are duplicates or unused.<p>Binary would work too, assuming ribosomes and mRNA could expand the word size... you only need 6 bits to do the same as natural DNA.<p>Is there something I don't know that fixes word size at 3 nucleotides?
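The "6 bits" figure follows from a simple calculation: find the shortest word length whose combinations cover the number of meanings you need. A minimal sketch, with a hypothetical helper `min_word_length`, purely combinatorial and ignoring any chemistry:

```python
def min_word_length(alphabet_size: int, meanings: int) -> int:
    """Smallest word length n such that alphabet_size**n >= meanings."""
    length = 1
    while alphabet_size ** length < meanings:
        length += 1
    return length


# Matching all 64 natural codons with a two-letter alphabet takes 6 positions.
binary_for_64 = min_word_length(2, 64)
# Covering just 20 amino acids plus a stop signal (21 meanings):
binary_for_21 = min_word_length(2, 21)   # 2**5 = 32 is enough
four_for_21 = min_word_length(4, 21)     # 4**2 = 16 is too few, so 3
six_for_21 = min_word_length(6, 21)      # 6**2 = 36 is enough

print(binary_for_64, binary_for_21, four_for_21, six_for_21)  # 6 5 3 2
```

This only shows that the combinatorics allow a binary code. Whether the translation machinery could work with longer words is the open question in the comment.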
Not sure I understand the benefit. It's denser, but from what I understand DNA generally doesn't have much in the way of size constraints. If I remember correctly, large swathes of DNA are inactive and there isn't selective pressure to clean up this wasted space. Couple that with the fact that it is apparently more error-prone, and that seems to show why evolution didn't go down this path.<p>Probably will be very useful for synthetic purposes where there isn't too much concern about fidelity after 10 million years of copying.
Very interesting concept. One thing I noticed after developing several genetic algorithms on my own is that they tend to give a good creative <i>hint</i> at what the solution to the problem should be, which the human mind can then interpret and produce what the genetic algorithm was "trying" to approach. I wonder if the same could be true with biological evolution, that there are better ways of storing genetic information than DNA and all that, but that DNA is a good guideline to what should be done.
Even with P-Z, DNA would have a major and a minor groove rather than being the symmetrical double helix beloved by virtually all illustrators, sadly also those of pop sci articles...
The "enhanced" DNA escapes into the wild where a new pathogen spreads over Earth. All life is defenseless against the bizarre genetic alphabet...