I'm sure it's more complex than I grasp as a layperson, but I'm utterly amazed at how simple this _appears_. I get the feeling that this is something I have a better chance of understanding than the average SaaS Terms and Conditions.<p>I expected to have to scroll through pages upon pages of indecipherable text. Instead it's no bigger than a large paragraph of text, and I can easily fit it on my screen.
My first thought was `wdiff pfizer moderna`. It's short enough to post here in its entirety, but I guess I had better not. Anyway, it's easy enough to extract from the PDF. Add a space after every letter and wdiff can find the common sequences nicely.<p>Short excerpt for flavor, this is from near the beginning:<p>A[-G-]AGA{+A+}GAA{+ATATAAGAC+}CCCG{+GCGCCG+}CCACCATGTTCGTGTTCCTGGTGCTGCTGCC[-T-]{+C+}
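For anyone without wdiff handy, roughly the same letter-by-letter comparison can be sketched with Python's standard `difflib`. This is only an illustration, not what was actually run above: the function name `inline_diff` is made up, and the output merely imitates wdiff's `[-...-]` / `{+...+}` markers.

```python
from difflib import SequenceMatcher


def inline_diff(a: str, b: str) -> str:
    """Render a wdiff-style inline diff of two sequences, one letter per token."""
    out = []
    # autojunk=False keeps difflib from treating frequent letters as junk,
    # which matters for a four-letter alphabet.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(a[i1:i2])
            continue
        if i2 > i1:  # letters present only in the first sequence
            out.append("[-" + a[i1:i2] + "-]")
        if j2 > j1:  # letters present only in the second sequence
            out.append("{+" + b[j1:j2] + "+}")
    return "".join(out)


print(inline_diff("AAT", "AAC"))  # AA[-T-]{+C+}
```

Working on characters directly avoids the "add a space after every letter" step, since `SequenceMatcher` already treats each character as its own element.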
Despite how complex this really is, and how many "gotchas" there might be when using this repository, it's nice that it gets a shitload of attention. As a united humanity we should strive to solve our common problems.
If my little knowledge from biology class serves me correctly, RNA uses Uracil instead of Thymine. But this document uses T.<p>Can somebody explain to me why?
Wow. It looks analogous to having a header on a TCP packet. [0] Here is an animation of mRNA being translated into protein inside a ribosome. [1]<p>"The ribosome is composed of one large and one small subunit that assemble around the messenger RNA, which then passes through the ribosome like a computer tape. The amino acid building blocks, that's the small glowing red molecules, are carried into the ribosome attached to specific transfer RNAs; that's the larger green molecules also referred to as tRNA. The small subunit of the ribosome positions the mRNA so that it can be read in groups of three letters known as a codon."<p>Very analogous indeed.<p>[0] <a href="https://xerocrypt.wordpress.com/2014/07/22/how-to-read-almost-raw-tcpip-packet-headers-without-the-tools/" rel="nofollow">https://xerocrypt.wordpress.com/2014/07/22/how-to-read-almos...</a><p>[1] <a href="https://www.youtube.com/watch?v=TfYf_rPWUdY" rel="nofollow">https://www.youtube.com/watch?v=TfYf_rPWUdY</a>
The Human Genome Project was completed almost two decades ago, and somebody solved the protein folding problem recently.<p>Why are we still doing genetics at the machine code level? Shouldn't we have some compilers, assemblers and linkers by now?
I’m a little confused by the title? Looking at the document, it seems to me (knowing next to nothing about this field) it includes both Pfizer and Moderna’s protein spike sequence in figures 1 and 2, respectively. Is that correct?<p>It’s also interesting the way it’s worded: that the sequence was “assembled from $vaccine”. Does that mean whoever published this has backed into these sequences rather than having gathered this information directly from the source(s)?
we wrote some code last year to build a big Trie of the whole transcriptome -- you could use it to fuzzy-search to see if this mRNA is within some edit distance of any piece of normal human RNA, because then it could theoretically cause side effects via RNA interference. stopped the project because I can't afford to develop a gene therapy right now, but the fuzzy search worked<p><a href="https://github.com/bionicles/coronavirus" rel="nofollow">https://github.com/bionicles/coronavirus</a><p>to make the trie use the function here. the variable K is the length of the Kmers (runs of RNA). Larger values are gonna take a lot longer. ( warning: big job, uses multiprocessing...pypy recommended for speed )
<a href="https://github.com/bionicles/coronavirus/blob/b6f0db9dd8aaf7475aebd75dfcafe77194a65e8d/bio_firewall.py#L100" rel="nofollow">https://github.com/bionicles/coronavirus/blob/b6f0db9dd8aaf7...</a><p>then you could use this recursive function to generate potential matches within some cutoff
<a href="https://github.com/bionicles/coronavirus/blob/b6f0db9dd8aaf7475aebd75dfcafe77194a65e8d/bio_firewall.py#L174" rel="nofollow">https://github.com/bionicles/coronavirus/blob/b6f0db9dd8aaf7...</a><p>the function right below it converts the generator to a list. then you could save that<p>enjoy
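For readers who don't want to dig through the repo, here is a minimal sketch of the general idea (a trie of k-mers plus a recursive search that prunes by edit distance). It is not the code from the linked repository: the names `build_trie`, `fuzzy_search` and `_walk` are made up for illustration, it is single-process, and it uses plain Levenshtein distance, which may differ from whatever the repo does.

```python
def build_trie(sequence: str, k: int) -> dict:
    """Insert every k-mer of `sequence` into a nested-dict trie."""
    root = {}
    for i in range(len(sequence) - k + 1):
        node = root
        for base in sequence[i:i + k]:
            node = node.setdefault(base, {})
        node["$"] = True  # marks the end of a stored k-mer
    return root


def fuzzy_search(trie: dict, query: str, max_dist: int):
    """Yield (kmer, distance) for stored k-mers within max_dist edits of query."""
    first_row = list(range(len(query) + 1))
    for base, child in trie.items():
        if base != "$":
            yield from _walk(child, base, base, query, first_row, max_dist)


def _walk(node, base, prefix, query, prev_row, max_dist):
    # One Levenshtein row update for the extra letter `base` on this trie path.
    row = [prev_row[0] + 1]
    for j in range(1, len(query) + 1):
        cost = 0 if query[j - 1] == base else 1
        row.append(min(row[j - 1] + 1, prev_row[j] + 1, prev_row[j - 1] + cost))
    if "$" in node and row[-1] <= max_dist:
        yield prefix, row[-1]
    if min(row) <= max_dist:  # prune branches that can no longer match
        for nxt, child in node.items():
            if nxt != "$":
                yield from _walk(child, nxt, prefix + nxt, query, row, max_dist)


trie = build_trie("ACGTACGA", 4)
print(sorted(fuzzy_search(trie, "ACGT", 1)))  # [('ACGA', 1), ('ACGT', 0)]
```

The pruning step is what makes the trie worthwhile: a whole subtree is skipped as soon as every entry in the current row exceeds the cutoff.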
What are the purple and blue sections after the stop codon for? I read a little about the 3' region, but for the vaccine, are these sections taken from a particular natural human sequence, or specially engineered for something else?
Related: Here's an article from late last year describing and explaining the source code of the Pfizer vaccine:<p><a href="https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/" rel="nofollow">https://berthub.eu/articles/posts/reverse-engineering-source...</a><p>It's a very interesting read and I hope the author makes another post explaining the differences between the two mRNA vaccines.
I highly recommend reading about ribosomes. They are made up of two pieces that were likely independent at some time. It becomes quite clear that "life" began as a machine that could do nothing but replicate itself:<p><a href="https://en.wikipedia.org/wiki/Ribosome" rel="nofollow">https://en.wikipedia.org/wiki/Ribosome</a><p>You can think of RNA as a copy of a section of DNA. These copies look very much like computer programs, except that the output is not code: the ribosome reads them, translates each codon into its corresponding amino acid, and binds the amino acids together into a protein. The execution engine is the environment of the cell. It is all highly probabilistic rather than deterministic. I can't imagine any programmer not finding them completely fascinating.
It's also short enough to post the whole thing to Wikipedia, so that's probably inevitable along with some <i>very</i> entertaining edit wars.
As a non-biotech person, I believe I understand at a high level what this does: plonk this code into a ribosome and out comes the desired protein.<p>What I don't understand is:<p><pre><code> a) how the m-RNA code relates to the produced protein (i.e. I can read C code and get an idea of what it does fairly quickly, but can the same be said of m-RNA and the resulting protein)?
b) how did they get their hands on that code in the first place? Do the coronaviruses use m-RNA as well? Was a coronavirus then somehow "dissected" to get at the spike protein "source code"?</code></pre>
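On (a), the codon-to-amino-acid step is a plain table lookup, which can be sketched in a few lines. This is a minimal sketch assuming the standard genetic code and a reading frame that starts at the first letter passed in; `CODON_TABLE` and `translate` are names made up for illustration. It says nothing about what the resulting protein does or how it folds, which is the hard part.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a coding sequence (T or U spelling) into one-letter amino acids."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# The letters from the ATG onward in the excerpt posted upthread.
print(translate("ATGTTCGTGTTCCTGGTGCTGCTG"))  # MFVFLVLL
```

So reading off the amino acid chain is mechanical; the lookup is many-to-one, because 64 codons map onto 20 amino acids plus stop.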
I compared the spike encoding regions, and it looks like they're quite different...I wonder if the codons wind up coding for different amino acids. And who got it right?
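One way to check is to walk both coding regions codon by codon and see whether differing codons still map to the same amino acid. Below is a minimal sketch assuming the standard genetic code and two sequences that are already aligned and in frame; `compare_codons` is a made-up name. As a data point, the excerpt posted upthread ends with CCT in one sequence and CCC in the other, and (assuming the frame starts at the ATG) both code for proline.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def compare_codons(a: str, b: str) -> dict:
    """Count identical, synonymous and amino-acid-changing codon positions."""
    counts = {"identical": 0, "synonymous": 0, "different": 0}
    for i in range(0, min(len(a), len(b)) - 2, 3):
        codon_a, codon_b = a[i:i + 3], b[i:i + 3]
        if codon_a == codon_b:
            counts["identical"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1  # different letters, same amino acid
        else:
            counts["different"] += 1
    return counts


print(compare_codons("ATGTTCCCT", "ATGTTTCCC"))
# {'identical': 1, 'synonymous': 2, 'different': 0}
```

If the "different" count comes out as zero over the whole spike region, the two sequences would encode the same protein despite looking quite different at the letter level.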
The lipid container is weird to me. Is that all it takes to send instructions inside a cell? Seems like a security hole. Why haven’t viruses evolved to have a lipid container?
People joked a lot about "injectable source code / machine code" but it is kind of interesting injecting yourself with something that has the source on GitHub.
> So how different is the mRNA in the Moderna, BioNTech/Pfizer & CureVac vaccines? There are 1274 codon positions. 808 are identical across all 3 vaccines. 103 are unique to Moderna, 249 unique to BioNTech, 230 to CureVac<p><a href="https://twitter.com/PowerDNS_Bert/status/1375091898797453326" rel="nofollow">https://twitter.com/PowerDNS_Bert/status/1375091898797453326</a>
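A tally like this is easy to reproduce once the three coding regions are aligned. The sketch below assumes "unique to X" means X's codon differs from both of the others at that position; the tweet does not define it, so that reading is a guess, and `tally_codons` is a made-up name. Under that reading the quoted numbers are at least self-consistent: 808 + 103 + 249 + 230 = 1390, which exceeds 1274 by 116, as expected if 58 positions differ in all three and are therefore counted three times instead of once.

```python
def tally_codons(seqs: dict) -> dict:
    """Count codon positions shared by all sequences or unique to one of them."""
    names = list(seqs)
    length = min(len(s) for s in seqs.values())
    counts = {"identical": 0}
    counts.update({name: 0 for name in names})
    for i in range(0, length - 2, 3):
        codons = {name: seqs[name][i:i + 3] for name in names}
        if len(set(codons.values())) == 1:
            counts["identical"] += 1
            continue
        for name in names:
            others = [codons[o] for o in names if o != name]
            if codons[name] not in others:  # differs from every other sequence
                counts[name] += 1
    return counts


# Toy input, not real vaccine data.
toy = {"a": "ATGTTCCCT", "b": "ATGTTCCCC", "c": "ATGTTTCCA"}
print(tally_codons(toy))  # {'identical': 1, 'a': 1, 'b': 1, 'c': 2}
```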
So you have a header/footer sequence that we sort of know is required (remember the MZ and chksum for .EXE files), but we have no idea what the bits in between do, except that we can read the letters and that they were copied in part from the actual virus.
'A group of Stanford researchers has hacked Moderna’s messenger RNA (mRNA) vaccine for the novel coronavirus, Motherboard first reported on Monday, and published its entire genetic sequence on the open-source code repository Github.'<p><a href="https://gizmodo.com/stanford-scientists-post-entire-mrna-sequence-for-moder-1846576268" rel="nofollow">https://gizmodo.com/stanford-scientists-post-entire-mrna-seq...</a>
So I guess Josiah Zayner has to pick up on this now and do a DIY Moderna COVID vaccine video. He already did a DIY vaccine video with full open source documentation on how to do it yourself.<p><a href="http://www.josiahzayner.com/2020/12/i-made-covid-19-vaccine-in-my-kitchen.html" rel="nofollow">http://www.josiahzayner.com/2020/12/i-made-covid-19-vaccine-...</a>
If you understand how the sequence mutates, then you can predict what the next strain is going to be and design a spike protein that matches it.
ELI5 could this be used by "evil governments" to make designer pathogens to release during doomsday situations (say by North Korean leaders in their suicide bunkers if things went badly) ?
tangential: do biologists sometimes use some form of base 64 encoding for their triplets? so instead of AAG.TCA.GGA just g5F or something?<p>other than the obvious advantage of being shorter, it would also be easier to read: the boundaries would be unambiguous and each char would correspond directly to an amino acid (if applicable/coding)
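Since there are exactly 4^3 = 64 codons, a one-character-per-codon encoding is straightforward to sketch. The mapping below is an arbitrary one invented for illustration (codons in alphabetical A, C, G, T order paired with the usual base64 alphabet), not an established convention; `encode` and `decode` are made-up names. The existing shorthand that comes closest is the one-letter amino acid code, but that one is lossy because several codons share an amino acid.

```python
import string
from itertools import product

# 64 symbols: A-Z, a-z, 0-9, "+", "/" (the usual base64 alphabet).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
ENCODE = dict(zip(CODONS, ALPHABET))
DECODE = dict(zip(ALPHABET, CODONS))


def encode(seq: str) -> str:
    """Pack an in-frame sequence into one character per codon."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(ENCODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def decode(text: str) -> str:
    """Expand the one-character-per-codon form back into letters."""
    return "".join(DECODE[ch] for ch in text)


print(encode("AAGTCAGGA"))          # C0o
print(decode(encode("AAGTCAGGA")))  # AAGTCAGGA
```

The catch is that the encoding bakes in a reading frame: shift the start by one letter and every character changes, which is not true of the plain letter form.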
This is amazing. It appears quite "simple" - of course I know nothing about this part of the sciences.<p>I do think back to the early days of Covid when there were all these predictions around when a vaccine would show up. It seemed like there was knowledge that the mRNA platform would be the likely solution and probably by April we knew a vaccine would be possible - it just took 6+ months to test.<p>Thinking about that timeline amazes me.
One of Moderna's cofounders, MIT Prof Robert Langer, was profiled on 60 Minutes a few years back as MIT's most prolific patent holder. He specialized in nanoparticle delivery systems to any desired internal tissue. One can deliver medicine, nutrients, diagnostics, etc where and when they want. Vaccines are just a small subset of these applications.
As a software/hardware guy who knows less than zero about the subject: is this something that (given the right resources) makes it possible to replicate the vaccines? I mean in countries where they can't afford enough vaccines but already have or could invest in the ability to replicate them without caring about patents.
<a href="https://github.com/brianherman/Assemblies-of-putative-SARS-CoV2-spike-encoding-mRNA-sequences-for-vaccines-BNT-162b2-and-mRNA-1273" rel="nofollow">https://github.com/brianherman/Assemblies-of-putative-SARS-C...</a>
I posted some txt files with the lines removed and stuff.
Is this all another medical company needs to start manufacturing and selling the vaccine themselves? Or is this sequence licensed/proprietary in some way?
i'm a dna noob: is it possible to do the growing and sampling thing to get the sequence from a sample of the vaccine or does the bubble of fat get in the way?
My question is: does the Johnson & Johnson DNA-based vaccine encode the exact same spike protein, or a different one they chose to target? From this PDF I conclude that both the Moderna and Pfizer vaccines target the same protein.