I have been interested in protein folding for a while and now have the time to dive into it.<p>I have waded through some Wikipedia articles, watched some YouTube videos, and done some googling, but I haven't found any good resource on algorithms for protein folding or for bioinformatics in general.<p>What can computer scientists do to help?<p><pre><code> * Invent better algorithms so biologists and chemists can run more experiments?
 * Is there some constraint, some condition, some fold we can look for to find a new medicine or something?
</code></pre>
A basic introduction to computational biology and the relevant algorithms would be much appreciated.<p>I found this (genetic programming for protein folding):
http://www.techfak.uni-bielefeld.de/bcd/Curric/ProtEn/121.html<p>Ah, and of course MIT:
http://ocw.mit.edu/OcwWeb/Biology/7-91JSpring2004/CourseHome/
I'm actually a bioinformatics major and a bioinformatician by profession. We work with computer scientists a lot, but one thing we find very frustrating is that 99% of them don't know biology, biochemistry, chemistry, or organic chemistry to the necessary degree. They also don't know how to read or interpret the lab tests we run.<p>Personally, I think that computer programming is something that anybody can do. The advantage computer scientists have is a deep understanding of the inner workings of computers and of computer language structure, which is why they are able to optimize so well.<p>Since I'm currently working in the field of bioinformatics, I'll tell you this... KNOW YOUR SCIENCE!! You wouldn't believe how many times a computer scientist will optimize the hell out of an algorithm, make it look great and run like butter, only to have it junked in the end because it doesn't make any scientific sense.<p>As for the computer languages we use, we use Perl a lot... too much, even. Perl has become our staple language because it takes less than a minute to write a good script if you know what you're doing. Python is popular too, but I'd say the majority of people use Perl.<p>Another important language most bioinformaticians use is C/C++. Why? Wouldn't you want to use a faster language (C) to crunch 100 gigabits of genetic data instead of a slower one (Perl)?<p>And note, bioinformatics and computational biology are two different fields. Treating them as the same thing is a very common misconception. Do a little bit of research and you will see the difference.<p>Protein folding is one of the more prominent areas of biology being researched right now. Good luck with the learning and feel free to contact me.
After doing some more research I have found:
Perl and Python are very popular languages in bioinformatics. I love Python and I don't know Perl, so the choice is easy: Python it is.<p>The big, well-known library is Biopython:
<a href="http://biopython.org/wiki/Main_Page" rel="nofollow">http://biopython.org/wiki/Main_Page</a><p>Course, Bioinformatics and Python:
<a href="http://www.pasteur.fr/recherche/unites/sis/formation/python/" rel="nofollow">http://www.pasteur.fr/recherche/unites/sis/formation/python/</a><p>The foldit game:
<a href="http://fold.it/portal/adobe_main" rel="nofollow">http://fold.it/portal/adobe_main</a>
I suggest you start out with some basic chemistry. There you can find the data on how the amino acid chains line up and fold and twist. There was that game somebody released where you "fold" with the mouse. I think it was an experiment to find out if humans can do the folding faster than computers. That should give you an idea of the problem.<p>The core problem in protein folding is that all the parts interact heavily with each other. That means the work is not easily split across nodes. Forget about networked nodes; the latency is far too high.<p>Some academic groups have proposed a funnel shape to the folding probabilities. That is to say, initially things could go in any direction and the possibility space is HUGE. But as the protein grows, the number of possible moves collapses quickly and you are left with very few moves at the end.<p>I think the Futures in Biotech podcast had at least one show on protein folding, fun to listen to.<p>Hope this helps, enjoy yourself.
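To get a feel for how fast the possibility space mentioned above grows, here is a minimal sketch on a toy model. It assumes a big simplification that is not from the comment: the chain lives on a 2D square grid, each residue sits on one grid point, and no two residues may share a point. Real proteins are nothing like this tidy, and the function name `count_conformations` is made up for illustration.

```python
def count_conformations(n_residues):
    """Count chain shapes of n_residues sites on a 2D square lattice.

    Toy assumption: a conformation is a self-avoiding walk starting at the
    origin. Rotations and mirror images are counted separately.
    """
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(pos, visited, remaining):
        # All residues placed: this is one complete conformation.
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:  # the chain may not cross itself
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    if n_residues < 1:
        return 0
    return extend((0, 0), {(0, 0)}, n_residues - 1)


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, count_conformations(n))
```

Even in this crude model the count grows by a factor of roughly 2.6 to 3 per added residue, so brute-force enumeration stops being practical long before you reach the length of a real protein.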
For an overview, read the DoE's primer:<p><a href="http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer/toc.html" rel="nofollow">http://www.ornl.gov/sci/techresources/Human_Genome/publicat/...</a><p>The U.S. Department of Energy does a surprising amount of research on genetics and bioinformatics. The reason: while the Manhattan Project was running, DoE scientists were aware that radioactive weapons would cause enormous and lasting damage, but really didn't know much about how radioactivity would affect living things specifically. So a parallel project was set up to study the effects of radiation on cells -- e.g. selectively damaging DNA and proteins and watching what happens to the organism. Research continued after the Manhattan Project ended, and eventually led to the Human Genome Project.<p>Another resource you should definitely be familiar with is NCBI:<p><a href="http://www.ncbi.nlm.nih.gov/" rel="nofollow">http://www.ncbi.nlm.nih.gov/</a><p>Yes, algorithms are an important area of research. Caveat: it's entirely driven by biology. For example, aligning two partially matching protein sequences requires a clever algorithm. Sounds like diff, right? The catch is, related sequences don't match particularly well until you take into account which transformations are more likely to occur in nature, which takes significant biochemistry to determine and use properly. So really, your best bet is to associate yourself with a university of some sort, since that's where most of the molecular biologists tend to hang out. Learn biology first, and you'll pick up algorithms in the process.
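The "sounds like diff" point above can be sketched in a few lines. This is a minimal global alignment score (Needleman-Wunsch style dynamic programming) run with two scoring functions: a diff-like one that only rewards exact matches, and one that also gives credit for swapping chemically similar residues. The similarity groups and all the numbers are made-up assumptions for illustration, not a real substitution matrix such as BLOSUM or PAM, and the function names are hypothetical.

```python
def align_score(a, b, score, gap=-2):
    """Best global alignment score of sequences a and b (linear gap cost)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            ))
        prev = curr
    return prev[-1]


def identity(x, y):
    """Diff-like scoring: residues either match exactly or they don't."""
    return 1 if x == y else -1


# Hypothetical, hand-picked groups of similar amino acids (toy data only).
SIMILAR_GROUPS = [set("ILVM"), set("DE"), set("KR"), set("ST"), set("FYW")]


def toy_substitution(x, y):
    """Toy scoring: partial credit for swaps within a similarity group."""
    if x == y:
        return 2
    if any(x in g and y in g for g in SIMILAR_GROUPS):
        return 1
    return -1


if __name__ == "__main__":
    print(align_score("ILKD", "VLRE", identity))          # -2
    print(align_score("ILKD", "VLRE", toy_substitution))  # 5
```

With plain identity scoring the two peptides look unrelated; once likely substitutions get credit, the same pair scores well. Picking those credits correctly is the part that needs the biochemistry.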
Although not exactly my field... This seems to be a good review of some of the algorithms currently used.<p><a href="http://arxiv.org/abs/0707.3382" rel="nofollow">http://arxiv.org/abs/0707.3382</a><p>By following the references therein you can probably track down the canonical papers for the area. The OCW course should also give you a broad overview of the subject, but before you can make any significant contributions in terms of algorithms and results, you need to thoroughly understand the biology behind it.
Well, if you want to get a clue about how molecular biologists think, you could do a lot worse than reading "The Eighth Day of Creation." It's a general history of molecular biology. If you find any of it confusing then a good introductory textbook may help; Molecular Biology of the Cell or Stryer would get you started (and will be available in any decent college library).<p>As for the protein folding or protein function question, google around for computational chemistry, but be warned -- this is tough stuff! But if you are shit hot, please come, as we need the help...
<a href="http://www.amazon.com/Molecular-Biology-Made-Simple-Third/dp/1889899070/" rel="nofollow">http://www.amazon.com/Molecular-Biology-Made-Simple-Third/dp...</a><p>...seems to be a good introduction to molecular biology in general, depending on how much background you have.