I work at a company that does applied machine learning, and health care is the #1 source of prospective business. The hardest part isn't the analysis; it's getting good, valid data. Health care data is sparsely collected and poorly structured (if at all), and the privacy rules around gaining access are very strict (perhaps rightfully so).<p>The key, as IBM is doing, is working with a large HMO or health care network that has hopefully switched to a sensible EMR and built up a good amount of historical data on patients.<p>I'd add one last point: the key to getting this right isn't only the breadth of the data, but the depth. Knowing only some superficial aspects of a person (age, weight, habits) is too naive. You need family history, and you need psychosocial aspects (nightmares, trouble at work, marital problems, etc.). If you can get <i>that</i>, then you're cooking.
We need to start using all the big data muscle on basic research into cell chemistry, and less on analysis of patient data.<p>I could probably fit all well-structured data elements in existence for all medical patients in the US for a year on a single DVD. (And even the unstructured non-image data is pretty sparse.)<p>This is not really a problem where "big data" techniques are the right tool: there's too little data. However, I believe we should be experimenting on cell cultures in laboratories at large scale. None of this is happening, I believe; no laboratory is running hundreds of thousands of cultures in parallel with carefully manipulated chemical environments and generating data from them.<p>A data set generated from such cell culture analysis could be petabytes in size. With that type of data set, I believe it would be possible, using big data techniques, to get a lot of rigorous, causal detail about the chemical cell pathways relevant to disease formation, detail that we currently lack. This is the direction I think we need to be going.<p>I certainly would love to be wrong and would love to get news that IBM has cracked heart disease with this new project. However, my guess is that this is a very inefficient way to apply big data techniques to curing diseases.
For whose benefit? Speaking generally [1], as a patient in the U.S., I no longer believe it is for my (i.e. the patient's) benefit.<p>This conclusion comes from both the reporting on the health care landscape that I've digested and repeated personal experience.<p>The U.S. system is, at scale, profit-driven. Profit in general serves a purpose; however, in the U.S., it has superseded the goal of providing effective health care.<p>In other words, in the U.S., it has become solely about short-term, private profit. Public good and longer-term societal benefit have been relegated to imagery.<p>--<p>1. Meaning not as someone having the specific condition under consideration.
This, of course, will have nothing to do with improving quality of life or preventing heart disease, but plenty to do with denying health insurance or increasing premiums.