Anyone doing probabilistic programming with big data? I started experimenting with probabilistic programming frameworks several years ago, but couldn't get them to scale to the level of data I'm working with (~6 dimensions, ~100 trillion vectors), or even a small fraction of that. But I'm sure it's being done in scientific circles somewhere.<p>Are there communities to collaborate on probabilistic programming? It seems like the domain knowledge is obscure enough that all the good information is locked up inside big corporations and academia.
Welcome to the project I've been waiting for <i>years</i> to get out of alpha. It's frustrating. If I had a hundred million dollars I'd burn a couple million getting this funded. It seems like it will be useful to humanity.
This sounds kind of similar to the stuff this startup called "Prior Knowledge" was working on prior to being acquired by Salesforce:
<a href="https://www.crunchbase.com/organization/prior-knowledge#/entity" rel="nofollow">https://www.crunchbase.com/organization/prior-knowledge#/ent...</a>
Glad to see this is out as well! Using probabilistic frameworks has the potential to eliminate a lot of the human error that can easily enter a large simulation. It's fair to say that in the future probabilistic modules will become part of every standard library in every programming language, and distribution sampling functions will be as common as trig functions in a math library.<p>I am curious, though, how I would build up large queries in BQL (the SQL-like query language) or MML (the meta-modeling language). For the orbital example, we conceivably have only a relatively low-dimensional space. But what about a Bayes net for investigating genetic variants in a large genomic population? Doesn't this quickly become intractable?
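To put a rough number on the intractability worry: if you had to search over Bayes net structures directly, the number of possible labeled DAGs on n variables grows super-exponentially. Here is a minimal sketch that counts them with Robinson's recurrence. The function name `count_dags` is just something I made up for illustration, and this says nothing about how BQL/MML or the engine behind them actually model the data, which may not involve explicit structure search at all.

```python
from math import comb


def count_dags(n: int) -> int:
    """Count labeled DAGs on n nodes using Robinson's recurrence.

    a(0) = 1
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k*(m-k)) * a(m-k)
    """
    a = [1]  # a[0]: the empty graph counts as one DAG
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
        a.append(total)
    return a[n]


# Print the count for a few small variable counts.
for n in (1, 2, 3, 4, 5, 10, 20):
    print(n, count_dags(n))
```

Five variables already give 29281 candidate structures, so exhaustive search is hopeless long before you reach genome-scale variable counts. Anything practical has to rely on approximate or heuristic inference rather than enumeration.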