here's the announcement email (very boiled down version of the readme, essentially):<p>babbage is a library for easily gathering data and computing summary measures in a declarative way.<p>The summary measure functionality allows you to compute multiple
measures over arbitrary partitions of your input data simultaneously
and in a single pass. You just say what you want to compute:<p><pre><code> > (def my-fields {:y (stats :y count)
                   :x (stats :x count)
                   :both (stats #(+ (or (:x %) 0) (or (:y %) 0)) count sum mean)})
</code></pre>
and the sets that are of interest:<p><pre><code> > (def my-sets (-> (sets {:has-y #(contains? % :y)})
                   (complement :has-y))) ;; could also take intersections, unions
</code></pre>
And then run it with some data:<p><pre><code> > (calculate my-sets my-fields [{:x 1 :y 2} {:x 10} {:x 4 :y 3} {:x 5}])
 {:not-has-y
  {:y {:count 0}, :x {:count 2}, :both {:mean 7.5, :sum 15, :count 2}},
  :has-y
  {:y {:count 2}, :x {:count 2}, :both {:mean 5.0, :sum 10, :count 2}},
  :all
  {:y {:count 2}, :x {:count 4}, :both {:mean 6.25, :sum 25, :count 4}}}
</code></pre>
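To make the single-pass idea concrete, here is a rough sketch in plain Clojure. It is not how babbage is implemented, and summarize, base, and derived are hypothetical names invented for this illustration. Each base predicate and each field extractor runs once per element; derived sets (complements and so on) are decided from the cached membership results rather than by calling the predicate again:<p><pre><code> ;; Hypothetical sketch in plain Clojure; none of these names are babbage's API.
 (defn summarize
   "Single pass over data, tracking :count and :sum per set and field."
   [base derived fields data]
   (reduce
    (fn [acc elem]
      (let [;; base predicates: called once per element, results cached
            member? (into {} (for [[k p] base] [k (boolean (p elem))]))
            ;; field extractors: called once per element
            values  (into {} (for [[k f] fields] [k (f elem)]))
            ;; derived sets only look at the cached membership map
            in-sets (for [[s in?] derived :when (in? member?)] s)]
        (reduce
         (fn [acc s]
           (reduce-kv
            (fn [acc field v]
              (if (nil? v)
                acc ; a missing value contributes nothing
                (-> acc
                    (update-in [s field :count] (fnil inc 0))
                    (update-in [s field :sum] (fnil + 0) v))))
            acc
            values))
         acc
         in-sets)))
    {}
    data))
</code></pre>
Run on the same data as above. Unlike babbage, this sketch only tracks count and sum, and it simply omits a field that never had a value in a set instead of reporting {:count 0}:<p><pre><code> (summarize {:has-y #(contains? % :y)}
            {:has-y     :has-y
             :not-has-y (complement :has-y)
             :all       (constantly true)}
            {:x :x, :y :y}
            [{:x 1 :y 2} {:x 10} {:x 4 :y 3} {:x 5}])
 ;; => {:has-y     {:x {:count 2, :sum 5},  :y {:count 2, :sum 5}},
 ;;     :not-has-y {:x {:count 2, :sum 15}},
 ;;     :all       {:x {:count 4, :sum 20}, :y {:count 2, :sum 5}}}
</code></pre>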
The functions :x, :y, and #(+ (or (:x %) 0) (or (:y %) 0)) defined in
the fields map are called once per input element no matter how many
sets the element contributes to. The function #(contains? % :y) is also
called once per input element, no matter how many unions,
intersections, complements, etc. the set :has-y contributes to.<p>A variety of measure functions, and structured means of combining
them, are supplied; it's also easy to define additional measures.<p>babbage also supplies a method for running computations structured as
dependency graphs; this can make gathering the initial data for
summarizing simpler to express. To give an example that's probably
familiar from another context:<p><pre><code> > (defgraphfn sum [xs]
     (apply + xs))
 > (defgraphfn sum-squared [xs]
     (sum (map #(* % %) xs)))
 > (defgraphfn count-input :count [xs]
     (count xs))
 > (defgraphfn mean [count sum]
     (double (/ sum count)))
 > (defgraphfn mean2 [count sum-squared]
     (double (/ sum-squared count)))
 > (defgraphfn variance [mean mean2]
     (- mean2 (* mean mean)))
 > (run-graph {:xs [1 2 3 4]} sum variance sum-squared count-input mean mean2)
 {:sum 10
  :count 4
  :sum-squared 30
  :mean 2.5
  :variance 1.25
  :mean2 7.5
  :xs [1 2 3 4]}
</code></pre>
Options are provided for parallel, sequential, and lazy computation of
the elements of the result map, and for resolving the dependency graph
in advance of running the computation for a given input, either at
runtime or at compile time.
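<p>As an illustration of what resolving the graph in advance can buy you, here is a small plain-Clojure sketch. It is an assumption-laden toy rather than babbage's actual mechanism: nodes, resolve-order, and run-order are hypothetical names, and the graph is written out as a plain map instead of being built with defgraphfn. The evaluation order is computed once from the declared dependencies and then reused for each input:<p><pre><code> ;; Hypothetical sketch in plain Clojure; none of these names are babbage's API.
 (def nodes
   {:sum         {:deps [:xs]                 :f (fn [xs] (apply + xs))}
    :sum-squared {:deps [:xs]                 :f (fn [xs] (apply + (map #(* % %) xs)))}
    :count       {:deps [:xs]                 :f (fn [xs] (count xs))}
    :mean        {:deps [:count :sum]         :f (fn [n s] (double (/ s n)))}
    :mean2       {:deps [:count :sum-squared] :f (fn [n s2] (double (/ s2 n)))}
    :variance    {:deps [:mean :mean2]        :f (fn [m m2] (- m2 (* m m)))}})

 (defn resolve-order
   "Returns node names ordered so that each node follows its dependencies.
   inputs is the collection of keys the caller will supply.
   Throws if some node can never be satisfied (cycle or missing input)."
   [nodes inputs]
   (loop [done (set inputs), order [], pending (set (keys nodes))]
     (if (empty? pending)
       order
       (let [ready (filter #(every? done (:deps (nodes %))) pending)]
         (when (empty? ready)
           (throw (ex-info "unresolvable dependencies" {:pending pending})))
         (recur (into done ready)
                (into order ready)
                (reduce disj pending ready))))))

 (defn run-order
   "Evaluates nodes sequentially in a precomputed order against an input map."
   [nodes order input]
   (reduce (fn [result k]
             (let [{:keys [deps f]} (nodes k)]
               (assoc result k (apply f (map result deps)))))
           input
           order))
</code></pre>
With the order resolved once up front, each run only pays for the node calls themselves:<p><pre><code> (def order (resolve-order nodes [:xs])) ; computed once, reused per input
 (run-order nodes order {:xs [1 2 3 4]})
 ;; => {:xs [1 2 3 4], :sum 10, :count 4, :sum-squared 30,
 ;;     :mean 2.5, :mean2 7.5, :variance 1.25}
</code></pre>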