It does indeed look a lot like Graphite, and since you explicitly mention in your
talk that your objective is to reimplement all the functions that are present
in Graphite, why not instead present your work as a port of the Graphite
language, with some extensions to work on other data sources and sinks (and dots
replaced by the fat pipe)?<p>This is interesting to me as I'm currently working on something similar: a
lightweight stream processor to allow system engineers to manipulate some large
streams of data while in flight to a database. And I've been wondering (and
still am) about the trade-offs between simple and expressive. Very early, I
decided not to be time series (TS) specific at all (since we were prevented from using an
off-the-shelf product for the very reason that our data does not look enough like a
TS: it has neither a single time field nor a single value field). Eventually, after a few
detours, we ended up favoring a SQL-like language because it is
field agnostic.<p>Regarding the language itself, the main differences I can see are that you
query over a time range while we process infinite streams, with the consequence
that we must explicitly tell each operation when it has to output values
(windowing), and that you have an implicit key and one TS per "group"
sharing that key, which makes piping many operations easier (but JOINing
harder), while we have to be more specific about how to group.<p>So for instance, where you have:<p><pre><code> from(db:"foo") |> window(every:20s) |> sum()
</code></pre>
we would have the more SQL-like:<p><pre><code> select sum value from foo group by time // 20
</code></pre>
("//" being integer division).<p>Or, if you needed the additional start and stop columns added by window():<p><pre><code> select sum value, (time // 20)*20 AS start, start+20 AS stop group by start
</code></pre>
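<p>For what it's worth, the bucketing arithmetic in that last query can be sketched in a few lines of Python over a finite list of rows. This is only an illustration of the integer-division trick, not how either system is implemented; the function name tumbling_sums and the (time, value) row shape are assumptions made up for the example.

```python
from collections import defaultdict


def tumbling_sums(rows, width=20):
    """Sum `value` per fixed-width time bucket.

    `rows` is an iterable of (time, value) pairs, with time in seconds
    (a hypothetical row shape chosen for this sketch).
    Returns (start, stop, total) tuples ordered by start.
    """
    totals = defaultdict(float)
    for time, value in rows:
        # Same arithmetic as "(time // 20)*20 AS start" in the query.
        start = (time // width) * width
        totals[start] += value
    return [(start, start + width, total)
            for start, total in sorted(totals.items())]


rows = [(0, 1.0), (5, 2.0), (19, 3.0), (20, 4.0), (41, 5.0)]
print(tumbling_sums(rows))
# [(0, 20, 6.0), (20, 40, 4.0), (40, 60, 5.0)]
```

Note that this version only works because the input is finite: every bucket is emitted once, at the end.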
But then, because fluxlang processes a range of time while we stream "forever", we
would also have to tell it when to output a tuple, for instance after 20s have
passed:<p><pre><code> select sum value, (time // 20)*20 AS start, start+20 AS stop group by start commit after in.time > group.stop
</code></pre>
which gets verbose quickly.<p>But to us this constraint imposed by streaming (as opposed to querying a DB for
the data to process) is essential since our main use case is alerting from a
single box, so querying the last 10 minutes of data every minute for thousands
of defined alerts would just not work.<p>Another interesting difference is the type system. One thing I both like and
hate in SQL is the NULL. It's convenient for missing data but it's also the
SQL equivalent of the null pointer. So we have a type system that pays close
attention to it: we support the special case of algebraic data types where "type?" is a
NULLable "type", and NULLs must be dealt with before they reach a function
that does not accept NULLs. For instance, there is no way to compile a filter
whose condition can be NULL; one would have to COALESCE it first. What are
your thoughts on missing data? Do you manage to avoid the issue entirely,
including after a JOIN operation?<p>The other difference I noticed is how nice your query editor is. For now our
query editor is $EDITOR, but my plan is to build a data source plugin for
Grafana. What do you think of this approach?
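<p>To make the "commit" clause mentioned earlier a bit more concrete, here is a rough Python sketch of the streaming side. It is a simplified reading of "commit after in.time > group.stop", not our actual implementation: the generator name streaming_sums and the (time, value) tuple shape are invented for the example, and it assumes events arrive roughly in time order.

```python
def streaming_sums(stream, width=20):
    """Consume a possibly endless stream of (time, value) pairs and yield
    (start, stop, total) as soon as a group can be committed.

    Sketch only: a group is committed when an incoming event has a time
    strictly greater than the group's stop.
    """
    open_groups = {}  # start -> running sum of the group still being built
    for time, value in stream:
        # Collect the groups to commit first, then pop them, so the dict
        # is never modified while it is being iterated.
        done = sorted(s for s in open_groups if time > s + width)
        for start in done:
            yield (start, start + width, open_groups.pop(start))
        # Then aggregate the incoming event into its own group.
        start = (time // width) * width
        open_groups[start] = open_groups.get(start, 0.0) + value


events = [(0, 1.0), (5, 2.0), (19, 3.0), (20, 4.0), (41, 5.0), (70, 6.0)]
for out in streaming_sums(events):
    print(out)
# (0, 20, 6.0)
# (20, 40, 4.0)
# (40, 60, 5.0)
```

The group starting at 60 is never printed here: nothing later arrives to push it past its stop, which is exactly why each operation has to be told when to output.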