This is an interesting project, and it's illuminating to see what it takes to emulate some R features in Python (custom infix ops, non-standard evaluation, dataframes as namespaces/environments, etc.).<p>But I feel like it would be better to pipe transformations with method chaining rather than by overloading operator dunder methods. That would preserve one of the nice things about dplyr, composing complicated transformations from a simple vocabulary, while being more pythonic. That kind of composability is a relative weakness I see in the design of pandas, and I would love to see it ported over.<p>But also, dplyr really goes beyond pandas. It's an elegant, SQL-like DSL for transforming (mostly) arbitrary data. In this way it's more like LINQ than a specific implementation/API of a data structure.
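To make the method-chaining idea concrete, here is a rough sketch in plain Python. The `Frame` class and its verbs (`filter`, `mutate`, `select`, `arrange`) are hypothetical names borrowed from dplyr's vocabulary; this is not the API of Dplython, pandas-ply, or pandas, just an illustration of composing a small set of verbs without operator overloading.

```python
class Frame:
    """Hypothetical minimal table: a list of row dicts with chainable verbs."""

    def __init__(self, rows):
        # Copy each row so that verbs never mutate the caller's data.
        self.rows = [dict(r) for r in rows]

    def filter(self, predicate):
        # Keep only the rows for which predicate(row) is truthy.
        return Frame(r for r in self.rows if predicate(r))

    def mutate(self, **new_columns):
        # Add columns; each value is a function from a row to the new cell.
        return Frame(
            {**r, **{name: fn(r) for name, fn in new_columns.items()}}
            for r in self.rows
        )

    def select(self, *names):
        # Keep only the named columns.
        return Frame({n: r[n] for n in names} for r in self.rows)

    def arrange(self, key, descending=False):
        # Sort rows by a single column.
        return Frame(sorted(self.rows, key=lambda r: r[key], reverse=descending))


# Made-up sample data, for illustration only.
cars = Frame([
    {"model": "a", "mpg": 21.0, "wt": 2.6},
    {"model": "b", "mpg": 33.9, "wt": 1.8},
    {"model": "c", "mpg": 15.2, "wt": 3.4},
])

# Every verb returns a new Frame, so the pipeline reads top to bottom.
result = (
    cars
    .filter(lambda r: r["mpg"] > 20)
    .mutate(mpg_per_wt=lambda r: r["mpg"] / r["wt"])
    .select("model", "mpg_per_wt")
    .arrange("mpg_per_wt", descending=True)
)
```

Because each verb returns a new object of the same type, the chain composes the same way a pipe would, and ordinary tooling (autocomplete, tracebacks) keeps working.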
It looks like both Dplython and pandas-ply are missing what I think is one of the core value propositions of dplyr: the ability to use the same abstractions on local data and on remote data, with execution against the remote source happening lazily, so that the entire table doesn't need to be downloaded in order to run a filter locally.<p>(Of course, I may be biased in that I work on a commercial product which also has this characteristic.)
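A minimal sketch of what lazy execution against a remote source could look like, assuming an in-memory SQLite database as a stand-in for the remote server. `LazyTable`, `to_sql`, and `collect` are hypothetical names for illustration; this is not how dplyr or either Python library is implemented.

```python
import sqlite3


class LazyTable:
    """Hypothetical lazy query builder: records filters, runs SQL only on collect()."""

    def __init__(self, conn, table, conditions=(), params=()):
        self.conn = conn
        self.table = table
        self.conditions = tuple(conditions)
        self.params = tuple(params)

    def filter(self, condition, *params):
        # Nothing is executed here; we only remember the condition.
        return LazyTable(
            self.conn,
            self.table,
            self.conditions + (condition,),
            self.params + params,
        )

    def to_sql(self):
        # The table name is interpolated directly, so this sketch is not
        # safe for untrusted input; values go through "?" placeholders.
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql

    def collect(self):
        # Only now does the query run, and only matching rows come back.
        return self.conn.execute(self.to_sql(), self.params).fetchall()


# An in-memory database stands in for the remote source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, delay INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 5), ("UA", 42), ("AA", 61)],
)

query = LazyTable(conn, "flights").filter("carrier = ?", "AA").filter("delay > ?", 30)
rows = query.collect()
```

The filters are pushed down into a single `WHERE` clause, so the database does the work and the full table never has to be pulled into local memory.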
Piping in R actually comes from <a href="https://github.com/smbache/magrittr" rel="nofollow">https://github.com/smbache/magrittr</a>, not dplyr, and was inspired by F#.<p>"R package to bring forward-piping features ala F#'s |> operator."
I was not aware this was possible in Python, but now I see: <a href="http://stackoverflow.com/questions/33658355/piping-output-from-one-function-to-another-using-python-infix-syntax" rel="nofollow">http://stackoverflow.com/questions/33658355/piping-output-fr...</a>
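For anyone curious how infix piping can work, here is a small sketch based on Python's reflected-operator protocol (`__ror__`). The `Pipe` class and the `where`, `select`, and `total` helpers are hypothetical names made up for this example; it is not necessarily the approach taken in the linked answer or in Dplython.

```python
class Pipe:
    """Hypothetical wrapper that lets `value | step` call step's function on value."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __call__(self, *args, **kwargs):
        # Calling a Pipe binds extra arguments and returns a new Pipe,
        # so `where(pred)` is itself something you can pipe into.
        return Pipe(self.func, *args, **kwargs)

    def __ror__(self, value):
        # Python tries this for `value | pipe` when the left operand's
        # own __or__ is missing or returns NotImplemented.
        return self.func(value, *self.args, **self.kwargs)


@Pipe
def where(items, predicate):
    return [x for x in items if predicate(x)]


@Pipe
def select(items, fn):
    return [fn(x) for x in items]


@Pipe
def total(items):
    return sum(items)


result = (
    [1, 2, 3, 4, 5, 6]
    | where(lambda x: x % 2 == 0)
    | select(lambda x: x * x)
    | total
)
```

The trick only works when the left-hand object does not already handle `|` for the right-hand type, which is one reason operator-based pipes can feel fragile compared with plain method chaining.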