This is amazing. Jupyter is already a great tool for data science, and being able to interact with SQL directly (without a host/intermediate language) is very clean – SQL is a powerful language on its own, so there is no need for a host language. Furthermore, vega (<a href="https://vega.github.io/vega-lite/" rel="nofollow">https://vega.github.io/vega-lite/</a>) is an amazing, largely declarative visualization language that mixes well with SQL, which is itself declarative.
The only example of how to use it is an animation that is just impossible to read :(<p>Here are the examples in the live demo[1]:<p><pre><code> %LOAD sqlite3 db=chinook.db timeout=2 shared_cache=true
%XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN TRUE MARK square WIDTH 100 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
%XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN MAXBINS 3 MARK bar COLOR red WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
%XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo TYPE ordinal MARK bar COLOR green WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
%XVEGA_PLOT X_FIELD Name TYPE nominal Y_FIELD ArtistId BIN MAXBINS 1 MARK line COLOR purple WIDTH 200 HEIGHT 200 <> SELECT Name, ArtistId FROM artists LIMIT 10
%XVEGA_PLOT X_FIELD EmployeeId GRID false Y_FIELD ReportsTo MARK area COLOR pink WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
</code></pre>
[1] <a href="https://hub.gke2.mybinder.org/user/jupyter-xeus-xeus-sql-x6434d86/lab" rel="nofollow">https://hub.gke2.mybinder.org/user/jupyter-xeus-xeus-sql-x64...</a>
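Eyeballing those, the magics map fairly directly onto Vega-Lite encodings. As a rough illustration (my approximation, not the kernel's own implementation), the second %XVEGA_PLOT line would look something like this when written by hand in a Python kernel with Altair, assuming chinook.db is in the working directory:<p><pre><code> import sqlite3
 import pandas as pd
 import altair as alt

 # Pull the same two columns the magic queries.
 con = sqlite3.connect("chinook.db")
 df = pd.read_sql_query("SELECT EmployeeId, ReportsTo FROM employees", con)

 # Red bar chart, EmployeeId binned into at most 3 bins, 200x200 px.
 chart = (
     alt.Chart(df)
     .mark_bar(color="red")
     .encode(
         x=alt.X("EmployeeId:Q", bin=alt.Bin(maxbins=3)),
         y=alt.Y("ReportsTo:Q"),
     )
     .properties(width=200, height=200)
 )
 chart
</code></pre>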
The GitHub support for rendering notebooks is so nice (this was linked from the example pic caption: <a href="https://github.com/wangfenjin/xeus-tidb/blob/develop/examples/TiDB.ipynb" rel="nofollow">https://github.com/wangfenjin/xeus-tidb/blob/develop/example...</a>)
Jupyter would be even better if it supported the seamless combination of Python and SQL code cells.<p>My notebook code typically involves a data prep stage that queries a SQL database and then pulls the results into Python for more complex analysis, ML modelling, integration with external data sources, etc. So the notebook has a Python kernel, with the SQL usually embedded as """-quoted strings.<p>Does anyone have a solution for treating selected code cells as SQL - with SQL highlighting and tooltips - exposed as string variables to the Python code?<p>Sparkmagic [1] does part of this for Python/SQL/Spark interoperability, but as far as I recall, it doesn't support SQL syntax highlighting.<p>[1] <a href="https://github.com/jupyter-incubator/sparkmagic" rel="nofollow">https://github.com/jupyter-incubator/sparkmagic</a>
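For context, the status quo I'm describing looks roughly like this (the connection string, table and columns are placeholders, not a real schema):<p><pre><code> import pandas as pd
 from sqlalchemy import create_engine

 # SQL lives in a plain Python string, so the editor treats it as text:
 # no SQL highlighting, no tooltips.
 QUERY = """
 SELECT customer_id, SUM(amount) AS total_spent
 FROM orders
 GROUP BY customer_id
 """

 engine = create_engine("postgresql://user:password@host/dbname")  # placeholder connection string
 df = pd.read_sql_query(QUERY, engine)

 # From here on it is ordinary pandas / ML code.
 df.describe()
</code></pre>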
I followed the instructions and the SQL cells work, but the xvega ones return empty output and I'm not sure how to debug that.<p><pre><code> conda create -n xeus-sql
conda activate xeus-sql
conda install xeus-sql soci-mysql soci-postgresql soci-sqlite jupyterlab -c conda-forge
</code></pre>
In conda list I see, among others:<p><pre><code> xeus 1.0.0 h78d96c3_0 conda-forge
xeus-sql 0.0.8 h118ccdd_1 conda-forge
xvega 0.0.10 h4bd325d_0 conda-forge
xvega-bindings 0.0.10 h4bd325d_0 conda-forge
</code></pre>
Btw there's a typo in the instructions: soci-postresql -> soci-postgresql
This looks very promising! I will look into it.<p>When I want a tabular view, I currently use either the pandas read_sql_query method or the PandaSQL module. That does work, and I often do want to use pandas and SQL together (often back and forth, depending on the operation).<p>That said, I can see a use for this tool in my work - I'm often really just interested in interacting with a database in a more visually friendly way than the command line, and it would be great to stay with Jupyter rather than having to go through a completely different UI.<p>So, yeah, nice looking tool, and thanks for posting!
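For anyone who hasn't seen the PandaSQL route, it looks roughly like this (toy DataFrame just for illustration):<p><pre><code> import pandas as pd
 from pandasql import sqldf

 orders = pd.DataFrame({
     "customer": ["alice", "alice", "bob"],
     "amount": [10.0, 20.0, 5.0],
 })

 # pandasql runs SQLite-flavoured SQL against in-memory DataFrames
 # and hands the result back as a DataFrame.
 totals = sqldf(
     "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer",
     locals(),
 )
 print(totals)
</code></pre>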
We have used <a href="https://almond.sh/" rel="nofollow">https://almond.sh/</a> to create a Spark SQL interpreter using Jupyter Notebooks - plus a whole lot more, which you can see here: <a href="https://arc.tripl.ai/tutorial" rel="nofollow">https://arc.tripl.ai/tutorial</a><p>After seeing many companies write ETL using code, we decided it was too hard to manage at scale, so we provided this abstraction layer - which is heavily centered around expressing business logic in SQL - to standardise development (JupyterLab) and allow rapid deployments.