TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Stochastic gradient descent written in SQL

295 pointsby Lemaxoxoabout 2 years ago

30 comments

jackschultzabout 2 years ago
&gt;A machine learning algorithm which can be trained using SQL opens a world of possibilities. The model and the data live in the same space. This is as simple as it gets in terms of architecture. Basically, you only need a database which runs SQL.<p>First paragraph of the conclusion, and this very much fits with the mindset that&#x27;s been growing in me in the data world over the past few years. Databases are much more powerful than we think, they&#x27;re not going to go away, only get better, and having the data and the logic in the same space really removes tons of headaches. ML models, transformation of data, generating json for an API can all be done within the database rather than outside scripting language.<p>Are others seeing this? Are the current tradeoffs just that more people know python vs sql or database specific languages to where moving logic to postgres or snowflake is looked down on?
评论 #35056012 未加载
评论 #35061979 未加载
评论 #35059017 未加载
评论 #35055616 未加载
评论 #35056100 未加载
评论 #35061767 未加载
评论 #35059067 未加载
评论 #35056234 未加载
评论 #35056309 未加载
评论 #35059884 未加载
评论 #35055527 未加载
评论 #35057929 未加载
评论 #35058546 未加载
评论 #35055796 未加载
评论 #35055889 未加载
评论 #35060104 未加载
评论 #35055598 未加载
评论 #35058139 未加载
评论 #35059203 未加载
评论 #35064786 未加载
评论 #35064571 未加载
评论 #35056078 未加载
评论 #35060727 未加载
评论 #35061665 未加载
评论 #35055520 未加载
Terrettaabout 2 years ago
Title here is wrong. Title in article and headings in article are right: ONLINE gradient descent<p>It&#x27;s specifically <i>not</i> stochastic. From the article:<p><i>Online gradient descent</i><p><i>Finally, we have enough experience to implement online gradient descent. To keep things simple, we will use a very vanilla version:</i><p><i>- Constant learning rate, as opposed to a schedule.</i><p><i>- Single epoch, we only do one pass on the data.</i><p><i>- Not stochastic: the rows are not shuffled.</i> ⇠ ⇠ ⇠ ⇠<p><i>- Squared loss, which is the standard loss for regression.</i><p><i>- No gradient clipping.</i><p><i>- No weight regularisation.</i><p><i>- No intercept term.</i>
评论 #35060318 未加载
评论 #35059408 未加载
syatsabout 2 years ago
Just don&#x27;t.<p>SQL:<p>- does not allow for easy and clean importing of modules&#x2F;libraries<p>- is not easily to write tests for<p>- has limited support for a debugger<p>- lacks a consistent style for such large queries (plus most textbook cover fairly simple stuff) which means it&#x27;s hard for a developer to start reading someone else&#x27;s code (more than in other languages)<p>- clearly indicates in its name that it is a Query language.<p>Save yourself the trouble and all your collaborators the pain of working with this code in the future, of trying to add new features, of trying to reuse it in another project.<p>If you want to operate near the data, use PL&#x2F;Python for PostgreSQL.<p>EDIT: Fixed formatting.
评论 #35057315 未加载
评论 #35067828 未加载
zackmorrisabout 2 years ago
This is great! Moving away from the proprietary nature of GPUs and complex math gatekeeping should help democratize AI.<p>Has anyone converted stuff like gradient descent to set theory?<p><a href="https:&#x2F;&#x2F;support.unicomsi.com&#x2F;manuals&#x2F;soliddb&#x2F;7&#x2F;SQL_Guide&#x2F;2_GettingStartedSQL.03.04.html" rel="nofollow">https:&#x2F;&#x2F;support.unicomsi.com&#x2F;manuals&#x2F;soliddb&#x2F;7&#x2F;SQL_Guide&#x2F;2_G...</a><p><a href="https:&#x2F;&#x2F;www.sqlshack.com&#x2F;mathematics-sql-server-fast-introduction-set-theory&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.sqlshack.com&#x2F;mathematics-sql-server-fast-introdu...</a><p><a href="https:&#x2F;&#x2F;www.sqlshack.com&#x2F;learn-sql-set-theory&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.sqlshack.com&#x2F;learn-sql-set-theory&#x2F;</a><p>Right now AI algorithms kind of look imperative and stateful to me, like state machines. But there should be a functional or relational representation, where pattern matching would be used to derive the current state instead.<p>It&#x27;s trivial to go from functional to imperative representation, but often nearly impossible to go the other way. That&#x27;s why monadic (sorry if I&#x27;m mincing terms) logic is so troublesome. Mutable variables, futures&#x2F;promises, async, etc can&#x27;t be statically analyzed, which is why most code today is difficult to parallelize and stuff like C++ optimizers don&#x27;t even increase speed to the next order of magnitude. But functional languages have nearly infinite optimization potential through divide and conquer approaches like sharding and scatter-gather arrays that can run internally without side effects. In other words, we can throw hardware at SQL for linear speedup since it&#x27;s embarrassingly parallel, but might have limited success optimizing something like Python.
评论 #35062778 未加载
评论 #35062694 未加载
评论 #35063872 未加载
评论 #35061928 未加载
andrenotgiantabout 2 years ago
This is really interesting, but a basic part I don&#x27;t understand: What would it actually look like to run this on a live dataset?<p>If I understand correctly: you&#x27;d run the recursive query, it produces results for every step, effectively showing you the progression of output over time, and then once it hits &quot;present day&quot;, it completes and stops?<p>How would you generate results going forward? I.E. A minute elapses after the results return, do you have to re-run the whole query for all time?
评论 #35055021 未加载
评论 #35055676 未加载
krickabout 2 years ago
From the start I assumed this is a nice playful &quot;Hexing the technical interview&quot; kinda joke. But given the tone of the article, and some of the comments here… Uh, this cannot be serious, right?
评论 #35059555 未加载
评论 #35060146 未加载
glomgrilabout 2 years ago
Incredible post. I laughed when I saw the title, snickered at the first paragraph, and then proceeded to be blown away by the rest of it. Thought I was in for a joke and instead I&#x27;m thinking about the nature of ML Ops and what it&#x27;s become.
评论 #35063078 未加载
civilizedabout 2 years ago
In the comments here so far, we see a pattern we&#x27;ve seen before. When someone suggests doing something in SQL, there&#x27;s a lot of concern about SQL being a very limited programming language where it&#x27;s hard to do proper engineering.<p>Here&#x27;s I would really love to know: why is it that SQL is, to first order, the only language used to interact with databases, and SQL has about the same features as it did in the 70s? It seems analogous to if the general-purpose programming world stopped with C.
评论 #35060921 未加载
评论 #35058941 未加载
评论 #35059382 未加载
AndrewKemendoabout 2 years ago
This is genius I love it.<p>On stream inference with something like &quot;continuous&quot; data in the same structure I think is the final material form for &quot;AI&quot; so this is a great step towards that<p>Thanks for the writeup
pradeepchhetriabout 2 years ago
You might want to consider checking out ClickHouse which supports many ML functions natively:<p>- stochasticLinearRegression (<a href="https:&#x2F;&#x2F;clickhouse.com&#x2F;docs&#x2F;en&#x2F;sql-reference&#x2F;aggregate-functions&#x2F;reference&#x2F;stochasticlinearregression&#x2F;" rel="nofollow">https:&#x2F;&#x2F;clickhouse.com&#x2F;docs&#x2F;en&#x2F;sql-reference&#x2F;aggregate-funct...</a>)<p>- stochasticLogisticRegression (<a href="https:&#x2F;&#x2F;clickhouse.com&#x2F;docs&#x2F;en&#x2F;sql-reference&#x2F;aggregate-functions&#x2F;reference&#x2F;stochasticlogisticregression&#x2F;#agg_functions-stochasticlogisticregression" rel="nofollow">https:&#x2F;&#x2F;clickhouse.com&#x2F;docs&#x2F;en&#x2F;sql-reference&#x2F;aggregate-funct...</a>)
评论 #35055639 未加载
评论 #35076262 未加载
yttribiumabout 2 years ago
At least one F500 company had iteratively reweighted least squares implemented in MS SQL as of ~20y ago. When all you have is a hammer...
airstrikeabout 2 years ago
This is great. The only thing I dislike from this is using these variables to try to predict Adj Close when they are not at all correlated.<p>There are countless meaningful correlations in financial data that would have been just as easy to play around with. One truly valuable example would be to look at trading multiples of comparable companies. Sticking to P&#x2F;E would be easier as P is easily observable and forward-looking EPS estimates are generally always available. This would limit the exercise to more mature companies than the ones commonly discussed on HN but would make it actually meaningful
评论 #35060554 未加载
agnosticmantisabout 2 years ago
Just to disambiguate, here machine learning = linear regression. So no auto-grad needed as gradients are explicit and simple. Still interesting though.
episode0x01about 2 years ago
From a comment by the author:<p>&gt; Also, I can&#x27;t justify why, but my gut feeling is that the database should be an order of magnitude faster than Python, provided the implementation were to be improved.<p>Would be curious how that could end up being the case. Perhaps if NumPy wasn&#x27;t used at all? That would mean no vectorization or optimized storage.<p>Would be interesting to see how it scaled with length and dimensionality
victorNicolletabout 2 years ago
We&#x27;ve been doing this type of gradient descent in production for a couple of years now, and we&#x27;ve been very happy with it. A few insights from my experience:<p>One of the main benefits of doing differentiable programming over a relational database is that you can use joins to traverse the normalized data schema, and this acts as a very effective prior for your model. For example, you can learn parameters associated to shirt colors by having a parameter in the color table, and joining it with the shirt sales table (through the shirts table). And knowledge of the table structure also lets you improve convergence, for example by adding a factor to compensate for the fact that not all colors have the same number of sales (and therefore have been updated fewer times during each epoch). Here&#x27;s a paper from last year: <a href="https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2209.03771v1" rel="nofollow">https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2209.03771v1</a><p>Automatic differentiation is an absolute must-have. Once you start having a few complex joins and aggregations, it&#x27;s too hard to derive the gradient by hand. Doing automatic differentiation on relational queries requires some adjustments (in particular, because you cannot use a tape), so you end up having to define a subset of relational algebra that is closed by automatic differentiation. The general ideas were presented in <a href="https:&#x2F;&#x2F;ceur-ws.org&#x2F;Vol-2971&#x2F;paper07.pdf" rel="nofollow">https:&#x2F;&#x2F;ceur-ws.org&#x2F;Vol-2971&#x2F;paper07.pdf</a><p>On the other hand, while it&#x27;s possible to do SQL-to-SQL automatic differentiation, the resulting queries have poor performance, so it&#x27;s better to go one step lower (at the level of the executor for an already-planned query) and perform the automatic differentiation there. And we had an excellent intern work with us on dedicated parallelization for gradient descent queries: <a href="https:&#x2F;&#x2F;blog.lokad.com&#x2F;pdf&#x2F;reproducible-parallel-sgd-ziyad-benomar-2022-09.pdf" rel="nofollow">https:&#x2F;&#x2F;blog.lokad.com&#x2F;pdf&#x2F;reproducible-parallel-sgd-ziyad-b...</a><p>Finally, in practice, it usually doesn&#x27;t matter if the gradient descent is stochastic, so long as you run several passes over the data.<p>I&#x27;m very interested in differentiable programming on relational languages, as the mainstream research concentrates on unstructured, low-information-density data (images&#x2F;sound&#x2F;text) when there are many domains where the data is more dense and structured, and (because of that structure) is stored in a database.
college_physicsabout 2 years ago
Is there a risk that postgres becomes sentient in our lifetime?
gigatexalabout 2 years ago
I thought I was decent at SQL until I saw this. Nice article!
saydaarkabout 2 years ago
This is interesting, but is not using the GPU.
freilanzerabout 2 years ago
Or use MindsDB: <a href="https:&#x2F;&#x2F;mindsdb.com&#x2F;">https:&#x2F;&#x2F;mindsdb.com&#x2F;</a>
Narewabout 2 years ago
There was also this neural network module in redis if you want to do training and inference in a redid DB. (quite old btw) <a href="https:&#x2F;&#x2F;github.com&#x2F;antirez&#x2F;neural-redis">https:&#x2F;&#x2F;github.com&#x2F;antirez&#x2F;neural-redis</a>
Rickasaurusabout 2 years ago
Please don&#x27;t do this in prod
评论 #35057343 未加载
Foobar8568about 2 years ago
R has been natively supported in sql server for a long while (At least since 2017), same for Java or python, e.g. Writing SP in these languages.
simonwabout 2 years ago
I tried to replicate this in SQLite. The first few steps worked OK, e.g.<p><a href="https:&#x2F;&#x2F;lite.datasette.io&#x2F;?json=https:&#x2F;&#x2F;gist.github.com&#x2F;simonw&#x2F;f750e08b33851a5ac8786d0b717a0152#&#x2F;data?sql=WITH+RECURSIVE%0A++++stream+AS+%28%0A++++++++SELECT+ROW_NUMBER%28%29+OVER+%28%29+AS+step%2C+%22Adj+Close%22+AS+x%0A++++++++FROM+raw%0A++++++++ORDER+BY+step%0A++++%29%2C%0A++++state%28step%2C+x%2C+avg%29+AS+%28%0A++++++++--+Initialize%0A++++++++SELECT+step%2C+x%2C+x+AS+avg%0A++++++++FROM+stream%0A++++++++WHERE+step+%3D+1%0A++++++++UNION+ALL%0A++++++++--+Update%0A++++++++SELECT%0A++++++++++++stream.step%2C%0A++++++++++++stream.x%2C%0A++++++++++++state.avg+%2B+%28stream.x+-+state.avg%29+%2F+stream.step+AS+avg%0A++++++++FROM+stream%0A++++++++INNER+JOIN+state+ON+state.step+%2B+1+%3D+stream.step%0A++++%29%0A%0ASELECT+*%0AFROM+state%0AORDER+BY+step+DESC%0ALIMIT+5" rel="nofollow">https:&#x2F;&#x2F;lite.datasette.io&#x2F;?json=https:&#x2F;&#x2F;gist.github.com&#x2F;simo...</a><p>(I replaced &quot;figures&quot; with &quot;raw&quot; due to the way Datasette Lite assigns a default table name to the imported JSON)<p>But the more complex recursive queries gave me this error and I&#x27;m not sure how to work around it:<p><pre><code> recursive reference in a subquery: state </code></pre> E.g. <a href="https:&#x2F;&#x2F;lite.datasette.io&#x2F;?json=https:&#x2F;&#x2F;gist.github.com&#x2F;simonw&#x2F;f750e08b33851a5ac8786d0b717a0152#&#x2F;data?sql=WITH+RECURSIVE%0A++++stream+AS+%28%0A++++++++SELECT%0A++++++++++++ROW_NUMBER%28%29+OVER+%28%29+AS+step%2C%0A++++++++++++%22Adj+Close%22+AS+x%2C%0A++++++++++++%22Close%22+AS+y%0A++++++++FROM+raw%0A++++%29%2C%0A++++state%28step%2C+x%2C+x_avg%2C+y%2C+y_avg%2C+cov%29+AS+%28%0A++++++++--+Initialize%0A++++++++SELECT%0A++++++++++++step%2C%0A++++++++++++x%2C%0A++++++++++++x+AS+x_avg%2C%0A++++++++++++y%2C%0A++++++++++++y+AS+y_avg%2C%0A++++++++++++0.0+AS+cov%0A++++++++FROM+stream%0A++++++++WHERE+step+%3D+1%0A++++++++UNION+ALL%0A++++++++--+Update%0A++++++++SELECT%0A++++++++++++step%2C%0A++++++++++++x%2C%0A++++++++++++x_new_avg+AS+x_avg%2C%0A++++++++++++y%2C%0A++++++++++++y_new_avg+AS+y_avg%2C%0A++++++++++++cov+%2B+%28%28x+-+x_prev_avg%29+*+%28y+-+y_new_avg%29+-+cov%29+%2F+step+AS+cov%0A++++++++FROM+%28%0A++++++++++++SELECT%0A++++++++++++++++stream.step%2C%0A++++++++++++++++stream.x%2C%0A++++++++++++++++stream.y%2C%0A++++++++++++++++state.x_avg+AS+x_prev_avg%2C%0A++++++++++++++++state.x_avg+%2B+%28stream.x+-+state.x_avg%29+%2F+stream.step+AS+x_new_avg%2C%0A++++++++++++++++state.y_avg+AS+y_prev_avg%2C%0A++++++++++++++++state.y_avg+%2B+%28stream.y+-+state.y_avg%29+%2F+stream.step+AS+y_new_avg%2C%0A++++++++++++++++state.cov%0A++++++++++++FROM+stream%0A++++++++++++INNER+JOIN+state+ON+state.step+%2B+1+%3D+stream.step%0A++++++++%29%0A++++%29%0A%0ASELECT+step%2C+cov%0AFROM+state%0AORDER+BY+step+DESC%0ALIMIT+5&amp;DOUBLE=" rel="nofollow">https:&#x2F;&#x2F;lite.datasette.io&#x2F;?json=https:&#x2F;&#x2F;gist.github.com&#x2F;simo...</a>
评论 #35060106 未加载
hubjaabout 2 years ago
Super intresting! Love the scrappy mindset ;)
评论 #35055223 未加载
PaulHouleabout 2 years ago
Reminds me of the time I coded up an optimized nearest neighbor search in SQL.
yzaabout 2 years ago
Next step would be to implement autodiff. If only PostgreSQL had CUDA support.
rlewkovabout 2 years ago
Uhhhh, ya know that Python has a library for that :-)
kunalguptaabout 2 years ago
god tier
kilgnadabout 2 years ago
An even MORE useful idea is to do this in CSS.
snookerdookerabout 2 years ago
I challenge you to add more emojis to your bio ;)
评论 #35055328 未加载