An open source DuckDB text to SQL LLM

131 points by vgt, over 1 year ago

6 comments

swimwiththebeat, over 1 year ago
I see so many business leaders touting the promise of LLMs allowing business to "talk" to their data. The promise does sound enticing, but it's actually kind of hard to get working in practice.

A lot of our databases at work have columns with custom types and enums, and getting the LLM (Llama2) to write SQL queries to robustly answer natural language questions about the data is tough. It requires a lot of instruction prompting, context, and question-SQL examples (few-shot learning), and it still fails in unexpected ways. It's a tough ask for people to use a tool like this if they can't trust the results all the time. It's also a bit infeasible to scale this to tens or hundreds of tables across our data warehouse.

It's great that a lot of people are trying to crack this problem, I'm curious to try this model out. I'd also love to see if other people have tried solving this problem and made any headway.
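For readers unfamiliar with the setup described above, here is a minimal sketch of how instructions, schema context, and question-SQL pairs might be assembled into one few-shot prompt. Everything in it is an assumption made for illustration: the `FewShotExample` and `build_prompt` names, the `###` section markers, and the `orders` schema are hypothetical, and this is not the prompt format of Llama2 or of the model under discussion.

```python
from dataclasses import dataclass


@dataclass
class FewShotExample:
    """One hypothetical question-SQL pair shown to the model."""
    question: str
    sql: str


def build_prompt(schema_ddl: str, examples: list[FewShotExample], question: str) -> str:
    """Assemble instructions, schema context, and question-SQL pairs into one prompt."""
    parts = [
        "You translate questions into SQL. Use only the tables, columns, "
        "and enum values listed in the schema.",
        "### Schema",
        schema_ddl.strip(),
    ]
    for ex in examples:
        parts.append(f"### Question\n{ex.question}\n### SQL\n{ex.sql}")
    # Leave the final SQL slot empty for the model to complete.
    parts.append(f"### Question\n{question}\n### SQL")
    return "\n\n".join(parts)


# Illustrative schema with a custom enum type (assumed, not from the thread).
SCHEMA = """
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'returned');
CREATE TABLE orders (id INTEGER, customer_id INTEGER, status order_status);
"""

EXAMPLES = [
    FewShotExample(
        question="How many orders have shipped?",
        sql="SELECT count(*) FROM orders WHERE status = 'shipped';",
    ),
]

prompt = build_prompt(SCHEMA, EXAMPLES, "How many orders were returned?")
print(prompt)
```

Putting the enum definition in the schema block is one way to expose the allowed values to the model. The prompt grows with every table and example added, which is consistent with the scaling concern raised in the comment.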
vgt, over 1 year ago
Co-founder and Head of Produck at MotherDuck here, happy to answer any questions or go nag the amazing engineers [0] who worked on this :)

[0] https://news.ycombinator.com/user?id=tdoehmen
datadrivenangel, over 1 year ago
The core issue of text to SQL is that your data has to be good for the generated queries to be correct. The queries may run and return good looking results, but if the data requires domain knowledge ("Don't count people in the customer table without filtering out records with the test flag in the customer attributes table and at least one order in the orders table") you'll get results that don't actually answer your question.
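A small sketch of the failure mode described above: a query that runs and looks plausible but ignores a domain rule. It uses Python's built-in `sqlite3` in place of DuckDB so that it runs without extra packages. The table layout, the `is_test` column, and the sample rows are all assumptions, and the rule is read here as "count only customers that are not flagged as test and have at least one order", which is one interpretation of the quoted sentence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customer_attributes (customer_id INTEGER, is_test INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customer VALUES (1, 'Ada'), (2, 'Bob'), (3, 'QA account'), (4, 'Cy');
INSERT INTO customer_attributes VALUES (1, 0), (2, 0), (3, 1), (4, 0);
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# What a model might plausibly generate for "How many customers do we have?"
naive_sql = "SELECT count(*) FROM customer"

# The same question with the (assumed) domain rule applied:
# skip test-flagged records and require at least one order.
domain_sql = """
SELECT count(*)
FROM customer AS c
JOIN customer_attributes AS a ON a.customer_id = c.id
WHERE a.is_test = 0
  AND EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
"""

naive_count = conn.execute(naive_sql).fetchone()[0]
domain_count = conn.execute(domain_sql).fetchone()[0]
print(naive_count, domain_count)  # prints "4 1"
```

Both queries are valid SQL and both return a single number, so nothing in the output signals that the first one answers a different question.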
b_mc2, over 1 year ago
This is awesome, congratulations. I'm glad to see some text-to-sql models being created. Shameless plug: I also just realized you used NSText2SQL[1] which itself contains my text-to-sql dataset, sql-create-context[2], so I'm honored. I used sqlglot pretty heavily on it as well.

Do you think a 3B model might also be in the future, or something small enough that can be loaded up in Transformers.js?

[1] https://huggingface.co/datasets/NumbersStation/NSText2SQL

[2] https://huggingface.co/datasets/b-mc2/sql-create-context
CastFX, over 1 year ago
I'd love to see how it performs in some benchmarks, specifically against Spider (https://yale-lily.github.io/spider) and BIRD (https://bird-bench.github.io/)
aldarisbm, over 1 year ago
looks great, most text-to-sql attempts i’ve tried fall short, hoping this is different