I’ve been tasked with building an application that converts high-level natural language queries into meaningful insights from a database. The database consists of multiple tables that can be joined on a transaction ID. My current approach is to use GPT-4o to generate Python code from the user’s query, then execute that code locally to produce the result (records or charts, depending on the query).
While the approach mostly works, I hit an issue during a POC demo. One of the example queries produced incorrect results: the model used pd.merge but effectively performed a cross join, inflating the record count from 50k to 5 million. Additionally, when queries are too high-level, the generated code often doesn’t yield the right results unless I explicitly specify which tables to use, which ideally shouldn’t be necessary.<p>To guide the model, I prepend a summary of the database schema, explaining each table and column, along with the user’s query. However, this doesn’t always lead to accurate Python translations.<p>Any ideas on how to better approach the problem? This tool is meant to be used by high-level execs with little to no CS knowledge. Also, I’m restricted to OpenAI models for this.
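One cheap guardrail for the cross-join failure mode: pandas lets you assert the expected join cardinality via the <i>validate</i> argument of pd.merge, which raises a MergeError instead of silently multiplying rows. A minimal sketch (the table and column names here are made up for illustration):

```python
import pandas as pd

# Hypothetical tables keyed on a shared transaction ID.
orders = pd.DataFrame({"txn_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
items = pd.DataFrame({"txn_id": [1, 1, 2], "sku": ["A", "B", "C"]})

# validate="one_to_many" asserts txn_id is unique on the left side; an
# accidental many-to-many (cross-join-like) merge raises MergeError
# instead of silently inflating the row count.
merged = pd.merge(orders, items, on="txn_id", how="left",
                  validate="one_to_many")
print(len(merged))  # 4 rows: txn 1 matches twice, txn 2 once, txn 3 unmatched
```

You could instruct the model (via the system prompt) to always pass validate= on merges, or inject the check in a post-processing step before executing the generated code.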
There is a way to "make your own GPT" (<a href="https://help.openai.com/en/articles/8554397-creating-a-gpt" rel="nofollow">https://help.openai.com/en/articles/8554397-creating-a-gpt</a>) by feeding ChatGPT example questions and answers. If you include the cases it is currently getting wrong, that should help.
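The same idea works via the API as plain few-shot prompting: put worked question/code pairs (including previously-failed cases) ahead of the real question. A sketch assuming the OpenAI Python SDK; the schema string and example code are placeholders, not the poster's actual setup:

```python
# Hypothetical one-line schema summary injected into the system prompt.
SCHEMA_SUMMARY = "orders(txn_id, amount), items(txn_id, sku)"

messages = [
    {"role": "system", "content": (
        "You translate analyst questions into pandas code.\n"
        "Schema: " + SCHEMA_SUMMARY + "\n"
        "Always join tables on txn_id and pass validate= to pd.merge."
    )},
    # Few-shot example: a question the model previously mishandled,
    # paired with the code you actually want it to produce.
    {"role": "user", "content": "Total amount per SKU?"},
    {"role": "assistant", "content": (
        "merged = orders.merge(items, on='txn_id', validate='one_to_many')\n"
        "result = merged.groupby('sku')['amount'].sum()"
    )},
    # The real user question goes last.
    {"role": "user", "content": "Which transactions had no items?"},
]

# With the OpenAI SDK this would be sent as, e.g.:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4o", messages=messages)
```

Curating the example set from failed demo queries tends to pay off more than expanding the schema description alone.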