I’ve been tasked with building an application that converts high-level natural language queries into meaningful insights from a database. The database consists of multiple tables that can be joined on a transaction ID. My current approach is to use GPT-4o to generate Python code from the user’s query, then execute that code locally to produce the result (records or charts, depending on the query).
While the approach mostly works, I hit an issue during a POC demo. One of the example queries produced incorrect results: the model used pd.merge but effectively performed a cross join, inflating the record count from 50k to 5 million. Additionally, when queries are too high-level, the generated code often doesn’t yield the right results unless I explicitly specify which tables to use, which ideally shouldn’t be necessary.<p>To guide the model, I prepend a summary of the database schema, explaining each table and column, along with the user’s query. However, this doesn’t always lead to accurate Python translations.<p>Any ideas on how to better approach the problem? This tool is meant to be used by high-level execs with little to no CS knowledge. Also, I’m restricted to OpenAI models for this.
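One cheap guardrail for the cross-join failure mode: pandas lets you assert the expected join cardinality via the <i>validate</i> argument of pd.merge, which raises a MergeError instead of silently multiplying rows. A minimal sketch (the table and column names here are made up for illustration):

```python
import pandas as pd

# Hypothetical tables keyed on a shared transaction ID.
orders = pd.DataFrame({"txn_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
items = pd.DataFrame({"txn_id": [1, 1, 2], "sku": ["A", "B", "C"]})

# validate="one_to_many" asserts txn_id is unique on the left side; an
# accidental many-to-many (cross-join-like) merge raises MergeError
# instead of silently inflating the row count.
merged = pd.merge(orders, items, on="txn_id", how="left",
                  validate="one_to_many")
print(len(merged))  # 4 rows: txn 1 matches twice, txn 2 once, txn 3 unmatched
```

You could instruct the model (via the system prompt) to always pass validate= on merges, or inject the check in a post-processing step before executing the generated code.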
There is a way to "make your own GPT" (<a href="https://help.openai.com/en/articles/8554397-creating-a-gpt" rel="nofollow">https://help.openai.com/en/articles/8554397-creating-a-gpt</a>) by feeding ChatGPT example questions and answers. If you include the cases it is currently getting wrong, that should help.
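The same idea works via the API as plain few-shot prompting: put worked question/code pairs (including previously-failed cases) ahead of the real question. A sketch assuming the OpenAI Python SDK; the schema string and example code are placeholders, not the poster's actual setup:

```python
# Hypothetical one-line schema summary injected into the system prompt.
SCHEMA_SUMMARY = "orders(txn_id, amount), items(txn_id, sku)"

messages = [
    {"role": "system", "content": (
        "You translate analyst questions into pandas code.\n"
        "Schema: " + SCHEMA_SUMMARY + "\n"
        "Always join tables on txn_id and pass validate= to pd.merge."
    )},
    # Few-shot example: a question the model previously mishandled,
    # paired with the code you actually want it to produce.
    {"role": "user", "content": "Total amount per SKU?"},
    {"role": "assistant", "content": (
        "merged = orders.merge(items, on='txn_id', validate='one_to_many')\n"
        "result = merged.groupby('sku')['amount'].sum()"
    )},
    # The real user question goes last.
    {"role": "user", "content": "Which transactions had no items?"},
]

# With the OpenAI SDK this would be sent as, e.g.:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4o", messages=messages)
```

Curating the example set from failed demo queries tends to pay off more than expanding the schema description alone.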