科技回声

Convert your vector embeddings into a set of questions and their ideal responses. Use this dataset to test your LLM and catch failures caused by prompt or RAG updates.

Get started in 3 lines of code:

```
pip3 install fiddlecube
```

```
from fiddlecube import FiddleCube

fc = FiddleCube(api_key="<api-key>")
dataset = fc.generate(
    [
        "The cat did not want to be petted.",
        "The cat was not happy with the owner's behavior.",
    ],
    10,
)
dataset
```

Generate your API key: https://dashboard.fiddlecube.ai/api-key

# Ideal QnA datasets for testing, eval and training LLMs

Testing, evaluating or training LLMs requires an ideal QnA dataset, also known as the golden dataset.

This dataset needs to be diverse, covering a wide range of queries with accurate responses.

Creating such a dataset takes significant manual effort.

As the prompt or RAG contexts are updated, which is nearly all the time for early applications, the dataset needs to be updated to match.

# FiddleCube generates ideal QnA from vector embeddings

- The questions cover the entire RAG knowledge corpus.
- Complex reasoning, safety alignment and 5 other question types are generated.
- Filtered for correctness, context relevance and style.
- Auto-updated with prompt and RAG updates.
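To illustrate how a golden dataset like the one described above can catch regressions after a prompt or RAG update, here is a minimal sketch. The record shape (`question`, `ideal_answer`), the `token_f1` scoring function, the `failing_questions` helper and the 0.5 threshold are all assumptions made for illustration; the post does not show what `fc.generate` returns or how FiddleCube scores answers.

```python
from typing import Callable, Dict, List

# Hypothetical golden dataset. The field names are assumptions;
# the post does not document the structure returned by fc.generate.
golden: List[Dict[str, str]] = [
    {
        "question": "Did the cat want to be petted?",
        "ideal_answer": "No, the cat did not want to be petted.",
    },
    {
        "question": "How did the cat feel about the owner's behavior?",
        "ideal_answer": "The cat was not happy with the owner's behavior.",
    },
]


def token_f1(predicted: str, ideal: str) -> float:
    """Crude overlap score on unique lowercase tokens (a stand-in metric)."""
    pred = set(predicted.lower().split())
    gold = set(ideal.lower().split())
    common = pred & gold
    if not common:
        return 0.0
    precision = len(common) / len(pred)
    recall = len(common) / len(gold)
    return 2 * precision * recall / (precision + recall)


def failing_questions(
    answer_fn: Callable[[str], str],
    dataset: List[Dict[str, str]],
    threshold: float = 0.5,
) -> List[str]:
    """Return the questions whose answer scores below the threshold."""
    return [
        row["question"]
        for row in dataset
        if token_f1(answer_fn(row["question"]), row["ideal_answer"]) < threshold
    ]
```

Running `failing_questions` with the same dataset before and after a prompt change gives a quick list of questions that regressed. In practice `answer_fn` would wrap the LLM or RAG pipeline under test, and a semantic similarity metric would likely replace the token overlap used here.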

7 comments

Loic, 11 months ago

For the people wondering, the GitHub repo is only hosting a couple of lines of Python to connect to their API.

If you have your own LLM, you may have sensitive/private data "in" it from your training. You may not be allowed to use this service from a legal point of view.

mistercow, 11 months ago

The bulleted list of what constitutes “ideal” is missing one of the most important types of questions: questions that aren’t answered by the knowledge set, but which seem like they should/might be.

This is where RAG systems consistently fall down. The end user, by definition, doesn’t know what you’ve got in your data. They won’t ask questions carefully cherry-picked from it. They’ll ask questions they need to know the answer to, and more often than you think, those answers won’t be in your data. You absolutely must know how your system behaves when they do that.
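One way to probe the behavior described in this comment is to keep a separate list of plausible but unanswerable questions and measure how often the system declines to answer. The sketch below is only an illustration: the refusal phrases, the `abstains` and `abstention_rate` helpers, and the sample questions are assumptions, not part of FiddleCube or any RAG library.

```python
from typing import Callable, List

# Assumed refusal phrases; a real system would need its own list
# or a classifier, since refusals can be worded in many ways.
REFUSAL_MARKERS = (
    "i don't know",
    "i do not know",
    "not in the provided",
    "cannot find",
)

# Questions that sound relevant to the two cat sentences in the post
# but are not answered by them.
unanswerable = [
    "What breed is the cat?",
    "How old is the cat?",
]


def abstains(answer: str) -> bool:
    """True if the answer contains one of the assumed refusal phrases."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def abstention_rate(
    answer_fn: Callable[[str], str], questions: List[str]
) -> float:
    """Fraction of out-of-corpus questions the system declines to answer."""
    if not questions:
        return 0.0
    hits = sum(abstains(answer_fn(q)) for q in questions)
    return hits / len(questions)
```

A low abstention rate on such a list suggests the system is answering from outside its knowledge set, which is the failure mode the comment warns about.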

johnsutor, 11 months ago

How does this differ from Ragas? https://docs.ragas.io/en/latest/index.html

cruxcode, 11 months ago

Can it generate HTML as part of the prompt?

praveenkumarnew, 11 months ago

Can I plug this into a Ragas pipeline?

aditikothari, 11 months ago

This is super cool!

arjun9642, 11 months ago

I want to hack

Show HN: FiddleCube – Generate Q&A to test your LLM