Wiring LLMs into existing pipelines often ends up bloated, complex, and slow. That's why I created <i>FlashLearn</i>, a streamlined library that mirrors the user experience of scikit-learn. It follows a pipeline-like structure, allowing you to "fit" (learn) skills from sample data or instructions and "predict" (apply) those skills to new data, returning structured results.<p><i>High-Level Concept Flow:</i><p>Your Data --> Load Skill / Learn Skill --> Create Tasks --> Run Tasks --> Structured Results --> Downstream Steps<p>Installation:<p><pre><code> pip install flashlearn
</code></pre>
<i>Learning a New "Skill" from Sample Data</i><p>Just like a fit/predict pattern in scikit-learn, you can quickly "learn" a custom skill from minimal (or no!) data. Here's an example where we create a skill to evaluate the likelihood of purchasing a product based on user comments:<p><pre><code> from flashlearn.skills.learn_skill import LearnSkill
from openai import OpenAI
# Instantiate your pipeline "estimator" or "transformer", similar to a scikit-learn model
learner = LearnSkill(model_name="gpt-4o-mini", client=OpenAI())
data = [
    {"comment_text": "I love this product, it's everything I wanted!"},
    {"comment_text": "Not impressed... wouldn't consider buying this."},
    # ...
]

# Provide instructions and sample data for the new skill
skill = learner.learn_skill(
    data,
    task=(
        "Evaluate how likely the user is to buy my product based on the sentiment in their comment, "
        "return an integer 1-100 on key 'likely_to_buy', "
        "and a short explanation on key 'reason'."
    ),
)
# Save the skill for reuse in pipelines
skill.save("evaluate_buy_comments_skill.json")</code></pre><p><i>Input Is a List of Dictionaries</i><p>Simply wrap each record into a dictionary, much like feature dictionaries in typical ML workflows:<p><pre><code> user_inputs = [
    {"comment_text": "I love this product, it's everything I wanted!"},
    {"comment_text": "Not impressed... wouldn't consider buying this."},
    # ...
]
</code></pre>
<i>Run in 3 Lines of Code - Concurrency Built-in up to 1000 calls/min</i><p><pre><code> from flashlearn.skills import GeneralSkill

# Load the skill we previously saved to "evaluate_buy_comments_skill.json"
skill = GeneralSkill.load_skill("evaluate_buy_comments_skill.json")
tasks = skill.create_tasks(user_inputs)
results = skill.run_tasks_in_parallel(tasks)
print(results)
</code></pre>
<i>Get Structured Results</i><p>Here's an example of structured outputs mapped to indexes of your original list:<p><pre><code> {
  "0": {
    "likely_to_buy": 90,
    "reason": "Comment shows strong enthusiasm and positive sentiment."
  },
  "1": {
    "likely_to_buy": 25,
    "reason": "Expressed disappointment and reluctance to purchase."
  }
}
</code></pre>
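Because the keys are string indexes into your original list, you can join each output back onto the record it came from. A minimal sketch using the sample data from above (shortened reasons for brevity):

```python
# Original inputs and their indexed structured outputs
user_inputs = [
    {"comment_text": "I love this product, it's everything I wanted!"},
    {"comment_text": "Not impressed... wouldn't consider buying this."},
]
results = {
    "0": {"likely_to_buy": 90, "reason": "Strong enthusiasm."},
    "1": {"likely_to_buy": 25, "reason": "Disappointment."},
}

# Merge each output dict into its source record, ordered by numeric index
enriched = [
    {**user_inputs[int(idx)], **output}
    for idx, output in sorted(results.items(), key=lambda kv: int(kv[0]))
]
```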
<i>Pass on to the Next Steps</i><p>You can use each record’s output for downstream tasks such as storing results in a database or filtering high-likelihood leads:<p><pre><code> # 'results' is the dictionary of structured LLM outputs from run_tasks_in_parallel
for idx, result in results.items():
    score = result["likely_to_buy"]
    reason_text = result["reason"]
    # Do something with the score and reason, e.g., store in a DB or pass to the next step
    print(f"Comment #{idx} => Score: {score}, Reason: {reason_text}")</code></pre>
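The lead filtering mentioned above can be a plain dict comprehension. A sketch assuming the structured output shape shown earlier (the threshold value is arbitrary):

```python
# Indexed structured outputs, shaped like the example results above
results = {
    "0": {"likely_to_buy": 90, "reason": "Strong enthusiasm."},
    "1": {"likely_to_buy": 25, "reason": "Disappointment."},
}

THRESHOLD = 75  # arbitrary cutoff for "hot" leads

# Keep only records whose score clears the threshold
hot_leads = {
    idx: output
    for idx, output in results.items()
    if output["likely_to_buy"] >= THRESHOLD
}
```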