
Show HN: Plexe – ML Models from a Prompt

124 points · by vaibhavdubey97 · 2 days ago
Hey HN! We're Vaibhav and Marcello. We're building Plexe (https://github.com/plexe-ai/plexe), an open-source agent that turns natural language task descriptions into trained ML models. Here's a video walkthrough: https://www.youtube.com/watch?v=bUwCSglhcXY

There are all kinds of uses for ML models that never get realized because the process of making them is messy and convoluted. You can spend months trying to find the data, clean it, experiment with models and deploy to production, only to find out that your project has been binned for taking so long. There are many tools for "automating" ML, but it still takes teams of ML experts to actually productionize something of value. And we can't keep throwing LLMs at every ML problem. Why use a generic 10B-parameter language model if a logistic regression trained on your data could do the job better?

Our light-bulb moment was that we could use LLMs to generate task-specific ML models that would be trained on one's own data. Thanks to the emergent reasoning ability of LLMs, it is now possible to create an agentic system that might automate most of the ML lifecycle.

A couple of months ago, we started developing a Python library that would let you define ML models on structured data using a description of the expected behaviour. Our initial implementation arranged potential solutions into a graph, using LLMs to write plans, implement them as code, and run the resulting training script. Using simple search algorithms, the system traversed the solution space to identify and package the best model.

However, we ran into several limitations, as the algorithm proved brittle under edge cases, and we kept having to patch every minor issue in the training process. We decided to rethink the approach, throw everything out, and rebuild the tool using an agentic approach prioritising generality and flexibility. What started as a single ML engineering agent turned into an agentic ML "team", with all experiments tracked and logged using MLflow.

Our current implementation uses the smolagents library to define an agent hierarchy. We mapped the functionality of our previous implementation to a set of specialized agents, such as an "ML scientist" that proposes solution plans, and so on. Each agent has specialized tools, instructions, and prompt templates. To facilitate cross-agent communication, we implemented a shared memory that enables objects (datasets, code snippets, etc.) to be passed across agents indirectly by referencing keys in a registry. You can find a detailed write-up on how it works here: https://github.com/plexe-ai/plexe/blob/main/docs/architecture/multi-agent-system.md

Plexe's early release is focused on predictive problems over structured data, and can be used to build models such as forecasting player injury risk in high-intensity sports, product recommendations for an e-commerce marketplace, or predicting technical indicators for algorithmic trading.
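The manager/specialist hierarchy described a few paragraphs up maps onto smolagents' managed-agent pattern. Below is a minimal sketch of that wiring; the agent names, empty tool lists, and model backend are illustrative assumptions, not Plexe's actual implementation, and constructor details vary between smolagents versions.

```python
# A minimal sketch of a manager/specialist agent hierarchy in smolagents.
# Agent names, tools, and the model backend are illustrative assumptions,
# not Plexe's actual code.
from smolagents import CodeAgent, ToolCallingAgent, LiteLLMModel

llm = LiteLLMModel(model_id="openai/gpt-4o-mini")  # any supported backend

# Specialist that proposes solution plans (the "ML scientist" role).
scientist = ToolCallingAgent(
    tools=[],
    model=llm,
    name="ml_scientist",
    description="Proposes modelling approaches for a given dataset and task.",
)

# Specialist that turns a plan into a training script and runs it.
engineer = CodeAgent(
    tools=[],
    model=llm,
    name="ml_engineer",
    description="Implements a proposed plan as code and executes the training run.",
)

# Orchestrator that delegates sub-tasks to the named specialists.
manager = CodeAgent(tools=[], model=llm, managed_agents=[scientist, engineer])
manager.run("Train a model that predicts churn from the customers.csv dataset.")
```

In this pattern, the shared object registry the authors describe would sit alongside the agents, letting them hand each other datasets and code snippets by key rather than by value.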
Here are some examples to get you started: https://github.com/plexe-ai/plexe/tree/main/examples

To get it working on your data, you can dump in any CSV, parquet, etc., and Plexe uses what it needs from your dataset to figure out which features it should use. The open-source tool only supports adding files right now, but our platform version will support integrating with Postgres, pulling all available data based on a SQL query and dumping it into a parquet file for the agent to build models from.

Next up, we'll be tackling more of the ML project lifecycle: we're currently working on adding a "feature engineering agent" that focuses on the complex data transformations that are often required for data to be ready for model training. If you're interested, check Plexe out and let us know your thoughts!
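To give a feel for what "a model from a prompt plus a CSV" might look like in code, here is a hypothetical usage sketch. The pandas part is standard; the plexe calls (plexe.Model, intent=..., build, predict) are assumptions for illustration and may not match the library's real interface, so check the README and examples linked above.

```python
# Hypothetical usage sketch. The plexe API below (Model, intent=..., build,
# predict) is assumed for illustration and may differ from the real library;
# see the README and examples linked in the post.
import pandas as pd
import plexe  # assumption: importable after installing the package

# Any structured dataset, as described in the post (CSV, parquet, ...).
df = pd.read_csv("players.csv")

# Describe the expected behaviour in natural language; the agent "team"
# works out features, model family, and training code.
model = plexe.Model(intent="Predict each player's injury risk over the next month")
model.build(datasets=[df])

# Query the packaged model with a new record.
print(model.predict({"age": 27, "minutes_played": 2300, "previous_injuries": 1}))
```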

14 comments

Stiopa · 2 days ago
Awesome work.

I've only watched the demo, but judging from the fact that there are several agent-decided steps in the model-generation process, I think it'd be useful for Plexe to ask the user in between whether they're happy with the plan for the next steps, so it's more interactive and not just a single, large one-shot.

E.g. telling the user which features the model plans to use, and letting the user request changes before that step is executed.

I also wanted to ask how you plan to scale to more advanced (case-specific) models? I see this as a quick and easy way to get the more trivial models working, especially for less ML-experienced people, but I'm curious what would change for more complicated models or demanding users?
thefourthchime · 2 days ago
This is a really interesting idea! I'll be honest, it took me a minute to really get what it was doing. The video on the GitHub page plays without any audio, so it's not clear what's happening.

Once I watched the video, I think I have a better understanding. One thing I would like to see is more of a breakdown of how this solves a problem that a big model by itself wouldn't.
Oras · 2 days ago
I like the idea of trying multiple solutions.

Does it decide, based on the data, whether it should build its own ML model or fine-tune a relevant one?

Also, does it detect issues with the training data? When I was building NLP models before LLMs, the tasks that took all my time were related to data cleaning, not training or choosing the right approach.
dweinus · 2 days ago
I don't want to hate; what you built is really cool and should save time in a data scientist's workflow, but... we did this. It won't "automate most of the ML lifecycle." Back in ~2018, "AutoML" was all the rage. It failed because creating boilerplate and training models are not the hard parts of ML. The hard parts are evaluating data quality, seeking out new data, designing features, making appropriate choices to prevent leakage, designing evaluation appropriate to the business problem, and knowing how all of this will interact with the model design choices.
fzysingularity · 2 days ago
Is there a benchmark or eval showing why this might be a better approach than actually modeling the problem? If you're selling this to a non-ML person, I get the draw. But you'd still have to show why using these LLMs would be better than training something simpler / more lightweight.

That said, it's likely that you'll get good zero-shot performance, so the model-building phase could benefit from fine-tuning the prompt given the dataset, instead of training the underlying model itself.
marinr1 · 1 day ago
Hey, this is very cool. I work at a bank and we are starting to look at something like this, mainly to automate boilerplate code for experimentation and model training; however, we are a GCP shop. I might play with this over the weekend to see if I can add support for vertex.ai experiments.

Have you thought about extending this to cover the model development lifecycle, and perhaps having agents to help with EDA, model selection, explanation, and feature engineering? This is where we are seeing a lot of demand from users as well, but we are starting out with experiment / pipeline / serving boilerplate.
vessenes · 2 days ago
I like this a lot, thank you for building it.

Any review of smolagents? This combination-of-agents approach seems likely to be really useful in a lot of places, and I'm wondering if you liked it, loved it, hated it, …
gitroom · 1 day ago
Well, I actually like it when folks push old-school ML instead of just LLM stuff everywhere. Makes me feel like we're not losing the basics.
revskill · 2 days ago
Instead of "Attention is all we need", I expect an "Intention is all we need".
drlobster · 2 days ago
That's great. Is there any way to make it part of a scikit-learn-compatible pipeline?
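In principle, any trained model exposing a predict-like call can be wrapped behind scikit-learn's estimator interface so it slots into a Pipeline. The sketch below assumes a hypothetical predict(dict) interface on the trained model; it is not Plexe's documented API.

```python
# Sketch of a scikit-learn-compatible wrapper around an already-trained model.
# `trained_model.predict(row_dict)` is an assumed interface, not Plexe's
# documented API.
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class PromptModelWrapper(BaseEstimator, ClassifierMixin):
    def __init__(self, trained_model, feature_names):
        self.trained_model = trained_model
        self.feature_names = feature_names

    def fit(self, X, y=None):
        # The agent has already trained the underlying model; nothing to fit here.
        return self

    def predict(self, X):
        # Convert each row back into the column-name -> value mapping the
        # wrapped model expects, then collect its predictions.
        rows = [dict(zip(self.feature_names, row)) for row in np.asarray(X)]
        return np.array([self.trained_model.predict(row) for row in rows])

# Usage, e.g.:
#   Pipeline([("scale", StandardScaler()), ("model", PromptModelWrapper(m, cols))])
```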
MarcoDewey · 2 days ago
I love that you all are doing real old-school machine learning and not just LLM/transformer-based work!
srameshc · 2 days ago
I'm just trying to understand, and this is an honest question: are we getting a fine-tuned model from the dataset?
yu3zhou4 · 2 days ago
Nice execution! I built a simpler version of this a year ago: https://github.com/jmaczan/csv-to-ml. I hope you succeed with the product and push AutoML forward.
ratatoskrt · 2 days ago
In my experience, humans are really bad at statistics, and LLMs are even worse because they basically just mimic all the typical mistakes people make.