Direct Preference Optimization vs. RLHF

37 points · by summarity · 5 days ago

1 comment

Genego · 2 days ago
I was building a multi-agent system connected to Telegram. One agent synthesises a response through 5+ other agents. Initially I was tweaking the system through my IDE, making small adjustments to prompts to ensure that patterns and workflows were followed better. But I also started to interact while on the road, or just from bed, and I got very frustrated seeing some multi-step / multi-agent interactions go completely wrong, so I built in an additional architecting agent, which can make adjustments to the agents' prompts (in terms of the execution logic of tool calls) on the fly.

So if I saw something go wrong, I would say: "Next time don't do that, please do this instead" - the architect agent then reviews the entire tool and agent call chain and makes a new adaptation for each agent (if necessary).

I was calling this "Poor man's RLHF" - it has been quite fun to interact with. I ended up storing the adaptations in a JSON file that I could later (potentially) use for finetuning. But I was always wondering if there is a name for this - is it similar to DPO? I called it "behavioral adaptation". For a small system it was quite effective, but I also didn't bother to research it.
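A minimal sketch of the feedback loop described above, under stated assumptions: the agent registry, the adaptations.json log path, and the call_llm helper are all hypothetical stand-ins (the comment does not name the model API or Telegram wiring), and the "architect" step is reduced to one review prompt per agent.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical log of prompt adaptations, kept around for possible finetuning later.
ADAPTATION_LOG = Path("adaptations.json")


@dataclass
class Agent:
    name: str
    system_prompt: str
    history: list = field(default_factory=list)  # recent tool/agent calls, for review


def call_llm(prompt: str) -> str:
    """Placeholder for whatever model API the system actually uses (assumption)."""
    raise NotImplementedError


def architect_review(feedback: str, agents: list[Agent]) -> None:
    """'Poor man's RLHF': fold user feedback back into each agent's prompt."""
    for agent in agents:
        review_prompt = (
            f"User feedback: {feedback}\n"
            f"Agent '{agent.name}' current prompt:\n{agent.system_prompt}\n"
            f"Recent call chain: {agent.history}\n"
            "If this agent contributed to the problem, return a revised prompt; "
            "otherwise return the prompt unchanged."
        )
        revised = call_llm(review_prompt)
        if revised.strip() != agent.system_prompt.strip():
            # Record the adaptation so the (feedback, old, new) triples can be reused later.
            record = {
                "agent": agent.name,
                "feedback": feedback,
                "old_prompt": agent.system_prompt,
                "new_prompt": revised,
            }
            existing = (
                json.loads(ADAPTATION_LOG.read_text()) if ADAPTATION_LOG.exists() else []
            )
            existing.append(record)
            ADAPTATION_LOG.write_text(json.dumps(existing, indent=2))
            # The adjustment takes effect on the next multi-agent run.
            agent.system_prompt = revised
```

Usage would be a single call per correction, e.g. architect_review("Next time don't do that, please do this instead", agents); whether the logged pairs are actually suitable for DPO-style preference training is exactly the open question in the comment.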