
Hugging Face Releases Agents

214 points, by mach1ne, about 2 years ago

17 comments

NumberWangMan, about 2 years ago

I'm not 100% sure that AGI is guaranteed to end humanity, as Yudkowsky argues, but if that's the course we're on, seeing news like this is depressing. Can anyone legitimately argue that LLMs are safe because they don't have agency, when we just straight up give them agency? I know current-generation LLMs aren't really dangerous -- but is this not likely to happen over and over again as our machine intelligences get smarter and smarter? *Someone* is going to give them the ability to affect the world. They won't even have to try to "get out of the box", because it'll have two sides missing.

I'm getting more and more on board with "shut it all down" being the only course of action, because it seems like humanity needs all the safety margin we can get, to account for the ease with which anyone can deploy stuff like this. It's not clear that alignment of a super-intelligence is even a solvable problem.
rahimnathwani, about 2 years ago

If you want an overview, scroll down to this part of the page: https://huggingface.co/docs/transformers/transformers_agents#whats-happening-here-what-are-tools-and-what-are-agents

In short:

- they've predefined a bunch of *tools* (e.g. image_generator)

- the *agent* is an LLM (e.g. GPT-*) which is prompted with the name and spec of each tool (the same each time) and the task(s) you want to perform

- the code generated by the agent is run by a Python interpreter that has access to these tools
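To make that flow concrete, here is a sketch of the kind of Python the agent LLM emits and the interpreter then runs with the predefined tools in scope. The tool names below are illustrative stand-ins, not necessarily the exact toolbox Hugging Face ships:

```python
# Illustrative only: agent-generated code never imports anything;
# the tools (here image_captioner and translator) are injected into
# the interpreter's scope by the framework before this runs.
caption = image_captioner(image)
translation = translator(caption, src_lang="English", tgt_lang="French")
print(translation)
```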
samstave, about 2 years ago

Asking for help from those that are smarter than I am:

One of the very common things in martial arts books of the past was that you were presented with a series of pics, along with some descriptions of what was being done in the pics.

Sometimes these are really hard to interpolate between frames, unless you had a much larger repertoire of movements based on experience (i.e. a white belt vs. a higher belt... e.g. a green belt will have better context of movement than a white belt).

So can this be used to interpolate frames and digest *lists*? (Lists are what many martial arts count as documentation for their various arts.) Many of these have been passed down via scrolls with either textual transmissions, paintings, and then finally pics before vids existed.

It would be really interesting to see if AI can interpret between images and/or scroll text to be able to create an animation of said movements.

For example, not only was Wally Jay one of my teachers, but as the inventor (re-discoverer) of Small Circle Jujitsu, his pics are hard to infer what is happening from, because there is a lot of nuanced feeling in each movement that is hard to convey via pics/text.

But if you can interpolate between frames and model the movements, it's game changing, because through such interpolations one can imagine getting any angle of viewership -- and additionally, one can have precise positioning and a translucent display of bone/joint/muscle articulation, so as to provide deeper insight into the kinematics behind each movement.
senko, about 2 years ago

I've been thinking lately about the two-tiered reasoner + tools architecture inspired by LangChain and simonw's writing[0], and this is right along those lines.

We're trying too hard to have one model do it all. If we coordinate multiple models + other tools (in the style of the ReAct pattern) we could make the systems more resistant to prompt injection (and possibly other) attacks, and leverage their respective strengths while compensating for their weaknesses.

I'm a bit wary of tool invocation via Python code instead of prompting the "reasoning" LLM to teach it about the special commands it can invoke. Python's a good crutch because LLMs know it reasonably well (I use a similar trick in my project, but I parse the resulting AST instead of running the untrusted code), so it's simpler to prompt them.

In a few iterations I expect to see LLMs fine-tuned to know about the standard toolset at their disposal (e.g. the Hugging Face default tools) and further refinement of the two-tiered pattern.

[0] https://simonwillison.net/2023/Apr/25/dual-llm-pattern/
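A minimal sketch of the AST approach senko describes: parse the model's output with Python's standard ast module and extract whitelisted tool calls, instead of exec'ing untrusted code. The tool names and overall shape here are assumptions for illustration, not senko's actual implementation:

```python
import ast

# Hypothetical whitelist; real names depend on the toolbox in use.
ALLOWED_TOOLS = {"image_generator", "image_captioner", "translator"}

def extract_tool_calls(llm_code: str) -> list[str]:
    """Parse untrusted LLM output and list the tool calls it makes,
    without ever executing it."""
    tree = ast.parse(llm_code)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in ALLOWED_TOOLS:
                raise ValueError(f"disallowed call: {node.func.id}")
            calls.append(node.func.id)
    return calls

print(extract_tool_calls("caption = image_captioner(image)"))
# ['image_captioner']
```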
abidlabs, about 2 years ago

Follow-up guide that explains how to create your own tools: https://huggingface.co/docs/transformers/custom_tools
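Roughly, a custom tool in that guide is a class with a name, a description (which ends up in the agent's prompt), and a __call__ method. The sketch below follows that shape, but the exact attribute names and the additional_tools argument should be checked against the guide; treat them as assumptions here:

```python
from transformers import HfAgent, Tool

class TextReverser(Tool):
    # The name and description are what the agent LLM sees in its prompt.
    name = "text_reverser"
    description = "Reverses the characters of the input text."
    inputs = ["text"]
    outputs = ["text"]

    def __call__(self, text: str) -> str:
        return text[::-1]

# Assumed registration mechanism per the custom_tools guide of that era:
agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[TextReverser()],
)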
ed, about 2 years ago

Cool! The DX is tricky to nail when combined with LLMs' tendency to hallucinate.

I asked it to extract some text from an image, which it dutifully tried to do. However, the generated Python kept throwing errors. There's no image -> text tool yet, so it was trying to use the image segmenter to generate a mask and somehow extract text from that.

It would be super helpful to:

1) Have a complete list of available tools (and/or a copy of the entire prompt given to the LLM responsible for generating Python). I used prompt injection to get a partial list of tools and checked the GitHub agent PR for the rest, but couldn't find `<<all_tools>>` since it gets generated at runtime (I think?).

2) Tell the LLM it's okay to fail. E.g.: "Extract the text from image `image`. If you are unable to do this using the tools provided, say so." This prompt let me know there's no tool for text extraction.

Update: per https://huggingface.co/docs/transformers/custom_tools you can output a full list of tools with `print(agent.toolbox)`
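An assumed usage sketch of that toolbox inspection (agent.toolbox behaved as a dict of name -> tool in the agents API of that era; verify against current docs):

```python
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# Print each predefined tool and the description the agent's prompt is built from.
for name, tool in agent.toolbox.items():
    print(f"{name}: {tool.description}")
```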
syntaxing, about 2 years ago

Whoa, this is super awesome -- it kind of makes a ton of sense, since HF pretty much dominates the market for model hosting and interfacing. The documentation actually looks about as complex as LangChain's. Gonna give it a go to query the docs with an agent to get an example (going full circle).
PaulHoule, about 2 years ago

Kinda what people are asking for -- I mean, people are really attracted to "describe a task" as opposed to "create a training set".
nico, about 2 years ago

They also released StarChat today, their code model fine-tuned as an assistant.

Might be good to try with CodeGPT, AutoGPT, or BabyAGI.
minimaxir, about 2 years ago

From the documentation, HF Agents are much better explained than LangChain but not easier to use, and due to multimodality they may actually be more arcane.
anton5mith2, about 2 years ago

You could use LocalAI to get around this: "The openAI models perform better (but require you to have an openAI API key, so cannot be used for free)."

https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai_openai_compatible_api_to_run_llm_models/
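A sketch of that idea, assuming a LocalAI server running locally behind an OpenAI-compatible endpoint. The port, model name, and key handling are placeholders drawn from LocalAI's examples, and this uses the pre-1.0 openai Python client that was current at the time:

```python
import openai

# Point the standard OpenAI client at a local, OpenAI-compatible server.
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sk-local"  # assumption: LocalAI accepts any placeholder key

response = openai.ChatCompletion.create(
    model="ggml-gpt4all-j",  # placeholder: whatever model LocalAI is serving
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```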
bluepoint, about 2 years ago

If you are like me and you tried to copy-paste the Python commands and it did not work, you need to generate an access token. Here is what you should do:

1. Sign up to Hugging Face (https://huggingface.co/).

2. Set up access tokens (https://huggingface.co/settings/tokens).

3. Install or upgrade some dependencies: `pip install huggingface_hub transformers accelerate`

4. From the terminal run `jupyter lab`

5. Then, if I did not forget any other dependencies, you can just copy-paste:

```python
from huggingface_hub import login
from transformers import HfAgent

login("hf_YOUR_HUGGING_FACE_TOKEN")

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

agent.run("Is the following `text` (in Spanish) positive or negative?", text="¡Este es un API muy agradable!")
```
chaxor, about 2 years ago

This is beautiful, but is there a decent way to plow through, say, 20TB of text and put that into a vector database (encoder only)? It would be a great addition, especially if the vectors could then be translated into other forms (a different language, a JSON representation, pulled-out names/NER, etc.) by just applying a decoder to the database.
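Not an answer from the thread, but a minimal sketch of the encoder-only half of that pipeline using sentence-transformers and FAISS. The model choice, chunking, and flat index type are assumptions for illustration; at 20TB you would shard and batch this heavily and likely use an approximate index:

```python
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
index = faiss.IndexFlatIP(384)  # inner product == cosine on normalized vectors

def index_chunks(chunks: list[str]) -> None:
    """Encode a batch of text chunks and add them to the vector index."""
    vecs = model.encode(chunks, normalize_embeddings=True)
    index.add(np.asarray(vecs, dtype="float32"))

index_chunks(["first document chunk", "second document chunk"])
query = model.encode(["document"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), 2)
```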
og_kalu, about 2 years ago

If a typical LLM has a decent representation of the languages in question (and you'd be surprised how little "decent" is, with all the positive transfer that goes on during training), then outsourcing translation is just a downgrade -- a pretty big one, in fact.

https://github.com/ogkalu2/Human-parity-on-machine-translations

T5 seems to be the default, so I get why it's done here. Just an observation.
IAmStoxe, about 2 years ago

This seems to be an interpretation similar to that of LangChain.
sudoapps, about 2 years ago
As this LLM agent architecture continues to evolve and improve, we will probably see a lot of incredible products built on top of it.
macrolime, about 2 years ago

How does this compare to LangChain agents?