Diagrams AI can, and cannot, generate

214 点作者 billyp-rva2 个月前

23 条评论

diggan大约 2 个月前

A mistake I see people repeating over and over, is never restarting their conversations with a edited initial message.Instead of doing what the author is doing here, and sending messages back and forward, leading to a longer and longer conversation, where each messages leads to worse and worse quality replies, until the LLM seems like a dumb rock, rewrite your initial message with everything that went wrong/was misunderstood, and aim to have whatever you want solved in the first message, and you'll get a lot higher quality answers. If the LLM misunderstood, don't reply "No, what I mean was..." but instead rewrite the first message so it's clearer.This is at least true for all ChatGPT, Claude and DeepSeek models, YMMV with other models.

评论 #43425836 未加载

评论 #43426754 未加载

评论 #43436724 未加载

评论 #43428659 未加载

评论 #43425431 未加载

评论 #43427361 未加载

评论 #43426375 未加载

评论 #43445379 未加载

评论 #43429879 未加载

评论 #43428655 未加载

评论 #43427235 未加载

LASR大约 2 个月前

We use mermaidjs as a supercharged version of chain-of-thought for generating some sophisticated decompositions of the intent.Then we injected the generated mermaid diagrams back into subsequent requests. Reasoning performance improves for a whole variety of applications.

评论 #43420845 未加载

评论 #43421844 未加载

graphviz大约 2 个月前

Random thoughts:Sketching backed by automated cleanup can be good for entering small diagrams. There used to be an iOS app based on graphviz: <a href="http://instaviz.com" rel="nofollow">http://instaviz.com</a>Constraint-based interactive layout may be underinvested, as a consequence of too many disappointments and false starts in the 1980s.LLMs seem ill-suited to solving the optimization of combinatorial and geometric constraints and objectives required for good diagram layout. Overall, one has to admire the directness and simplicity of mermaid. Also, it would be great to someday see a practical tool with the quality and generality of the ultra-compact grid layout prototype from the Monash group, <a href="https://ialab.it.monash.edu/~dwyer/papers/gridlayout2015.pdf" rel="nofollow">https://ialab.it.monash.edu/~dwyer/papers/gridlayout2015.pdf</a> (2015!!)

评论 #43423375 未加载

评论 #43429487 未加载

评论 #43424916 未加载

vunderba大约 2 个月前

Related - a nice time saver that I've been using since they added image recognition support to ChatGPT has been taking a quick snap of my crudely drawn hand sketched diagrams (on graph paper) with my phone and asking ChatGPT to convert them to mermaid UML syntax.

评论 #43425107 未加载

30minAdayHN大约 2 个月前

I was thinking about the similar topic and started to wonder if I can generated a diagram of a large codebase.I thought that LLMs are great at compressing information and thought of putting it to good use by compressing a large codebase into a single diagram. Since entire codebase doesn't fit in the context window, I built a recursive LLM tool that calls itself.It takes two params: * current diagram state, * new files it needs to expand the diagram.The seed set would be an empty diagram and an entry point to source code. And I also extended it to complexity analysis.It worked magically well. Here are couple of diagrams it generated: * <a href="https://gist.github.com/priyankc/27eb786e50e41c32d332390a42e56cd1" rel="nofollow">https://gist.github.com/priyankc/27eb786e50e41c32d332390a42e...</a> * <a href="https://gist.github.com/priyankc/0ca04f09a32f6d91c6b42bd8b180ae4b" rel="nofollow">https://gist.github.com/priyankc/0ca04f09a32f6d91c6b42bd8b18...</a>If you are interested in trying out, I've blogged here: <a href="https://updates.priyank.ch/projects/2025/03/12/complexity-analysis.html" rel="nofollow">https://updates.priyank.ch/projects/2025/03/12/complexity-an...</a>

stared大约 2 个月前

GPT 4o is not particularly good at this kind of logic, at least compared to other current models. Trying something that is at least in the top 10 from this WebDev Areans leaderboard: <a href="https://web.lmarena.ai/leaderboard" rel="nofollow">https://web.lmarena.ai/leaderboard</a> would help.Make sure it is allowed to think before doing (not necessarily in a dedicated thinking mode, it can be a regular prompt to design a graph before implementing it; make sure to add in a prompt who the graph is for (e.g. "a clean graph, suitable for a blog post for technical audience").

McNutty大约 2 个月前

You have got more patience than me. I have tried to use these tools to generate (basic) network diagrams and by the time I reached your third step I already knew that it was time to quit and draw it out myself. Diagrams need to be correct and accurate otherwise they're just art. I also need any amendments to be made to the same diagram, not to have it regenerated each time.I do like the idea of another commenter here who takes a photo of their whiteboard and instructs the AI tool to turn it into a structured diagram. That seems to be well within reach of these tools.

larodi大约 2 个月前

Claude does quite alright. Across one and a half year I did more than several dozens of Mermaid diagrams of all kinds, and only the most complex perhaps were out of reach.It also really depends on the printing.

评论 #43427897 未加载

RKFADU_UOFCCLEL大约 2 个月前

The "AI" we have now is just a tweening algorithm on a different medium. You won't be able to get it to do anything specific, except when that's a point between 2 existing works. As for this blog, it's nigh unreadable for those not following the current fad web frameworks. Who's to say the user doesn't have to log in to get to the gateway? Gateway can mean different things. Why can the user choose to upload images instead of logging in? What was the purpose of the log in?

victorbjorklund大约 2 个月前

I have had good success with D2 diagrams with Claude: <a href="https://victorbjorklund.com/build-diagrams-as-code-with-d2-d2lang" rel="nofollow">https://victorbjorklund.com/build-diagrams-as-code-with-d2-d...</a>They have icons for common things like cloud things.

评论 #43430864 未加载

cadamsdotcom大约 2 个月前

Thanks for writing this up. Some questions for the author:Interesting perspective but it’s a bit incomplete without a comparison of various models and how they perform.Kind of like Simon Willison’s now-famous “pelican on a bicycle” test, these diagrams might be done better by some models than others.Second, this presents a static picture of things, but AI moves really fast! It’d also be great to understand how this capability is improving over time.

评论 #43431440 未加载

submeta大约 2 个月前

Try asking llm to generate plantuml markup (use case, statechart, etc) which has some other diagram types in addition to mermaid markup. Then paste it into the free plantuml renderer. Works pretty well.I also experimented with bpmn markup (xml). Realized there are already repos on GitHub creating bpmn diagrams from prompt.You can also ask llms to create svg.

评论 #43444688 未加载

评论 #43429087 未加载

评论 #43420447 未加载

trash_cat大约 2 个月前

Sonnet 3.7 is perticularly good to generated xml diagrams that can be imported into draw.io. If you are using Cline, Windusurf or Cursor, you can ask it to create the xml file and immediately open it up. Combine it together with CONTEXT.md or ARCHITECTURE.md and you can get a very good overview of the codebase and have discussions around it.

giberson大约 2 个月前

FWIW, I think this article could just as accurately be titled "Diagrams Developers can, and cannot, generate".I'm mainly speaking to the ability to read IaC code ([probably of any library but at LEAST in my case] cdk, pulumi, terraform, cloudformation, serverless) and be able to infer architectural flow from it. It's really not conducive to that use case.I could also, kidding/not kidding, be speaking to the range of abilities for "mid" and "senior" developers to know and convey such flows in diagrams.But really my point is this feels like more validation that AI doesn't provide increased ability, it provides existing (and demonstrated) ability faster with less formalized context. The "less formalized context" is what distinguishes it from programs/code.

ndr_大约 2 个月前

I wrote about the same general topic (or more narrowly: process visualization) in German iX magazine, also available here: <a href="https://www.heise.de/ratgeber/Prozessvisualisierung-mit-generativer-KI-im-Praxistest-10266093.html" rel="nofollow">https://www.heise.de/ratgeber/Prozessvisualisierung-mit-gene...</a> (€)Rather than relying on end-user products like ChatGPT or Claude.ai, this article is based on the „pure“ model offerings via API and frontends that build on these. While the Ilograph blog ponders „AI’s ability to create generic diagrams“, I‘d conclude: do it, but avoid the „open“ models and low-cost offerings.

enoeht大约 2 个月前

Have more success with asking for a detailed workflow print then a d2/mermaid output. No problems with creating a ASCI diagram either and using that for a manual d2 can be done fast enough.

james-bcn大约 2 个月前

Why just stick to Mermaid? I expect that there is a lot more material with regards to SVG that large models have been trained on. And it's a fairly simple format. Asking it to create diagrams in SVG format gives it much more flexibility. Of course there may be a bit less consistency, but there are ways around that (e.g. giving an example/template to follow).Simon Willison has shown that current models aren't very good at creating an SVG of a pelican on a bicycle, but drawing a box diagram in SVG is a much simpler task.

评论 #43421402 未加载

peter_retief大约 2 个月前

I ask AI to generate diagrams in LaTeX, works well for me.

notTooFarGone大约 2 个月前

I used AI to generate some UML diagrams on a loosely coupled system - just fed it the actual classes where only names identify the actual links. It did quite a good job there.It was a well defined domain so I guess the training data argument doesn't fit for stuff that is within a "natural" domain like graphs. LLMs can infer the behavior based on naming quite well.

mulmboy大约 2 个月前

I have found LLMs to be very good at the kind of code -> diagram task presented here. Fire up superwhisper[1] and stream-of-consciousness away about why you want the diagram, which bits are important, who the audience is, and so on. Then iterate a few times. Works brilliantly for even very complex things, including 5000 line CDK files.It's disingenuous to conclude that AI is no good at diagramming after using an impotent prompt AND refusing to iterate with it. A human would do no better with the same instructions, LLMs aren't magic.This is the same as my previous comment <a href="https://news.ycombinator.com/item?id=42524125">https://news.ycombinator.com/item?id=42524125</a>[1] <a href="https://superwhisper.com/" rel="nofollow">https://superwhisper.com/</a>

评论 #43421508 未加载

melagonster大约 2 个月前

Although searching provides better results, it is certain that attempting to copy or directly use these images would infringe on someone's copyright. In a broader sense, using AI can also be related to copyright infringement; the court must first defeat the AI provider before it can reach the user.

WesleyLivesay大约 2 个月前

Given the pace of development in this space, it is probably worth noting in the title that this is from November 2024 so the results might be a bit dated.

评论 #43422221 未加载

jbverschoor大约 2 个月前

[yet]