NeuralSVG: An Implicit Representation for Text-to-Vector Generation

782 points by lnyan 4 months ago

29 comments

vipshek 4 months ago

This is excellent!

I think the utility of generating vectors is far, far greater than all the raster generation that's been a big focus thus far (DALL-E, Midjourney, etc). Those efforts have been incredibly impressive, of course, but raster outputs are so much more difficult to work with. You're forced to "upscale" or "inpaint" the rasters using subsequent generative AI calls to actually iterate towards something useful.

By contrast, generated vectors are inherently scalable and easy to edit. These outputs in particular seem to be low-complexity, with each shape composed of as few points as possible. This is a boon for "human-in-the-loop" editing experiences.

When it comes to generative visuals, creating simplified representations is much harder (and, IMO, more valuable) than creating highly intricate, messy representations.

janalsncm 4 months ago

I am a huge fan of this type of incremental generative approach. Language isn’t precise enough to describe a final product, so generating intermediate steps is very powerful.

I’d also like to see this in music generation. Tools like Suno are cool but I would much rather have something that generates MIDIs and instrument configurations instead.

Maybe this is a good lesson for generative tools. It’s possible to generate something that’s a good starting point. But what people actually want is long tail, so including the capability of precision modification is the difference between a canned demo and a powerful tool.

> Code coming soon

The examples are quite nice but I have no idea how reproducible they are.

scosman 4 months ago

I’ve been impressed with even applying Sonnet to SVGs for animations. This looks like it could be a lot more powerful.

Fun example: https://gist.github.com/scosman/701275e737331aaab6a2acf74a523830

intalentive 4 months ago

I’ve always thought that generation of intermediate representations was the way to go. Instead of generating concrete syntax, generate an AST. Instead of generating PNG, generate SVG. Instead of generating a succession of images for animation, generate a wireframe or rigging plus a script.

Once you have your IR, modify and render. Once you have your render, apply a final coat of AI pixie dust.

Maybe generative models will get so powerful that fine-grained control can be achieved through natural language. But until then, this method would have the advantages of controllability, interoperability with existing tools (like Intellisense, image editors), and probably smaller, cheaper models that don’t have to accommodate high-dimensional pixel space.
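A small illustration of the "generate the IR, then modify and render" workflow described above, using Python's standard `ast` module as the intermediate representation. The expression and the edit applied to it are made up for the example, and it needs Python 3.9+ for `ast.unparse`. Treat it as a sketch of the idea, not as anything taken from NeuralSVG.

```python
import ast

# Generate the IR directly instead of concrete syntax: the tree for `2 * 21`.
expr = ast.BinOp(
    left=ast.Constant(value=2),
    op=ast.Mult(),
    right=ast.Constant(value=21),
)


def render(node):
    """Render step: turn the IR back into source text."""
    return ast.unparse(node)


def evaluate(node):
    """Compile and run the IR without going through source text."""
    tree = ast.fix_missing_locations(ast.Expression(body=node))
    return eval(compile(tree, "<ir>", "eval"))


before = (render(expr), evaluate(expr))

# Modify step: swap the operator node structurally, no string editing needed.
expr.op = ast.Add()
after = (render(expr), evaluate(expr))

print(before)  # ('2 * 21', 42)
print(after)   # ('2 + 21', 23)
```

The edit is a single attribute assignment on the tree, which is the kind of controllability the comment is pointing at: the same change made on raw text would need parsing or pattern matching first.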

andy_ppp 4 months ago

I’m looking forward to seeing what this makes of Simon Willison’s LLM SVG generation test prompt: “Generate an SVG of a pelican riding a bicycle”.

The progress we are seeing in AI is quite amazing, and it will keep getting better, which is somewhat terrifying.

goeiedaggoeie 4 months ago

This is very nice.

I had to convert a bitmask to SVG and wanted to skip the intermediate step, so I looked around for papers about segmentation models that output SVG and found this one: https://arxiv.org/abs/2311.05276

zellyn 4 months ago

The sketch generation is wild… and apparently comes for free.

airstrike 4 months ago

This opens up lots of opportunities for document authoring tools. Really cool stuff, can't wait to try out the code once it's available.

jonathaneunice 4 months ago

Nice! Looking forward to similar textual generation of diagrams. (The Pic/Pikchr for the LLM age.)

murtio 4 months ago

This is really cool! I have been using Claude to animate SVG, and it has been great.

chestervonwinch 4 months ago

I wonder if you can use an existing svg as a starting point. I would love to use the sketch approach and generate frame-by-frame animations to plot with my pen plotter.

CyberDildonics 4 months ago

If you can generate an image, you can flatten it; if you can flatten it, you can cluster it; and if you can cluster the flat sections, you can draw vectors around them.
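A toy sketch of that flatten, cluster, vectorize pipeline, in plain Python. Everything here is an assumption made for illustration: the 4x3 "image", the fixed three-colour palette standing in for real cluster centres (a real pipeline would more likely learn them, for example with k-means), and the one-rectangle-per-run output, which is a crude stand-in for proper contour tracing.

```python
from itertools import groupby

# Hypothetical 4x3 "image": rows of noisy (r, g, b) pixels.
IMAGE = [
    [(250, 5, 5), (245, 10, 0), (10, 10, 240), (0, 5, 250)],
    [(255, 0, 10), (240, 0, 0), (5, 0, 255), (10, 10, 245)],
    [(5, 250, 5), (0, 240, 10), (10, 255, 0), (0, 0, 250)],
]
# Assumed cluster centres; not learned from the data in this sketch.
PALETTE = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]


def nearest(pixel, palette):
    """Flatten/cluster step: snap a pixel to the closest palette colour."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


def to_svg(image, palette):
    """Vector step: emit one <rect> per horizontal run of equal flat colour."""
    height, width = len(image), len(image[0])
    rects = []
    for y, row in enumerate(image):
        flat = [nearest(p, palette) for p in row]
        x = 0
        for colour, run in groupby(flat):
            length = len(list(run))
            r, g, b = colour
            rects.append(
                f'<rect x="{x}" y="{y}" width="{length}" height="1" '
                f'fill="rgb({r},{g},{b})"/>'
            )
            x += length
    body = "\n".join(rects)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">\n'
        f"{body}\n</svg>"
    )


svg = to_svg(IMAGE, PALETTE)
print(svg)
```

Twelve noisy pixels collapse into six rectangles here. The output is valid SVG but still blocky, which is one reason tracing a raster tends to give messier, less editable shapes than generating a small number of curves directly.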

Jean-Papoulos 4 months ago

This is the kind of image generation I've been waiting for. No more messing around in Inkscape (or at least, less of it) when I need a specific icon.

TeMPOraL 4 months ago

Available in ComfyUI when? :)

Seriously though, this is amazing, I'm glad to see this tackled directly.

Also, I just learned from this thread that Claude is apparently usable for generating SVGs (unlike e.g. GPT-4 when I tested for it some months ago), so I'll play with that while waiting for NeuralSVG to become available.

toisanji 4 months ago

This is a group applying vector generation to animations: https://www.youtube.com/@studyturtlehq

The graphic fidelity has been slowly improving over time.

niemandhier 4 months ago

It looks as if this is not autoregressive.

It would be interesting to see a similar approach that incrementally works from simpler (fewer curves) to more complex representations.

That way one could probably apply RLHF along the trajectory too.

theckel 4 months ago

Does anyone know how this compares to SVGDreamer (https://github.com/ximinng/SVGDreamer)?

thomasfl 4 months ago

Finally something that can benefit artists as a sketching tool.

cyp0633 4 months ago

Claude has been doing a good job generating SVGs compared to its rivals, happy to see new models bringing image generation even further

shahzaibmushtaq 4 months ago

I am really impressed with how it generates rough sketches because everything in the design world begins that way.

nbzso 4 months ago

So designers, artists, musicians, we are done, right? Who's next, I wonder?

IncreasePosts 4 months ago

Shouldn't the girl with the pearl earring have an earring?

piombisallow 4 months ago

This is much more useful for actual design jobs.

nikolayasdf123 4 months ago

Very nice. I had this idea for a while, but never had time to implement it.

Glad someone actually did it! Great work!

kelseyfrog 4 months ago

Why does the fourth example show a hamburger but is labeled as a dragon?

pizza 4 months ago

Prompting Claude to make SVGs then dropping them into Inkscape and getting the last ~20% of it to match the picture in my head has been a phenomenal user experience for me. This, too, piques my curiosity..!

fosterbuster 4 months ago

It's a wasted opportunity not using SVG to show the examples.

1970-01-01 4 months ago

Aside: I've been having a very hard time prompting ChatGPT to spit out ASCII art. It really seems to not be able to do it.

    Here is an ASCII art representation of a hopping rabbit:
    ```
    (\(\
    ( -.-)
    o_(")(")
    ```
    This is a simple representation of a rabbit with its ears up and in a hopping stance. Let me know if you'd like me to adjust it!

lbj 4 months ago

"Code coming soon" - I hope someone reposts this when there's more to dig into
