I found a few months ago that the GPT-4 code interpreter is capable of converting a black-and-white PNG of a glyph to an SVG<p><a href="https://twitter.com/lfegray/status/1678787763905126400" rel="nofollow noreferrer">https://twitter.com/lfegray/status/1678787763905126400</a><p>It would be cool to combine a script like the one GPT-4 gave me with an image generation model to generate fonts (a rough sketch of that kind of pipeline is below). The approach from this blog post is way more interesting, though.<p>On a separate note, it reminds me of this suckerpinch video :) maybe we can finally get uppestcase and lowestcase fonts<p><a href="https://www.youtube.com/watch?v=HLRdruqQfRk">https://www.youtube.com/watch?v=HLRdruqQfRk</a>
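A minimal sketch of that kind of bitmap-to-SVG pipeline, for what it's worth. This is not the exact script GPT-4 produced; it assumes Pillow and the potrace CLI are installed, and the file names are placeholders:<p><pre><code>
# Threshold the glyph bitmap to pure black/white, then trace the outline to SVG.
import subprocess
from PIL import Image

def png_glyph_to_svg(png_path: str, svg_path: str, threshold: int = 128) -> None:
    # Flatten to grayscale, binarize, and save as PBM, a format potrace accepts.
    img = Image.open(png_path).convert("L")
    bw = img.point(lambda p: 255 if p > threshold else 0).convert("1")
    pbm_path = png_path + ".pbm"
    bw.save(pbm_path)
    # Trace the bitmap into vector outlines.
    subprocess.run(["potrace", pbm_path, "--svg", "-o", svg_path], check=True)

png_glyph_to_svg("glyph_A.png", "glyph_A.svg")
</code></pre>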
Douglas Hofstadter, the author of Gödel, Escher, Bach, thought the task of creating fonts could only be solved with general AI.<p><a href="https://www.m-u-l-t-i-p-l-i-c-i-t-y.org/media/pdf/Metafont-Metamathematics-and-Metaphysics.pdf" rel="nofollow noreferrer">https://www.m-u-l-t-i-p-l-i-c-i-t-y.org/media/pdf/Metafont-M...</a><p>The Letter Spirit project aims to model artistic creativity by designing stylistically uniform "gridfonts" (typefaces limited to a grid).
I’ve tried some work on generating vector fonts too, representing glyphs as sequences of Bézier curves and using a seq2seq model. The problem was that the fonts output by ML models were imprecise: lines were not perfectly parallel, corners were at 89°, and curves were kinked. It’s not too difficult to get fonts that look good enough, but the imperfections are glaring because fonts are normally perfectly precise. These imperfections are evident in OP’s output too, and in my opinion they make these types of models unusable for actual typesetting.<p>A 1% error in a raster output would just be pixel colors being slightly off, but an 89° corner in a vector image is immediately noticeable, which makes this a hard problem to solve. I haven’t looked into this problem much since, but I’m interested to hear about possible solutions and reading material.
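One crude direction I've wondered about (not something I've validated) is regularizing the geometry after generation: snap nearly-horizontal or nearly-vertical segments to exactly horizontal/vertical, which also forces nearly-90° corners between them back to 90°. A sketch, with made-up point lists and tolerances:<p><pre><code>
import math

def snap_outline(points, angle_tol_deg=2.0):
    # points: a closed polyline as a list of (x, y) tuples.
    snapped = [list(p) for p in points]
    for i in range(len(points)):
        j = (i + 1) % len(points)
        (x0, y0), (x1, y1) = points[i], points[j]
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180
        if min(angle, 180 - angle) < angle_tol_deg:      # almost horizontal
            y = (y0 + y1) / 2
            snapped[i][1] = snapped[j][1] = y
        elif abs(angle - 90) < angle_tol_deg:            # almost vertical
            x = (x0 + x1) / 2
            snapped[i][0] = snapped[j][0] = x
    return [tuple(p) for p in snapped]

# A slightly-off rectangle becomes a true rectangle with 90° corners.
print(snap_outline([(0, 0), (100, 1), (101, 80), (-1, 79)]))
</code></pre>
Real outlines are Bézier curves rather than polylines, so an actual implementation would have to handle control points and on-curve points separately; this only illustrates the idea.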
I think this approach isn't ideal because you're representing pixels as 150x150 unique bins. With only 71k fonts, it's likely that a lot of these bins are never used, especially at the corners. Since you're quantizing anyway, you might as well use a convnet and then trace the output, which would better take advantage of the 2D nature of the pixel data.<p>This kind of reminds me of DALL-E 1, where the image is represented as 256 image tokens and then generated one token at a time. That approach is the most direct way to adapt a causal-LM architecture, but it clearly didn't make a lot of sense because images don't have a natural top-down, left-to-right order.<p>For vector graphics, the closest analogous concept to pixel-wise convolution would be the Minkowski sum. I wonder if a Minkowski sum-based diffusion model would work for SVG images.
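To spell out the first point as I read it: if every (x, y) position on a 150x150 grid gets its own token, the vocabulary has 22,500 entries, and any bin that no outline ever touches is a token the model never trains on. The grid size comes from my reading of the post; the exact tokenization may differ:<p><pre><code>
GRID = 150  # 150x150 coordinate bins -> 22,500 possible tokens

def point_to_token(x, y, width, height):
    # Quantize a coordinate onto the grid, then flatten (row, col) to one id.
    col = min(int(x / width * GRID), GRID - 1)
    row = min(int(y / height * GRID), GRID - 1)
    return row * GRID + col

def token_to_point(token, width, height):
    # Invert the mapping, returning the bin center.
    row, col = divmod(token, GRID)
    return ((col + 0.5) * width / GRID, (row + 0.5) * height / GRID)

print(GRID * GRID)                                  # 22500
print(point_to_token(310.0, 95.0, 1000.0, 1000.0))  # 2146
</code></pre>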
Heh, the machine learning naysayers are gonna jump on this one for sure.<p>Consider a human being designing a sci-fi styled font; how do they get started? By opening references, of course! To examples of other sci-fi styled fonts that they do not have the rights to, nor will they credit.<p>Now consider another human being designing a sci-fi styled font, but one who is not allowed to reference the work of anybody else, as some argue machine learning models ought to be. This person has no references to open; they have not seen any sci-fi media, be it movies or posters or fonts or anything else. How can they create something like this without any reference at all?<p>If a human being creates a sci-fi font, and their inspiration is not references to other sci-fi fonts but instead, I don't know, a general concept of the "vibe" they got from watching Blade Runner, must they credit Blade Runner for the inspiration? Must they pay the owner of the Blade Runner rights for their use of ideas from Blade Runner?
I've long had a project in mind involving the various typefaces of the signage around the city of Vienna, which I find very inspiring in many cases.<p>The idea is to just take a picture of every different typeface I can find, attached to the local buildings at street level.<p>There are some truly wonderful typefaces out there, on signage dating back to last century, and I find the aesthetics often quite appealing.<p>With this tool, could I take a collection of the various typefaces I've captured, and get it to complete the font, such that a sign that only has a few of the required characters could be 'completed' in the same style?<p>Because if so, I'm going to start taking more pictures of Vienna's wonderful types ..
Hmmm. The model is a ckpt instead of a safetensors file.<p>Pondering whether to proceed with trying this out or not...<p>EDIT: a scan with picklescan[0] found nothing... exciting.<p>[0] <a href="https://github.com/mmaitre314/picklescan">https://github.com/mmaitre314/picklescan</a>
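If the scan comes back clean, one option is to convert the ckpt to safetensors once so the pickle load never has to happen again. A sketch, not taken from the fontogen repo; the paths and the checkpoint layout (a plain state dict vs. a wrapper dict) are assumptions:<p><pre><code>
import torch
from safetensors.torch import save_file

# weights_only=True (PyTorch >= 1.13) restricts unpickling to tensors and
# primitives; it may fail if the checkpoint stores extra objects.
ckpt = torch.load("model.ckpt", map_location="cpu", weights_only=True)
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt

# safetensors wants a flat name -> tensor mapping of contiguous tensors.
save_file({k: v.contiguous() for k, v in state_dict.items()}, "model.safetensors")
</code></pre>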
OK, that's cool, but those fonts are all terrible. The serifs are all different sizes and shapes, sometimes on the same letter. The kerning looks like a random walk. The stroke widths are all over the place, and/or the hinting is busted.<p>Now, that said, it's pretty amazing that this works at all, but it'll take some pretty specific training on a model to get something that can compete with a human-made font that's curated for good usability _and_ aesthetics.<p>Sadly, we'll also probably see adoption of these kinds of fonts (along with graphic design, illustration, songwriting, screenwriting, etc.)... because of "meh, good enough" combined with some Dunning-Kruger.<p>TL;DR: Thanks, I hate it.
Kinda funny how this works well, whereas diffusion models go to die when it comes to drawing text. But of course it works in a completely different manner.
Okay I can't try it out anyway. "Blocksparse is not available: the current GPU does not expose Tensor cores"<p>My "best" GPU is an RTX 2070 Super, Turing architecture.<p>I've seen similar messages when using stable-diffusion... either with -webui or with automatic, can't exactly remember, but they both run fine on that RTX 2070 Super, so I can only guess that they revert to some other method than Blocksparse on seeing that it doesn't support Turing. Or something. I haven't looked into how they deal with it.<p>I've submitted an Issue [0] for it. I don't have enough knowledge to know if there's some way of saying "don't use Blocksparse" for fontogen.<p>[0] <a href="https://github.com/SerCeMan/fontogen/issues/2">https://github.com/SerCeMan/fontogen/issues/2</a>
Although I would be sad to see the handcrafting that goes into designing custom fonts go, some iterations down the line a model like this would greatly aid the tedious glyph alignment and consistency tasks involved in designing fonts for CJK scripts such as hiragana, katakana, and kanji. Inspiring stuff.
Neat! Does it have prompt capabilities for things like FVAR, GSUB, and GPOS? E.g. "okay now include a many-to-one ligature that turns the word 'chicken' into an emoji of a chicken in the same style" or "now make a second, sans-serif, robotic style and add an axis called interpol that varies the font from the style we just made to this new style"?
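For the GSUB half of that, at least, it would probably be easier to let the model generate the glyphs and then bolt the ligature on afterwards with fontTools, since a many-to-one ligature is a one-liner in OpenType feature syntax. A sketch; the font path and the glyph name ('chicken.emoji', which would have to exist in the font) are made up:<p><pre><code>
from fontTools.ttLib import TTFont
from fontTools.feaLib.builder import addOpenTypeFeaturesFromString

FEATURES = """
feature liga {
    sub c h i c k e n by chicken.emoji;
} liga;
"""

font = TTFont("generated_font.ttf")
addOpenTypeFeaturesFromString(font, FEATURES)  # compiles the rule into GSUB
font.save("generated_font_liga.ttf")
</code></pre>
Variable-font axes (fvar plus interpolation masters) seem like a much harder ask for a generative model, since the instances along the axis have to stay point-compatible.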
This is interesting, but I think generating the next letter from the letters before may not be the best way to do it. As you mentioned, they degrade with each letter.<p>Maybe creating one long image of a whole font would work better.<p>edit: in the above I was misunderstanding what is happening here.<p>But I still think there must be another way to structure this so the attention mechanism doesn't have to work so hard.
Designing fonts for languages that use Chinese characters is often challenging due to the sheer number of glyphs.<p>This approach to generating fonts is very interesting… feels like it could unlock the creation of heavily stylized fonts that just wouldn’t be feasible otherwise.
In honor of all the times he pressed his hands into his eyes (and myself doing the same thing):<p>I present: “Perplexed” by Nisla. [0]<p>I have a print in my office, in lieu of a mirror.<p>[0] <a href="https://www.sargentsfineart.com/img/nisla/all/nisla-perplexed.jpg" rel="nofollow noreferrer">https://www.sargentsfineart.com/img/nisla/all/nisla-perplexe...</a>
"Fucking Hell" - first thing I yelled to myself when I saw that headline<p>Kudos for the project, of course, but it just saddens me a bit more. Nothing is sacred anymore.