I looked at this, thought about it, waited an hour, and looked at it again, and I can't help but think this is useless.<p>We can already weight parts of prompts, and we can already specify colors or styles for parts of an image. Even if we could not, none of this needs rich text.<p>I even think the comparisons at the beginning are dishonest. They compare "plaintext" prompts with "rich text" prompts, but the rich text prompts contain more information. Seriously, who is surprised that the following two prompts give different images?<p>(1) "A girl with long hair sitting in a cafe, by a table with coffee on it, best quality, ultra detailed, dynamic pose."<p>(2) "A girl with long [Richtext:orange] hair sitting in a cafe, by a table with coffee on it, best quality, ultra detailed, dynamic pose. [Footnote:The ceramic coffee cup with intricate design, a dance of earthy browns and delicate gold accents. The dark, velvety latte is in it.]"<p>The worst part is "Font style indicates the styles of local regions". In the "comparison with other methods" section they actually have to spell out in parentheses what each font means style-wise, because nobody knows and (let's be frank) nobody wants to learn.<p>So why not just put those plaintext parentheses in the prompt?<p>I held off on immediately posting my (rather negative) opinion, but after more than an hour it hasn't changed. As far as I can see, this isn't useful; rich text prompts are a gimmick.