Diffusion models are computationally expensive, which makes them impractical in many scenarios.<p>If the target image is two-dimensional and essentially a combination of points and lines, with a single color or a small palette, are there frameworks other than Stable Diffusion for converting between text and images?<p>Two approaches come to mind: 1. Combinatorial optimization methods for generating interior design floor plans. 2. Simple neural networks for matching text and images in plane-geometry math problems.<p>Other suggestions are welcome.
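To make the "points and lines" case concrete, here is a minimal sketch of what a non-diffusion, rule-based text-to-image pipeline might look like (the tiny DSL and function names are hypothetical, not from any existing framework): structured text is parsed into geometric primitives, which are then rendered as SVG markup.

```python
# Hypothetical sketch: rule-based text-to-image for points/lines.
# A tiny DSL like "line 0 0 100 100; point 50 20" is parsed into
# primitives and rendered as an SVG string.

def parse(text):
    """Parse commands such as 'line x1 y1 x2 y2' or 'point x y'."""
    prims = []
    for stmt in text.split(";"):
        parts = stmt.split()
        if not parts:
            continue
        kind, coords = parts[0], [float(v) for v in parts[1:]]
        prims.append((kind, coords))
    return prims

def render_svg(prims, size=120):
    """Render parsed primitives as a single-color SVG image."""
    body = []
    for kind, c in prims:
        if kind == "line":
            x1, y1, x2, y2 = c
            body.append(f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="black"/>')
        elif kind == "point":
            x, y = c
            body.append(f'<circle cx="{x}" cy="{y}" r="2" fill="black"/>')
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
            + "".join(body) + "</svg>")

svg = render_svg(parse("line 0 0 100 100; point 50 20"))
```

The text front end could of course be replaced by a language model that emits this DSL, while the deterministic renderer keeps the output exact and cheap compared to pixel-space diffusion.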