Text to Image Synthesis Using Thought Vectors

153 points by piyush8311, almost 9 years ago

6 comments

gwern, almost 9 years ago
It's a little tricky getting this to work because you need two separate models working together, but I tried it out. Here's some of the samples I generated:

https://imgur.com/Uwp1wfu

https://imgur.com/yuW9Yre

https://imgur.com/oZ4wzdC some definite weaknesses in the natural language embedding

https://imgur.com/MAupphr roses in general don't seem to work well. must not have been many in the dataset

You can see that it works better than one would expect, but there are definitely limits to the understanding. The flower and COCO datasets are, ultimately, not that big. What would be exciting is if you could train it on some extremely large and well-annotated dataset like Danbooru.
radarsat1, almost 9 years ago
I think the idea is interesting but I'm not convinced it really "synthesizes ideas" so much as treats the neural network like a database of images that it mixes.

Now, I could be wrong, but because of the way the results are presented it doesn't tell me that it's any good at picking up the meaning of the phrase. The results show a single phrase and a set of images it generates. White flower with yellow center, and a bunch of images of white flowers.

But if it can synthesize the idea properly, one should be able to generate a flower of a variety of descriptions. Yellow flower with blue center. Red flower with yellow center. Blue flower with black edges and black center. etc.

From the way they describe the functionality it should be able to do these things so in a way I don't doubt it, but I want to see how it performs on phrases that induce combinations of ideas that are well outside of the training set yet refer to individual ideas within the training set.
failrate, almost 9 years ago
This is lovely. As a lazy programmer, I would appreciate this as a web service. Instead of googling for an image to steal as placeholder art, I could request a uniquely generated image.
Y_Y, almost 9 years ago
I can see this being useful for police sketch artists.
viach, almost 9 years ago
It would be cool to implement text to pizza image synthesis.
ash9r, almost 9 years ago
What GPU was used to train this model?