
DALL-E Paper and Code

49 points by david2016, over 4 years ago

6 comments

minimaxir, over 4 years ago
Note that this is just the VAE component used to help with training and generating images; it will not let you create crazy images from natural language as shown in the blog post (https://openai.com/blog/dall-e/).

More specifically, from that link:

> [...] the image is represented using 1024 tokens with a vocabulary size of 8192.

> The images are preprocessed to 256x256 resolution during training. Similar to VQVAE, each image is compressed to a 32x32 grid of discrete latent codes using a discrete VAE that we pretrained using a continuous relaxation.

OpenAI also provides the encoder and decoder models and their weights.

However, with the decoder model, it's now possible to, say, train a text-encoding model that links up to that decoder (training on, say, an annotated image dataset) to get something close to the DALL-E demo OpenAI posted. Or something even better!
Comment #26257083 not loaded
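To make the dVAE round trip minimaxir describes concrete, here is a minimal sketch, assuming the `dall_e` package from the released repo (its `load_model`, `map_pixels`, and `unmap_pixels` helpers) and the published encoder/decoder weight URLs; the exact helper names and URLs should be checked against the repo's usage notebook.

```python
# Rough sketch: round-trip an image through the released dVAE encoder/decoder.
# Assumes the `dall_e` package and its published weight URLs; verify names
# and URLs against the repo's own usage notebook.
import torch
import torch.nn.functional as F
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from PIL import Image

from dall_e import load_model, map_pixels, unmap_pixels

dev = torch.device("cpu")
enc = load_model("https://cdn.openai.com/dall-e/encoder.pkl", dev)  # assumed URL
dec = load_model("https://cdn.openai.com/dall-e/decoder.pkl", dev)  # assumed URL

# Preprocess: resize/crop to 256x256 and map pixel values into the model's range.
img = Image.open("input.jpg").convert("RGB")
img = TF.center_crop(TF.resize(img, 256), [256, 256])
x = map_pixels(torch.unsqueeze(T.ToTensor()(img), 0))

# Encode: 256x256 image -> 32x32 grid of discrete token ids (vocabulary 8192).
z_logits = enc(x)
z = torch.argmax(z_logits, dim=1)  # shape (1, 32, 32)

# Decode: one-hot the token grid and reconstruct the image from it.
z_onehot = F.one_hot(z, num_classes=enc.vocab_size).permute(0, 3, 1, 2).float()
x_stats = dec(z_onehot).float()
x_rec = unmap_pixels(torch.sigmoid(x_stats[:, :3]))  # reconstructed image tensor
```

The 32x32 grid of token ids from the 8192-entry vocabulary is exactly the representation a text-conditioned transformer would have to predict before handing it to this decoder, which is why the released component alone cannot reproduce the blog-post demo.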
TheRealPomax, over 4 years ago
Can someone explain what this even is, for folks reading the description and going "this means nothing to me"?
Comment #26258268 not loaded
Comment #26258333 not loaded
make3, over 4 years ago
The title should be updated: this doesn't include the paper, and it's not the code for DALL-E but only for its VAE component.
pikseladam, over 4 years ago
I have prepared it in Colab: https://colab.research.google.com/drive/1KA2w8bA9Q1HDiZf5Ow_VNOrTaWW4lXXG?usp=sharing
MrUssek, over 4 years ago
So, uhh, where's the paper? The link in the readme isn't active.
Comment #26259624 not loaded
campac, over 4 years ago
Has anyone tried this out?
Comment #26257119 not loaded
Comment #26256883 not loaded