科技回声 (Tech Echo)


JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

5 points | by hardmaru | 9 months ago

1 comment

magicalhippo | 9 months ago
Initially I was assuming they were not including the Huffman encoding step, but no:

"The bytes in the files do not have consistent meanings and would depend on their context and the implicit Huffman tables. [...]"

"However, we observe that conventional, vanilla language modeling surprisingly conquers these challenges without special designs as training goes (e.g., JPEG-LM generates realistic images barely with any corrupted JPEG patches)."

That surprised me, but then I'm not in the field.
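The quoted point — that a byte in a JPEG stream has no fixed meaning, because its interpretation depends on the Huffman tables in effect — can be illustrated with a toy Huffman coder. This is a sketch of ours, not code from the paper; `huffman_codes` and the example frequency tables are made up for illustration:

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman code table (symbol -> bitstring) from symbol frequencies."""
    # Heap entries: [frequency, tiebreaker, partial code table].
    # The unique integer tiebreaker keeps heap comparisons away from the dicts.
    heap = [[f, i, {s: ""}] for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)   # least frequent subtree
        hi = heapq.heappop(heap)   # next least frequent subtree
        # Prefix '0' onto the low branch and '1' onto the high branch.
        merged = {s: "0" + c for s, c in lo[2].items()}
        merged.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], count, merged])
        count += 1
    return heap[0][2]

# The same symbol 'a' is assigned a different bitstring under different tables,
# so the encoded bits (and hence the file's bytes) only make sense relative to
# the table that produced them.
table1 = huffman_codes({"a": 5, "b": 2, "c": 1})
table2 = huffman_codes({"a": 1, "b": 5, "c": 5})
print(table1["a"], table2["a"])  # two different codes for the same symbol
```

A language model trained on raw JPEG bytes therefore has to infer these context-dependent meanings from the byte stream itself, which is what the commenter finds surprising that vanilla language modeling manages to do.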