DeCo: Faster, sharper pixel diffusion by splitting frequencies

Meet DeCo, a new way to generate images directly in pixels - faster and sharper.

Instead of one big model juggling everything, DeCo splits the job by frequency: a Diffusion Transformer learns the low-frequency content (overall shapes and colors), while a lightweight pixel decoder paints the high-frequency details (edges and textures). Think bass vs. treble, each tuned separately.
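The low/high split can be pictured with a simple Fourier low-pass filter. This is a minimal illustrative sketch, not the paper's actual decomposition; the `cutoff` threshold and the FFT-mask approach are assumptions for demonstration:

```python
import numpy as np

def split_frequencies(img: np.ndarray, cutoff: float = 0.1):
    """Split a 2D image into low- and high-frequency parts via an FFT low-pass mask.

    cutoff is a normalized frequency radius (hypothetical choice); frequencies
    inside it form the "shapes and colors" band, the residual holds "edges and
    textures".
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[-(h // 2):(h + 1) // 2, -(w // 2):(w + 1) // 2]
    radius = np.sqrt((yy / h) ** 2 + (xx / w) ** 2)
    mask = radius <= cutoff                      # keep only frequencies near the origin
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    high = img - low                             # residual: exact by construction
    return low, high

img = np.random.rand(64, 64)
low, high = split_frequencies(img)
# low + high reconstructs the image exactly, so nothing is lost in the split
```

The exactness of `low + high == img` is what makes the two bands independently learnable targets.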

  • Better focus: A frequency-aware flow-matching loss emphasizes visually important signals and downplays noise.
  • Results: ImageNet FID 1.62 (256x256) and 2.22 (512x512) - closing the gap with latent diffusion. Their text-to-image model hits 0.86 on GenEval.
  • Open code: https://github.com/Zehong-Ma/DeCo
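The frequency-aware loss can be sketched as a flow-matching MSE reweighted per frequency band. This is a hypothetical illustration under assumed weights, not the paper's formulation; `cutoff`, `low_w`, and `high_w` are made-up parameters:

```python
import numpy as np

def freq_weighted_fm_loss(pred_v, target_v, cutoff=0.25, low_w=1.0, high_w=0.3):
    """Flow-matching MSE with per-frequency-band weights (illustrative only).

    With uniform weights this reduces (by Parseval's theorem) to the plain
    pixel-space MSE between predicted and target velocities.
    """
    err = np.fft.fft2(pred_v - target_v)         # velocity error in Fourier domain
    h, w = err.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    # Hypothetical weighting: emphasize low frequencies, downweight the noisy band
    weights = np.where(radius <= cutoff, low_w, high_w)
    return float(np.mean(weights * np.abs(err) ** 2) / (h * w))
```

Setting `low_w == high_w == 1.0` recovers the ordinary MSE, so the weighting is the only change from a standard flow-matching objective.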

Takeaway: decoupling low-level details from global structure makes end-to-end pixel diffusion more efficient without sacrificing realism.

Paper: https://arxiv.org/abs/2511.19365v1

Register: https://www.AiFeta.com

#AI #Diffusion #ImageGeneration #ComputerVision #DeepLearning #OpenSource #Research #ImageNet
