DeCo: Faster, sharper pixel diffusion by splitting frequencies
Meet DeCo, a new way to generate images directly in pixels - faster and sharper.
Instead of one big model juggling everything, DeCo splits the job by "frequencies": the Diffusion Transformer learns low-frequency content (overall shapes and colors), while a lightweight pixel decoder paints in high-frequency details (edges and textures). Think bass vs treble, each tuned separately.
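The intuition can be sketched with a simple low-pass/residual decomposition - an illustrative toy (box-filter downsample + residual), not the paper's actual architecture:

```python
import numpy as np

def split_frequencies(img: np.ndarray, factor: int = 8):
    """Toy split: low-frequency base (global shape/color) + high-frequency residual.

    Low frequencies: box-filter downsample, then nearest-neighbor upsample.
    High frequencies: whatever the low-pass version missed (edges, texture).
    """
    h, w = img.shape[:2]
    # Average over factor x factor blocks (crude low-pass filter).
    low_small = img.reshape(h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))
    # Nearest-neighbor upsample back to full resolution.
    low = np.repeat(np.repeat(low_small, factor, axis=0), factor, axis=1)
    high = img - low  # residual carries the fine detail
    return low, high

img = np.random.rand(32, 32, 3)
low, high = split_frequencies(img, factor=8)
assert np.allclose(low + high, img)  # decomposition is exactly invertible
```

The point of the decoupling: the heavy transformer only has to model the smooth `low` part, while a cheap decoder fills in `high`.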
- Better focus: A frequency-aware flow-matching loss emphasizes visually important signals and downplays noise.
- Results: ImageNet FID 1.62 (256x256) and 2.22 (512x512) - closing the gap with latent diffusion. Their text-to-image model hits 0.86 on GenEval.
- Open code: https://github.com/Zehong-Ma/DeCo
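The frequency-aware loss idea - weight errors by how perceptually important their frequency band is - can be illustrated with a toy Fourier-domain weighting. The `1/(1 + alpha*radius)` weight below is a made-up example for illustration, not the paper's formulation:

```python
import numpy as np

def freq_weighted_loss(pred_v: np.ndarray, target_v: np.ndarray, alpha: float = 2.0) -> float:
    """Toy frequency-aware regression loss (hypothetical weighting, not DeCo's).

    Errors are transformed to the frequency domain and down-weighted as the
    spatial frequency grows, so low-frequency structure dominates the loss
    while high-frequency noise contributes less.
    """
    h, w = pred_v.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy**2 + fx**2)        # 0 at DC, grows with frequency
    weight = 1.0 / (1.0 + alpha * radius)  # down-weight high-frequency error
    err = np.fft.fft2(pred_v - target_v)   # error spectrum
    return float(np.mean(weight * np.abs(err) ** 2))
```

In a flow-matching setup, `pred_v` and `target_v` would be the predicted and ground-truth velocity fields; the weighting steers training effort toward the visually important bands.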
Takeaway: decoupling low-level details from global structure makes end-to-end pixel diffusion more efficient without sacrificing realism.
Paper: https://arxiv.org/abs/2511.19365v1
Register: https://www.AiFeta.com
#AI #Diffusion #ImageGeneration #ComputerVision #DeepLearning #OpenSource #Research #ImageNet