
Google Open-Sources DiffusionGemma, Replacing Autoregressive Generation With Text Diffusion for Lower Latency
On June 10, Google released DiffusionGemma, an experimental open-weight model that abandons token-by-token autoregressive generation in favor of parallel, diffusion-based text generation. Instead of emitting one token at a time, the approach generates or refines whole sequences over a series of iterative steps, which can reduce wall-clock latency for long outputs. Text diffusion for language remains an open research area, with known challenges around output coherence and instruction-following. As an experimental release, DiffusionGemma gives developers an open baseline to evaluate against autoregressive alternatives in latency-sensitive applications.