Real-Time On-Device Diffusion: Practical Acceleration via Fused Low-Bit Kernels
A systems paper on accelerating diffusion inference with fused low-bit kernels and cache-update fusion.
An IEEE Access article on a single-GPU diffusion baseline for text-to-sign-language video generation.