Real-Time On-Device Diffusion: Practical Acceleration via Fused Low-Bit Kernels
A systems paper on accelerating diffusion inference with fused low-bit kernels and cache-update fusion.
A framework for evaluating AI by explanation, contestability, accessibility, and fit—not just accuracy.
Why meaningful human oversight requires more than rubber-stamping machine output.