

aimxhaisse 54 days ago | on: Accelerating Gemma 4: faster inference with multi-...

It even fits on a 3060 with turboquant / Q4 at a decent speed (40 tokens/s), for ~$200 (:
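
For anyone who wants to sanity-check a "fits on a 3060" claim for their own setup, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption rather than a measurement: the parameter count is a hypothetical placeholder (the comment does not say which model size was used), 12 GiB is the VRAM of the common 3060 variant, roughly 4.5 bits per weight is a typical effective rate for Q4-style formats, and the overhead figure for KV cache and runtime buffers is a guess.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def fits(n_params: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float) -> bool:
    """True if the weights plus a fixed overhead budget fit in VRAM."""
    return weight_gib(n_params, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    n_params = 12e9        # hypothetical parameter count, not from the comment
    bits_per_weight = 4.5  # assumed effective rate for a Q4-style format
    vram_gib = 12.0        # assumed: the 12 GB variant of the 3060
    overhead_gib = 2.0     # assumed budget for KV cache and runtime buffers

    size = weight_gib(n_params, bits_per_weight)
    print(f"weights: {size:.1f} GiB, fits: "
          f"{fits(n_params, bits_per_weight, vram_gib, overhead_gib)}")
```

This only estimates memory. It says nothing about speed, so a figure like 40 tokens/s still has to be measured on the actual card.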
