Not a single new 64GB GPU, but multiple used GPUs.
They’ve significantly increased in price (so much for hardware depreciation…), but you can still get a modded 22GB 2080 Ti for $320, or an Mi50 32GB for ~$450 each (it used to be $150 a few months ago, alas), or an Mi50 16GB for <$200, but you’d need to stack 4 of them.
There are also some more exotic configurations, but those are probably the simplest options. You won’t get the performance of an RTX Pro 6000 Blackwell, of course, and the power consumption will be pretty high, so it’s only worth it if you have cheap electricity. But it is possible.
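For a rough sense of what each option costs to reach 64GB, here is a quick back-of-the-envelope sketch using the prices quoted above. The 64GB target and the `build` helper are just for illustration, the Mi50 16GB price is taken at its $200 upper bound, and the totals ignore the motherboard, PSU, risers and electricity.

```python
import math

TARGET_GB = 64  # total VRAM we are trying to reach

# (name, VRAM per card in GB, approximate used price in USD), as quoted above
cards = [
    ("2080 Ti 22GB (modded)", 22, 320),
    ("Mi50 32GB", 32, 450),
    ("Mi50 16GB", 16, 200),  # quoted as "<$200", so treat this as an upper bound
]


def build(vram_gb, price_usd, target_gb=TARGET_GB):
    """Return (card count, total VRAM in GB, total price in USD) to reach target_gb."""
    count = math.ceil(target_gb / vram_gb)
    return count, count * vram_gb, count * price_usd


for name, vram, price in cards:
    count, total_gb, total_usd = build(vram, price)
    print(f"{count} x {name}: {total_gb} GB for ~${total_usd}")
```

By this estimate all three builds land somewhere between roughly $800 and $1000 for the cards alone, so the choice mostly comes down to how many slots and how much power you can spare.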
If you are not being paid by Apple, I feel sorry for you. Cause that means you are so bought into the cult that you are delusional.
The 40-80 tok/sec is only for initial prompt processing, and with the "medium" models, like Qwen3.6:27b. The actual token generation is in the 10 tok/sec range. That's very slow. And your MacBook Pro will stop being a LAP-top, because it will get very warm.
Meanwhile, my 2x3090s happily crank out ~100 tok/sec generation. Oh and I can run 100 tok/sec on my phone as well, because I can just access ollama on my home desktop over ssh from termux.
Keep it real, Kyle. It doesn't seem like you learned anything from the failure of your last company.