themachinestops@lemmy.dbzer0.com to Technology@lemmy.world · English · edited 8 days ago
ChatGPT's new browser has potential, if you're willing to pay (www.bbc.com)
brucethemoose@lemmy.world · English · edited 7 days ago

> I have access to GLM 4.6 through a service, but that’s the ~350B-parameter model, and I’m pretty sure that’s not what you’re running at home.
It is. I’m running that exact model with hybrid CPU+GPU inference, specifically this quant: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF
You can likely run GLM Air on your 3060 desktop if you have 48GB+ of RAM, or a smaller MoE easily. Heck, I’ll make a quant just for you if you want.
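If you haven’t tried hybrid inference before: the idea is to offload as many layers as fit into VRAM and keep the rest of the weights in system RAM. Here’s a minimal sketch using the standard llama-cpp-python bindings (note the IK quant above needs the ik_llama.cpp fork instead; the filename and layer count here are placeholders you’d tune to your hardware):

```python
# Minimal sketch of hybrid CPU+GPU inference with llama-cpp-python.
# The model filename and layer split are assumptions: raise n_gpu_layers
# until the offloaded layers fill your VRAM (e.g. a 12GB 3060) and the
# remaining layers stay in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.5-Air-Q4_K_M.gguf",  # hypothetical quant filename
    n_gpu_layers=20,   # layers offloaded to the GPU; 0 = CPU-only
    n_ctx=8192,        # context window; larger costs more memory
    n_threads=16,      # CPU threads for the layers left in RAM
)

out = llm("Explain mixture-of-experts models in one paragraph.",
          max_tokens=256)
print(out["choices"][0]["text"])
```

Same recipe either way: push the GPU layer count up until VRAM is full and let the CPU handle the rest.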
Depending on the use case, I’d recommend ERNIE 4.5 21B (or the 28B for vision) on your MacBook, or a Qwen 30B variant. Look specifically for DWQ MLX quants: https://huggingface.co/models?sort=modified&search=dwq
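To show what that looks like in practice, here’s a minimal sketch using the mlx-lm package (`pip install mlx-lm`); the repo id is just an example, substitute whichever DWQ quant you pick from that search:

```python
# Minimal sketch of running an MLX quant on Apple silicon with mlx-lm.
# The repo id below is illustrative, not a recommendation of a specific
# upload; mlx-lm downloads it from Hugging Face on first use.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-30B-A3B-4bit-DWQ")  # assumed repo id

prompt = "Summarize the tradeoffs of MoE models on laptops."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```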