themachinestops@lemmy.dbzer0.com to Technology@lemmy.world · English · edited 8 days ago
ChatGPT's new browser has potential, if you're willing to pay (www.bbc.com)
brucethemoose@lemmy.world · English · edited 7 days ago

> I have access to GLM 4.6 through a service, but that’s the ~350B-parameter model, and I’m pretty sure that’s not what you’re running at home.
It is. I’m running that exact model with hybrid CPU+GPU inference, specifically this quant: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF
You can likely run GLM Air on your 3060 desktop if you have 48GB+ of RAM, or a smaller MoE easily. Heck, I’ll make a quant just for you if you want.
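If you haven’t tried hybrid inference before: the idea is to offload as many layers as fit into VRAM and keep the rest of the weights in system RAM. Here’s a minimal sketch using the standard llama-cpp-python bindings (note the IK quant above needs the ik_llama.cpp fork instead; the filename and layer count here are placeholders you’d tune to your hardware):

```python
# Minimal sketch of hybrid CPU+GPU inference with llama-cpp-python.
# The model filename and layer split are assumptions: raise n_gpu_layers
# until the offloaded layers fill your VRAM (e.g. a 12GB 3060) and the
# remaining layers stay in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.5-Air-Q4_K_M.gguf",  # hypothetical quant filename
    n_gpu_layers=20,   # layers offloaded to the GPU; 0 = CPU-only
    n_ctx=8192,        # context window; larger costs more memory
    n_threads=16,      # CPU threads for the layers left in RAM
)

out = llm("Explain mixture-of-experts models in one paragraph.",
          max_tokens=256)
print(out["choices"][0]["text"])
```

Same recipe either way: push the GPU layer count up until VRAM is full and let the CPU handle the rest.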
Depending on the use case, I’d recommend ERNIE 4.5 21B (or the 28B for vision) on your MacBook, or a Qwen 30B variant. Look specifically for DWQ MLX quants: https://huggingface.co/models?sort=modified&search=dwq
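To show what that looks like in practice, here’s a minimal sketch using the mlx-lm package (`pip install mlx-lm`); the repo id is just an example, substitute whichever DWQ quant you pick from that search:

```python
# Minimal sketch of running an MLX quant on Apple silicon with mlx-lm.
# The repo id below is illustrative, not a recommendation of a specific
# upload; mlx-lm downloads it from Hugging Face on first use.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-30B-A3B-4bit-DWQ")  # assumed repo id

prompt = "Summarize the tradeoffs of MoE models on laptops."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```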