[–] KingRandomGuy@lemmy.world 2 points 5 days ago

This type of thing is mostly used for inference with extremely large models, where a single GPU has far too little VRAM to even load the model into memory. I doubt people expect this setup to perform particularly fast; they just want to get a model to run at all.
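
For example, one common way to do this kind of multi-device sharding is Hugging Face's `device_map="auto"` (backed by accelerate), which splits the weights across every available GPU and spills the rest to CPU RAM. A rough sketch, assuming `transformers` and `accelerate` are installed, with a placeholder model name:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model for illustration; any model too big for one GPU applies.
model_name = "meta-llama/Llama-2-70b-hf"

# device_map="auto" shards layers across all visible GPUs (and CPU RAM
# if needed), so a model larger than any single card's VRAM can load.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

It works, but every forward pass shuffles activations between devices, so throughput is nowhere near a single-GPU setup. The point is fitting the model at all.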