this post was submitted on 21 Nov 2025
810 points (98.2% liked)
Technology
Ahh, thank you, I had misunderstood that, since DeepSeek is (more or less) an open-source LLM from China that can also be run and fine-tuned locally on your own hardware.
Do you have a cluster of 10 A100s lying around? Because that's what it takes to run DeepSeek. It is open source, but it's far from something you can run on your own hardware.
Yes, that's true. It is resource-intensive, but unlike most comparably capable LLMs, running it yourself is at least possible: not for most private individuals, given the hardware requirements, but certainly for companies with the necessary budget.
They're overestimating the costs. 4x H100 and 512 GB of DDR4 will run the full DeepSeek-R1 model; that's about $100k of GPUs and $7k of RAM. It's not something you'll have in your homelab (for a few years at least), but it's well within the budget of a hobbyist group or a moderately sized local business.
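A back-of-envelope check on those numbers, just to show where the hardware figures come from. The only published figure assumed here is DeepSeek-R1's ~671B total parameters; the bytes-per-parameter values are the usual ones for FP8/FP16 and ~4-bit quantization:

```python
# Rough memory estimate for holding a large LLM's weights.
# Assumption for illustration: DeepSeek-R1 is a ~671B-parameter model,
# natively released in FP8 (1 byte per parameter).

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return params_billion * bytes_per_param

total_params_b = 671  # total parameters, in billions

print(f"FP8 : {weight_memory_gb(total_params_b, 1.0):.0f} GB")  # ~671 GB
print(f"FP16: {weight_memory_gb(total_params_b, 2.0):.0f} GB")  # ~1342 GB
print(f"Q4  : {weight_memory_gb(total_params_b, 0.5):.0f} GB")  # ~336 GB

# 4x H100 (80 GB each) is 320 GB of VRAM; the 512 GB of system RAM
# holds whatever layers don't fit and get offloaded to the CPU.
print(f"4x H100 VRAM: {4 * 80} GB")
```

So the 4x H100 + 512 GB RAM figure only works for a quantized or partially offloaded deployment, not for the full FP8 weights entirely in VRAM, which is consistent with the "well within a hobbyist group's budget, not a homelab" framing.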
Since it's an open-weights model, people have created quantized and distilled versions of it. Quantized models keep the same number of parameters but store each weight in fewer bits; distilled models have far fewer parameters outright. Either way, the RAM requirements drop a lot.
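To make the quantization point concrete, here is a toy sketch of symmetric 4-bit quantization, the basic idea behind formats like llama.cpp's Q4 variants. Real schemes work per-block with one scale per block of weights; this single-block version and its example weights are purely illustrative:

```python
# Toy symmetric 4-bit quantization: map float weights onto the int4
# range [-8, 7] plus a single float scale factor.

def quantize_q4(weights):
    """Quantize floats to int4 codes and a shared scale."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int4 codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.7]          # made-up example weights
q, s = quantize_q4(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))

# Storage drops from 32 (or 16) bits per weight to 4 bits plus the scale,
# at the cost of a small rounding error per weight.
print(q, round(err, 3))
```

The parameter count is unchanged; only the bits per parameter shrink, which is why a quantized 671B model still needs hundreds of GB rather than tens.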
You can run quantized versions of DeepSeek-R1 locally. I'm running deepseek-r1-0528-qwen3-8b on a machine with an NVIDIA 3080 12GB and 64GB of RAM. Unless you pay for an AI service and are using their flagship models, it's pretty much indistinguishable from the full model.
If you're coding or doing other tasks that push the model hard, it'll stumble more often, but for a casual 'ChatGPT'-style interaction you couldn't tell the difference between it and ChatGPT.
You should be running hybrid inference of GLM Air with a setup like that. Qwen 8B is kinda obsolete.
I dunno what kind of speeds you absolutely need, but I bet you could get at least 12 tokens/s.
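That 12 tokens/s guess can be sanity-checked with a simple bandwidth bound: during decoding, each generated token has to stream the active weights through memory, so memory bandwidth divided by active-weight size caps the speed. The only published figure assumed here is GLM-4.5-Air's MoE shape (~106B total, ~12B active parameters per token); the ~4.5-bit quantization and 80 GB/s effective bandwidth are illustrative assumptions for a hybrid GPU+RAM setup, not measurements:

```python
# Rough upper bound on decode speed for hybrid (GPU + system RAM) inference.
# tokens/s <= effective memory bandwidth / bytes of active weights per token.

def max_tokens_per_s(active_params_b: float, bytes_per_param: float,
                     bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling on generation speed."""
    active_gb = active_params_b * bytes_per_param  # GB streamed per token
    return bandwidth_gb_s / active_gb

# Assumptions: ~12B active params (GLM-4.5-Air MoE), ~4.5-bit quantization
# (~0.56 bytes/param), ~80 GB/s effective mixed VRAM/RAM bandwidth.
print(f"{max_tokens_per_s(12, 0.56, 80):.1f} tokens/s")
```

Under those assumptions the ceiling lands right around 12 tokens/s, which is why an MoE model with a small active parameter count is attractive for hybrid inference on a 12GB GPU plus 64GB of RAM.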