this post was submitted on 07 Nov 2025
60 points (91.7% liked)

Selfhosted

Hiya,

Recently upgraded my server to an i5-12400 CPU, and have been wanting to push my server a bit. Been looking to host my own LLM tasks and workloads, such as building pipelines to scan open-source projects for vulnerabilities and insecure code, to mention one of the things I want to start doing. Inspiration for this started after reading about the recent scans of the Curl project.

Sidenote: I have no intention of swamping devs with AI bug reports. I simply want to scan projects that I personally use so I'm aware of their current state and future changes before I blindly update the apps I host.

What budget-friendly GPU should I be looking for? Afaik VRAM is quite important, the higher the better. What other features do I need to be on the lookout for?
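
To make the pipeline idea concrete, here's a rough sketch of what I have in mind, assuming a model served locally through Ollama's HTTP API. The model name, prompt wording, and file filter are just placeholders, not a finished tool:

```python
# Rough sketch: feed source files from a project checkout to a locally hosted
# LLM (served by Ollama on its default port) and collect flagged findings.
# Model name, prompt, and file filtering are placeholder assumptions.
import pathlib
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint
MODEL = "qwen2.5-coder:14b"  # placeholder; pick whatever fits your VRAM

PROMPT = (
    "Review the following source file for likely security issues "
    "(memory safety, injection, auth bypass). Reply with 'OK' if nothing "
    "stands out, otherwise list findings with line references.\n\n{code}"
)

def scan_file(path: pathlib.Path) -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": MODEL,
            "prompt": PROMPT.format(code=path.read_text(errors="ignore")),
            "stream": False,
        },
        timeout=600,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    repo = pathlib.Path("./some-project")  # local checkout of a project I use
    for f in sorted(repo.rglob("*.c")):    # whatever languages the project uses
        report = scan_file(f)
        if not report.lstrip().startswith("OK"):
            print(f"=== {f} ===\n{report}\n")
```

Anything it flags would still get manual review before I act on it, per the sidenote above.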

top 30 comments
[–] Diplomjodler3@lemmy.world 58 points 1 day ago (2 children)

The budget-friendly AI GPUs are on the shelf right next to the unicorn pen.

[–] breadsmasher@lemmy.world 11 points 1 day ago

Ooh, do they have any magic beans? I'm looking to trade a cow for some.

[–] irmadlad@lemmy.world 3 points 23 hours ago (1 children)

I've self-hosted a few of the bite-sized LLMs. The thing that's keeping me from having a full-blown, self-hosted AI platform is that my little GeForce 1650 just doesn't have the ass to really do it up right. If I'm going to consult with AI, I want the answers within 3 or 4 minutes at most, not hours. LOL

[–] Diplomjodler3@lemmy.world 7 points 22 hours ago (1 children)

Quite so. The cheapest card that I'd put any kind of real AI workload on is the 16GB Radeon RX 9060 XT. That's not what I would call budget friendly, which is why I consider a budget-friendly AI GPU to be a mythical beast.

[–] irmadlad@lemmy.world 1 points 19 hours ago

It would be super cool tho, to have a server dedicated to a totally on-premises AI with no connectivity to external AI. I'm just not sure whether I can justify several thousand dollars of equipment, because if I did, true to my nature, I'd want to go all in.

[–] comrade_twisty@feddit.org 21 points 1 day ago* (last edited 1 day ago)

Afaik the most budget-friendly local AI solutions currently are Mac Minis! Due to their unified CPU/GPU/RAM architecture they are powerhouses for AI and astonishingly well priced for what they can put out.

[–] higgsboson@piefed.social 15 points 1 day ago (1 children)

IMO, a 5060 Ti 16GB is among the best values at $430.

[–] troed@fedia.io 5 points 1 day ago

Agree, this is exactly what I went with recently in the same situation.

[–] chazwhiz@lemmy.world 11 points 1 day ago (1 children)

Take a look at https://www.localscore.ai/. It helped me understand just what the difference in experience will be like.

[–] pjusk@lemmy.dbzer0.com 1 points 21 hours ago

Great resource, thanks for sharing!

[–] drkt@scribe.disroot.org 10 points 1 day ago (2 children)

It's all about VRAM; that's the bottleneck for even the best GPUs. AMD support is spotty, so you should stay in Nvidia's claws unless you know what you're doing. Figure out what kind of money you're willing to part with, and then get whatever Nvidia GPU gets you the most VRAM.

[–] meldrik@lemmy.wtf 5 points 1 day ago (1 children)

The size of the LLM should be less than the amount of VRAM available.

[–] Jakeroxs@sh.itjust.works 4 points 22 hours ago

Learning about quantization levels was helpful as well.
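
A rough back-of-the-envelope way to see why quant level matters so much (the 20% overhead figure is just an assumption for KV cache and runtime buffers, not a precise rule):

```python
# Back-of-the-envelope VRAM estimate: parameters x bytes-per-weight,
# plus ~20% overhead for KV cache and runtime buffers (rough assumption).
def est_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return round(weights_gb * 1.2, 1)

for quant, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"13B model at {quant}: ~{est_vram_gb(13, bits)} GB")
# Roughly 31 GB at FP16, 16 GB at Q8, 8 GB at Q4 -- hence quant level matters a lot.
```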

[–] anamethatisnt@sopuli.xyz 4 points 1 day ago

Yeah, for a budget friendly AI GPU I would look for a 5060 Ti 16GB.

[–] afk_strats@lemmy.world 9 points 1 day ago* (last edited 1 day ago)

3090 24GB (~$800 USD)

3060 12GB x 2 if you have two PCIe slots (<$400 USD)

Radeon MI50 32GB with Vulkan (<$300) if you have more time, space, and will to tinker

[–] domi@lemmy.secnd.me 9 points 21 hours ago

Not sure if it counts as "budget friendly" but the best and cheapest method right now to run decently sized models is a Strix Halo machine like the Bosgame M5 or the Framework Desktop.

Not only does it have 128GB of VRAM/RAM, it sips power at 10W idle and 120W full load.

It can run models like gpt-oss-120b or glm-4.5-air (Q4/Q6) at full context length and even larger models like glm-4.6, qwen3-235b, or minimax-m2 at Q3 quantization.

Running these models is otherwise not currently possible without putting 128GB of RAM in a server mainboard or paying the Nvidia tax to get an RTX 6000 Pro.

[–] snekerpimp@lemmy.world 7 points 1 day ago* (last edited 1 day ago)

Everyone is mentioning Nvidia, but AMD's ROCm has improved tremendously in the last few years, making a 6900 XT 16GB an attractive option for me. I currently have a 6700 XT 12GB that works no problem with Ollama and ComfyUI, and an Instinct MI25 16GB that works with some fiddling as well. From what I understand, an MI50 32GB requires less fiddling. However, the Instinct line is passively cooled, so finding a way to cool it might be a reason to stay away from them.

Edit: I should add, my experience is on a few Linux distributions; I cannot attest to the experience on Windows.

[–] state_electrician@discuss.tchncs.de 6 points 1 day ago (1 children)

I heard about people using multiple used 3090s on a single motherboard for this. Apparently it delivers a lot of bang for the buck compared to a single card with loads of VRAM.

[–] Lumisal@lemmy.world 2 points 23 hours ago (1 children)

Yes, but ideally you have to find a used NVLink bridge, which is kinda rare now.

[–] MalReynolds@slrpnk.net 2 points 20 hours ago

Nah, NVLink is irrelevant for inference workloads (inference nearly all happens on the cards; models are split up over multiple GPUs and tokens are piped over PCIe as necessary). It's mildly useful for training, but you'll get there without it.
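
As a minimal sketch of what that split looks like in practice, llama.cpp (here via the llama-cpp-python bindings) can distribute one model's layers across two cards with no NVLink involved; the model path and split ratio below are placeholders:

```python
# Sketch: splitting one model across two GPUs with llama-cpp-python.
# No NVLink needed; layers are distributed and activations cross PCIe.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3.3-70b-instruct-q4_k_m.gguf",  # any large GGUF
    n_gpu_layers=-1,          # offload all layers to the GPUs
    tensor_split=[0.5, 0.5],  # proportion of the model placed on each card
)

print(llm.create_completion("Hello", max_tokens=16)["choices"][0]["text"])
```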

[–] eager_eagle@lemmy.world 5 points 1 day ago* (last edited 1 day ago)

Intel has some GPUs that are more cost effective than NVIDIA's when it comes to VRAM.

The Arc A770 is selling for $370 in the US and the new B50 for $399, both with 16GB.

The B60 has 24GB, but I'm not sure where to find it.

[–] papertowels@mander.xyz 4 points 20 hours ago (1 children)

What does budget friendly mean to you?

[–] melroy@kbin.melroy.org 4 points 19 hours ago

[–] poVoq@slrpnk.net 3 points 1 day ago

Recent models run surprisingly well on CPUs if you have sufficient regular RAM. You can also use a low-VRAM GPU and offload parts of the model to the CPU. If you are just starting out and want to play around, I would try that first. 64GB of system RAM is a good amount for that.
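
If you want to try the offload route, here's a minimal sketch using the llama-cpp-python bindings, assuming a GGUF model you already have; the model path and layer count are placeholders to tune to your card:

```python
# Sketch of partial CPU/GPU offload with llama-cpp-python: put some layers in
# VRAM, keep the rest in system RAM. Model path and layer count are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen2.5-14b-instruct-q4_k_m.gguf",  # any GGUF you have
    n_gpu_layers=20,  # number of layers that fit in your card's VRAM; -1 = all
    n_ctx=8192,       # context length; larger contexts raise memory use
)

out = llm.create_completion("Explain what a use-after-free bug is.", max_tokens=200)
print(out["choices"][0]["text"])
```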

[–] dieTasse@feddit.org 3 points 21 hours ago* (last edited 21 hours ago)

I bought a used RTX 2060 12GB VRAM edition for about 150 bucks and it runs pretty well. 4B models run well; I even ran 12B models, and while it's not the fastest experience, it's still decent enough. The sad truth is that Nvidia GPUs are miles better than any other cards for AI, even when running Linux.

[–] InnerScientist@lemmy.world 3 points 16 hours ago

I simply want to scan projects that I personally use so I'm aware of their current state and future changes before I blindly update the apps I host.

If you're just doing this for yourself, then you still need to know the programming languages involved, what kinds of vulnerabilities exist, how to validate them, and quite a bit about how the projects operate.

The AI will output a lot of false positives, and you will need to actually know whether any of the "vulnerabilities" are valid or just hallucinations. Do you really want that extra workload?

[–] MTK@lemmy.world 2 points 18 hours ago* (last edited 18 hours ago)

Buying new: basically all of the integrated-memory units like Macs and AMD's new AI chips; after that, any modern (last 5 years) GPU, focusing only on VRAM (currently Nvidia is more properly supported in SOME tools).

Buying second hand: you're not likely to find any of the integrated-memory stuff, so look at any GPU from the last decade that is still officially supported, again focusing on VRAM.

8GB is enough to run basic small models, 20GB+ for pretty capable 20-30B models, 50GB+ for the 70B ones, and 100-200GB+ for full-sized models.

These are rough estimates, do your own research as well.

For the most part, with LLMs for a single user you really only care about VRAM and storage speed (SSD). Any GPU will perform faster than you can read for anything that fully fits in its VRAM, so the GPU itself only matters if you intend to run large models at extreme speeds (for automation tasks, etc.).

Storage is a bottleneck at model load, so depending on your needs it might not be that big of an issue for you, but for example with a 30GB model you can expect to wait 2-10 minutes for it to load into VRAM from an HDD, about 1 minute with a SATA SSD, and about 4-30 seconds with an NVMe drive.
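
Those load-time figures are basically just model size divided by sequential read speed; a quick sanity check, with the throughput numbers being ballpark assumptions:

```python
# Rough load-time check: model size divided by sequential read speed.
# Throughput numbers are ballpark assumptions, not benchmarks.
model_gb = 30
for storage, gb_per_s in [("HDD", 0.15), ("SATA SSD", 0.5), ("NVMe", 3.5)]:
    print(f"{storage}: ~{model_gb / gb_per_s:.0f} s to load a {model_gb} GB model")
# HDD ~200 s, SATA ~60 s, NVMe ~9 s -- the same ballpark as the estimates above.
```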

[–] marud@piefed.marud.fr 1 points 1 day ago (1 children)

Don't forget that the cost of the "budget friendly" card does not include the "non budget friendly" power bill that goes with it.

[–] frongt@lemmy.zip 1 points 11 hours ago

Only if you're using it a lot. At idle or turned off it's negligible.

[–] nutbutter@discuss.tchncs.de 1 points 2 hours ago

RTX 3060 with 12GB VRAM: cheap when bought used and with more VRAM than other options in this price range, but it will be slow. Or, if you can find one, go for the Arc B50 with 16GB VRAM; it's very power efficient.