lemmyvore

joined 2 years ago
[–] lemmyvore@feddit.nl 1 points 10 minutes ago

It's a pain in the butt to swap CPUs one more time but that may pale in comparison to trying to convince the shop that a core is bad and having intermittent faults. 🤪

[–] lemmyvore@feddit.nl 1 points 29 minutes ago* (last edited 23 minutes ago) (1 children)

This sounds like my best shot, thank you.

I've installed the amd-ucode package. It already adds microcode to the HOOKS array in /etc/mkinitcpio.conf and runs mkinitcpio -P but I've moved microcode before autodetect so it bundles code for all CPUs not just for the current one (to have it ready when I swap) and re-ran mkinitcpio -P. Also had to re-run grub-mkconfig -o /boot/grub/grub.cfg.

I've seen the message "Early uncompressed CPIO image generation successful" pass by, and lsinitcpio --early /boot/initramfs-6.12-x86_64.img|grep micro shows kernel/x86/microcode/AuthenticAMD.bin, there's a /boot/amd-ucode.img, and an initrd parameter for it in grub.cfg. I've also confirmed that /usr/lib/firmware/amd-ucode/README lists an update for that new CPU (and for the current one, speaking of which).

Now from what I understand all I have to do is reboot and the early stage will apply the update?

Any idea what it looks like when it applies the microcode? Will it appear in dmesg after boot or is it something that happens too early in the boot process?
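
For what it's worth, the kernel normally logs lines tagged `microcode:` early in boot, so `dmesg | grep -i microcode` is the usual place to look, though the exact wording varies between kernel versions. On x86, `/proc/cpuinfo` also exposes a `microcode` field per logical CPU, so the revision can be compared before and after the reboot. A minimal sketch of that comparison, where the function names are made up for illustration:

```python
from __future__ import annotations

import re
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[str]:
    """Return the distinct values of the 'microcode' field found in the text."""
    pattern = re.compile(r"^microcode\s*:\s*(\S+)", flags=re.MULTILINE)
    return {match.group(1) for match in pattern.finditer(cpuinfo_text)}


def current_microcode_revisions(path: str = "/proc/cpuinfo") -> set[str]:
    """Read the given cpuinfo file (Linux, x86) and return its microcode revisions."""
    return microcode_revisions(Path(path).read_text())
```

Calling `current_microcode_revisions()` before and after the reboot and seeing a different value would suggest the early update was applied. Seeing more than one value in the set would mean the cores disagree, which is worth noting on its own.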

[–] lemmyvore@feddit.nl 1 points 3 hours ago

BIOS is up to date, CPU model explicitly listed as supported, memtest ran fine, not using XMP profiles.

[–] lemmyvore@feddit.nl 1 points 3 hours ago

All hardware is the same, I'm trying to upgrade from a Ryzen 3100 so everything should be compatible. Both old and new CPU have a 65W TDP.

I'm on Manjaro, everything is up to date, kernel is 6.12.17.

Memory runs at 2133 MHz, same as for the other CPU. I usually don't tweak BIOS much if at all from the default settings, just change the boot drive and stuff like "don't show full logo at startup".

I've added some voltage readings to the post and answered some of the other comments here.

[–] lemmyvore@feddit.nl 1 points 3 hours ago (3 children)

Everything is up to date as far as I can tell, I did Windows too.

memtest ran fine for a couple of hours, but the CPU stress test hung partway through, while CPU temp was around 75C.

[–] lemmyvore@feddit.nl 2 points 4 hours ago

RAM is indeed at 2133 MHz and the cooling is great, got a tower cooler (Scythe Kotetsu mark II), idle temps are in the low 30's C, stress temp was 76C.

[–] lemmyvore@feddit.nl 2 points 4 hours ago

Motherboard is a Gigabyte B450 Aorus M. It's fully updated and support for this particular CPU is explicitly listed in a past revision of the mobo firmware.

Manual doesn't list any specific CPU settings but their website says stepping A0, and that's what the defaults were setting. Also I got "core speed: 400 MHz", "multiplier: x 4.0 (14-36)".

> even some normal batch cpus might sometimes require a bit more (or less) juice or a system tweak

What does that involve? I wouldn't know where to begin changing voltages or other parameters. I suspect I shouldn't just faff about in the BIOS and hope for the best. :/

CPU errors? (feddit.nl)
submitted 11 hours ago* (last edited 4 hours ago) by lemmyvore@feddit.nl to c/linux@lemmy.ml
 

I'm trying a new CPU in my PC (Ryzen 5500GT) and I'm seeing:

  • Sporadic kernel panics during boot.
  • Random .ko.zst module files (different one each boot) complaining that ZST decompression failed checksum.
  • Random .so's failing to find a symbol and causing programs to crash/fail to start.
  • Started a stress-ng sequential session at 5s per stressor and it hung up after a dozen stressors. Couldn't ctrl-c it and also ps didn't work anymore. 😅
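
One way to tell whether the compressed modules are actually damaged on disk, or only fail when this CPU decompresses them, might be to run `zstd -t` (the integrity test mode of the `zstd` CLI) over all of them under each CPU and compare. A rough sketch, assuming the `zstd` binary is installed and the modules live under `/usr/lib/modules`; the function name and the `run` parameter are made up for illustration:

```python
import subprocess
from pathlib import Path


def failed_modules(root, run=subprocess.run):
    """Return the .ko.zst files under root that fail `zstd -t`, sorted by path."""
    bad = []
    for path in sorted(Path(root).rglob("*.ko.zst")):
        # -t only tests integrity (nothing is written), -q keeps the output quiet.
        result = run(["zstd", "-t", "-q", str(path)], capture_output=True)
        if result.returncode != 0:
            bad.append(path)
    return bad
```

Something like `failed_modules("/usr/lib/modules")` run a few times in a row would show whether the same files fail every time. A different set of failures on each run would point away from the files themselves.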

Funny thing is, other than that the system runs fine (when it boots, that is).

Switched back to my old CPU (that's the only change in the machine) and all of these things stopped.

So the new CPU is defective, correct? Just double-checking I'm not missing anything else.

I've reset BIOS between CPU swaps and left it at defaults. Could default settings cause a CPU to act like this?

Edit: cooling is good, all temps (chipset, CPU etc.) are in the 30's C in idle, CPU went up to 75C when stressed. Have a tower cooler (Scythe Kotetsu) with a 120mm fan.

I'm also adding some voltage readings I took from sensors while the problematic CPU was installed:

Vcore: 840mV
+3.3V: 3.31V
+12.0V: 12.10V
+5.0V: 5.01V
VSOC: 780mV
VDDP: 900mV
DRAM: 1.21V
3VSB: 3.29V
VBAT: 3.26V
[–] lemmyvore@feddit.nl 0 points 2 years ago (1 children)

I'm waiting for the day Google Recaptcha will ask me "is that traffic light red?" and after a couple of seconds "hurry up, I'm approaching the intersection!"

[–] lemmyvore@feddit.nl 0 points 2 years ago* (last edited 2 years ago) (3 children)

If by "easy" you mean someone else already spent 5 years and a nice chunk of cash training a model for it, which you get to use. And if you accept that it will not be accurate across all possible species and environments, only very specific subsets.