this post was submitted on 30 Dec 2025
60 points (95.5% liked)

Selfhosted


Hey everyone,

I recently built my first NAS. It was bought used with SAS hardware. I've finally got past all the roadblocks and problems that were in my way (I basically bricked a whole SAS drive; a hero of a Lemmy user helped me fix it).

Now, after filling the 15 TB RAIDZ2 pool with around 100 GB of data, one of the drives has started waving its white flag and wants to die on me.

I am a complete beginner with no experience with these things.

Is my drive dying and in need of replacement, or can it be fixed?

This is the output of the 507 errors that TrueNAS received from it, after which it labelled the vdev as degraded and the drive as faulted:

Output of zpool status and sudo smartctl -a /dev/sdd

As a beginner it looks like this drive is cooked, please let me know if it needs replacing so I can order a new one and replace it right away.

Thank you sooo much!

Edit: SAS not SATA drives

top 17 comments
[–] BarbecueCowboy@lemmy.dbzer0.com 24 points 17 hours ago (1 children)

An extra perspective: hard drive prices are mostly expected to get even worse next year.

It may hold on for months or even years, but all drives will likely die eventually and this one is giving you a warning. I'd replace it now and save some money if you can afford it.

[–] scrubbles@poptalk.scrubbles.tech 9 points 14 hours ago* (last edited 14 hours ago)

For everyone here, this is true, and I'd recommend buying a couple of drives for quick replacements. If you don't, you're gambling that 1) a replacement drive will be available and 2) that it will be affordable.

Keep a few spares lying around while we get through the ebbs and flows of the market. As K said, a person is smart, people are dumb and panicky. If SSD and storage prices rise, people panic and our HDD prices will rise too.

Prep now and thank yourself later.

[–] non_burglar@lemmy.world 18 points 18 hours ago

ZFS has very reasonable thresholds for when disk errors are enough to kick a disk from the pool. I've seen pool members have a batch of bad blocks and ZFS still chugged along for a few years, just avoiding those blocks, before the disk finally failed.

Heed TrueNAS here and replace the disk if you can.

[–] frongt@lemmy.zip 13 points 17 hours ago (2 children)

Not necessarily. I would shut the system down completely and check the drive connectors. If it's on a backplane, try swapping slots, or if it's on a breakout connector, swap it with another drive (and clear the zpool errors). If the errors start happening on the other drive, it's a cable problem. If they continue on the same drive, it's a drive problem. If they stop happening, it was a bad connection and it ought to be fine now.

That's kind of a short output from smartctl -a, though. Shouldn't it include the attribute data? I'd run a SMART test (after doing the swap above) and see what it says.

On a raidz2, I wouldn't be too concerned about losing a drive, but you should always be prepared to order a replacement if you value your data.
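For anyone who wants the swap test above written down as a checklist, here is a minimal Python sketch. The function name and its two boolean inputs are hypothetical, made up for illustration. The "inconclusive" branch is an assumption of mine, since the comment does not say what to conclude if errors show up in both places.

```python
def diagnose_after_swap(errors_follow_drive: bool, errors_stay_on_slot: bool) -> str:
    """Interpret the results of swapping the suspect drive with a known-good one.

    errors_follow_drive: after the swap, the suspect drive still logs errors
                         in its new slot or on its new cable.
    errors_stay_on_slot: after the swap, the other drive (now sitting in the
                         suspect drive's old slot or cable) starts logging errors.
    """
    if errors_follow_drive and errors_stay_on_slot:
        # Not covered by the comment above; assumption: needs more testing.
        return "inconclusive"
    if errors_stay_on_slot:
        return "cable or slot problem"
    if errors_follow_drive:
        return "drive problem"
    return "bad connection, likely fixed by reseating"


if __name__ == "__main__":
    # Example: errors keep appearing on the same physical drive after the swap.
    print(diagnose_after_swap(errors_follow_drive=True, errors_stay_on_slot=False))
```

Remember to clear the pool's error counters before watching for new errors, otherwise old counts will muddy the result.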

[–] s38b35M5@lemmy.world 4 points 16 hours ago (2 children)

I second this. SATA cables are cheaply made and can present issues that seem to indicate drive failure.

[–] BigDaddySlim@lemmy.world 2 points 15 hours ago

I had this issue once. Two drives kept failing to initialize during boot; rebooting a few times got them to register, but they showed drive errors. I thought either the drives or my SAS card was dying. Fully reseating the connectors fixed it and I haven't had an issue since.

[–] frongt@lemmy.zip 1 points 15 hours ago

OP is using SAS, but it's not too far from SATA.

[–] ragebutt@lemmy.dbzer0.com 4 points 14 hours ago

I would bet money that drive is done. A cable problem would show up as UDMA CRC errors, not media failure. The drive made it 11 years (even if power-on time is only about half of that).

[–] CubitOom@infosec.pub 3 points 18 hours ago* (last edited 18 hours ago) (2 children)

42441.67 power-on hours is really young for a drive to die; I normally don't start seeing issues till 50k.

Is it making any audible sounds while running?

I haven't read the SMART data for Seagate in a while, but the error numbers look off and I would like to see more details. Having errors in itself doesn't mean much. Normally I look for Reallocated Sector Count, Seek Error Rate, and Uncorrectable Sector Count. But here it's not giving details like the types of errors.

Maybe try sudo smartctl -all /dev/sdd? I'm not sure the -a you used is being interpreted correctly.
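To put the power-on hours in perspective, here is a quick Python conversion to years. It assumes continuous runtime and 365.25-day years, which is my simplification and not something the SMART output states.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: average year length, leap years included


def power_on_years(hours: float) -> float:
    """Convert SMART power-on hours into years of continuous runtime."""
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for label, hours in [("this drive", 42441.67), ("50k rule of thumb", 50_000)]:
        print(f"{label}: {power_on_years(hours):.2f} years")
```

By this estimate the drive has spun for a bit under five years in total, compared with roughly 5.7 years for the 50k mark.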

[–] suzune@ani.social 1 points 14 hours ago

I'm also missing the SMART tabular output here, because it's easier to read and lets you inspect the source of the errors. Maybe it's because it's SAS?

[–] ryannathans@aussie.zone 0 points 15 hours ago* (last edited 15 hours ago) (2 children)

sudo? -all?

I'm going to assume TrueNAS is entirely Linux-based now, because the last time I worked on it, it was FreeBSD and the command was smartctl -x /dev/ada3 as root.

[–] eli@lemmy.world 2 points 4 hours ago* (last edited 4 hours ago)

Kind of, TrueNAS has "CORE" which is FreeBSD and "SCALE" which is Linux.

If you're on CORE 13.X you can actually side-grade over to SCALE.

CORE is in maintenance only and SCALE is the path forward. So you can still get some updates on CORE I think, but everyone should be switching over to SCALE or using SCALE from here on out.

[–] suzune@ani.social 0 points 14 hours ago (1 children)

On FreeBSD it's also -a or -A for shorter output.

[–] ryannathans@aussie.zone 0 points 11 hours ago

-a is limited for legacy compatibility, -x is the new all
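If you want to script the two variants discussed here, a small Python sketch could look like the following. The helper names are hypothetical. The only smartctl options used are `-a` and `-x`, both mentioned in this thread, and the device paths are the ones quoted above (`/dev/sdd` on Linux, `/dev/ada3` on FreeBSD). It has to run as root, and it assumes smartctl is on the PATH.

```python
import subprocess


def smartctl_command(device: str, extended: bool = True) -> list[str]:
    """Build the smartctl argument list: -x for extended output, -a otherwise."""
    flag = "-x" if extended else "-a"
    return ["smartctl", flag, device]


def run_smartctl(device: str, extended: bool = True) -> str:
    """Run smartctl and return its text output.

    check=False because smartctl uses its exit status as a bitmask and often
    exits non-zero on a drive that has logged errors.
    """
    result = subprocess.run(
        smartctl_command(device, extended),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Assumption: /dev/sdd is the suspect drive, as in the original post.
    print(run_smartctl("/dev/sdd"))
```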

[–] Lemmchen@feddit.org 3 points 8 hours ago

Does this drive not have SMART values? Could be a loose cable for all we know (UDMA CRC errors).

Just RMA it now. If it has SMART failures, you can provide the codes and they'll replace it no problem.

[–] possiblylinux127@lemmy.zip 1 points 16 hours ago

Buy two replacements