4 drive zpool, two NVMe, two SSD

I enjoy blogging. I really do. It is quite satisfying. I refer to my blogs for my own use. That is what they started off as, way back when I was trying to get my host running with DHCP on ADSL. Notes to help myself, mostly when seeking help and explaining to others what I did.
It took me years and re-learning many, many things I'd forgotten to learn this lesson. My notes are mostly on a private wiki, though. Maybe I should publish them.
 
It took me years and re-learning many, many things I'd forgotten to learn this lesson. My notes are mostly on a private wiki, though. Maybe I should publish them.
Yes. Yes, you should publish it. What's the downside?
 
There is a SuperMicro solution I found interesting.
U.2 connectors, but it does not require bifurcation. It has a switch onboard.
SLG3-2E4
This worked faster than bifurcation with the same drives/cables. Not by a huge amount, but benchmarked and repeated.
I never tested beyond SuperMicro boards. Does work beyond SuperMicro supported list.
Mine tested good on X9/X10 boards.
 
I agree with this. I think a lot of engineers tend to think on paper or whiteboards. I've had jobs where I fill 2 whiteboards and then people get upset when they get erased.
I wish I could be so lucky. Given the number of engineering decisions I have to make on the fly, it is not comforting, especially when you are talking about $250K repair procedures approved verbally. Nothing in writing.
Maybe my hand-drawn sketches, which may or may not be converted to an electronic version for a condition report. Many clients are so desperate to get the boat back in the water that they forgo the official docs we used to do for them.
Don't ask me how that goes....
 
Why does this count matter?
It's not so important, but it can indicate a possible power issue. In normal operation it will rise only if you reset the computer without a proper shutdown. The write cache on those disks is not battery-backed, so its contents will be lost on a sudden power loss; if the cache is used for write operations, any data not yet flushed to the NAND is gone. That's why it is important not to enable the volatile write cache if the system is not protected by a UPS, and why you should never pull such a disk from the hot-swap bay before shutting it down first.
 
Code:
[20:52 r730-01 dvl ~] % sudo smartctl -a /dev/da13

Device Model:     Samsung SSD 860 EVO 1TB

Sector Size:      512 bytes logical/physical 

241 Total_LBAs_Written      0x0032   097   097   000    Old_age   Always       -       7136126656590

Let's do the math please. How much data has this SSD written: 7136126656590 * 512 bytes = 3,653,696,848,174,080 bytes or about 3,653.7 TB

According to https://www.samsung.com/us/computin.../ssd-860-evo-2-5--sata-iii-1tb-mz-76e1t0b-am/ the endurance rating is 600 TBW (someone else please verify).

Have I done this math incorrectly?
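For what it's worth, the math looks right to me. A quick sketch of the calculation, using the values copied from the smartctl output above:

```python
# Sanity check: total bytes written is the raw value of SMART
# attribute 241 (Total_LBAs_Written) multiplied by the logical
# sector size that smartctl reports (512 bytes here).
total_lbas_written = 7136126656590  # raw value of attribute 241, from above
sector_size = 512                   # bytes, from "Sector Size" above

total_bytes = total_lbas_written * sector_size
print(f"{total_bytes:,} bytes")         # 3,653,696,848,174,080 bytes
print(f"~{total_bytes / 1e12:,.1f} TB") # ~3,653.7 TB
```

So if the raw value really is a decimal LBA count, roughly 3,653.7 TB has been written.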
 
3,653.7 TB
That sounds like a very large number. Was this drive hosting FreshPorts?

My way of 'checking the math' here would be to slip it into a drive bay of a Windows box and use Samsung Magician.
If the drive has that much wear, it should show up there. Vendor tools.
Sometimes smartctl reports raw values in hex, so you should be wary.
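A small illustration of that caveat: the same raw digit string comes out wildly different depending on the base it's interpreted in. (This is purely illustrative; as far as I know, Samsung reports attribute 241 in decimal.)

```python
# The hex caveat: a digit string read in the wrong base gives a
# very different number, so the base matters before doing TBW math.
raw = "7136126656590"  # raw value string from the smartctl output above

as_decimal = int(raw, 10)
as_hex = int(raw, 16)  # only relevant if the vendor reports raw values in hex

print(f"decimal: {as_decimal:,}")  # 7,136,126,656,590
print(f"hex:     {as_hex:,}")      # a much larger, implausible figure
```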
 
That sounds like a very large number. Was this drive hosting FreshPorts?


Yes, but not exclusively and not directly.

Based on https://dan.langille.org/2022/12/31/knew-8/ it was used for the tank_fast01 zpool.

The following is taken from the above URL and modified to keep it shorter.

That tank_fast01/dbclone filesystem was used for daily testing. It loaded up every database I had. Every day. tank_fast01/dbclone.backups.rsyncer shows it was about 231GB of backups. It was loaded into tank_fast01/dbclone.postgres

The FreshPorts dev and prod databases are included in that. So was my Bacula database which kept track of all the backups, including FreshPorts.

Good news: none of that data on that filesystem was primary. It all originated somewhere else. From here, it was backed up daily to the other server.

Why do this all on SSD and not HDD? Just because it's faster. However, that dbclone process is not time-sensitive. It can run anytime. The current process runs on HDD.

Based on the power-on-hours, I bought these drives about 4.5 years ago. They've led a good life. If I'm going to keep using them, it would need to be for non-valuable data/purposes, IF INDEED they are past their TBW levels.

Code:
[knew dan ~] % zpool list
NAME          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
tank_fast01   928G   251G   677G        -         -     5%    27%  1.00x    ONLINE  -

[knew dan ~] % zfs list
NAME                                               USED  AVAIL     REFER  MOUNTPOINT
tank_fast01                                        251G   648G       23K  none
tank_fast01/dbclone                               4.94G   648G     4.94G  /usr/jails/dbclone
tank_fast01/dbclone.backups.rsyncer                231G   648G      231G  /jails/dbclone/usr/home/rsyncer/backups
tank_fast01/dbclone.postgres                       254M   648G      254M  /jails/dbclone/var/db/postgres
tank_fast01/empty                                 1.71G   648G       24K  none
tank_fast01/empty/ports                           1.71G   648G     1.71G  /jails/empty/usr/ports
tank_fast01/vm                                    13.3G   648G      346M  /usr/local/vm
tank_fast01/vm/mkjail                             12.9G   648G     12.9G  /usr/local/vm/mkjail
 
Yes, you need to unmount it first, and if it's part of a RAID volume, eject it from there first.

TBW = 1024 GiB * 2074 / 0.624 = 3,403,487 GiB
Is that 2074 is from:


Code:
177 Wear_Leveling_Count     0x0013   001   001   000    Pre-fail  Always       -       2074

Where is 0.624 from?

You're saying this SSD has had 3.4 PB written to it?
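Taking the quoted formula at face value (I don't know where 0.624 comes from either; presumably some write-amplification or efficiency factor), the arithmetic itself checks out:

```python
# Re-running the quoted estimate as stated. The 0.624 divisor is taken
# from the earlier post; its origin is exactly what's being asked here.
capacity_gib = 1024          # 1 TB drive, in GiB
wear_leveling_count = 2074   # raw value of SMART attribute 177, from above

written_gib = capacity_gib * wear_leveling_count / 0.624
print(f"{written_gib:,.0f} GiB")                      # 3,403,487 GiB
print(f"~{written_gib * 1024**3 / 1e12:,.1f} TB")     # ~3,654.5 TB
```

Interestingly, converted to decimal TB that estimate lands very close to the ~3,653.7 TB computed from Total_LBAs_Written earlier in the thread.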
 
If the smartctl format is correct on this, then yes, this SSD has written ~3.4 PB, while it is only rated for 600 TB. Some vendors use different formats/meanings for the SMART values, so it's possible this is not true.
 