#1
Hi all,
Hopefully you can help. I have four Samsung 2TB drives in a RAIDZ array. They are given to zfs as complete disks, meaning their stripe offset is 0. Code:
diskinfo -v /dev/ada3
/dev/ada3
512 # sectorsize
2000398934016 # mediasize in bytes (1.8T)
3907029168 # mediasize in sectors
4096 # stripesize
0 # stripeoffset
3876021 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
S2H7J1CB702251 # Disk ident.
Straight dd from the raw disks: Code:
dd if=/dev/ada1 of=/dev/null bs=1m count=10000 &
dd if=/dev/ada2 of=/dev/null bs=1m count=10000 &
dd if=/dev/ada4 of=/dev/null bs=1m count=10000 &
dd if=/dev/ada6 of=/dev/null bs=1m count=10000 &
gives:
10000+0 records out
10485760000 bytes transferred in 72.958848 secs (143721568 bytes/sec)
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 75.176198 secs (139482446 bytes/sec)
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 75.296263 secs (139260032 bytes/sec)
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 75.555696 secs (138781860 bytes/sec)
dd through the filesystem: Code:
$ dd if=/storage/temp/test.file of=/dev/null bs=1m count=10000
10000+0 records out
10485760000 bytes transferred in 57.488728 secs (182396799 bytes/sec)
Straight dd to one of the disks (same disk type, but uninited): Code:
$ dd if=/dev/zero of=/dev/ada3 bs=1m count=10000
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 77.051038 secs (136088498 bytes/sec)
dd to the pool: Code:
$ dd if=/dev/zero of=/storage/temp/test.file bs=1m count=10000
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 191.702491 secs (54698089 bytes/sec)
CPU barely registers - the system thinks it is 95+% idle (CPU is an AMD 630 quad core). Any ideas?
Last edited by DutchDaemon; May 1st, 2012 at 01:38.
#2
Since you get poor performance when using dd, you most probably have Advanced Format drives. The following FAQ should be helpful.
https://forums.freebsd.org/showthread.php?t=21644
#3
Thanks. Is there any way to get the drives to appear as 4096 byte block devices without destroying the ZFS pool? I think I have enough spare space to move stuff around but....
#4
No, you cannot. Maybe you could first check that these drives really are 4k drives emulating 512-byte sectors before taking the plunge.
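One way to make that check is sketched below in plain sh: feed the output of diskinfo -v (as shown in post #1) to a small filter that compares the sectorsize and stripesize fields. The function name is_512e is made up for this illustration, and treating stripesize as the physical sector size is an assumption; it only holds when the drive reports its geometry honestly, which not every drive does.

```shell
#!/bin/sh
# Hypothetical helper: reads `diskinfo -v` output on stdin and reports
# whether the disk looks like a 4K drive emulating 512-byte sectors.
# Assumption: the "stripesize" field reflects the physical sector size.
is_512e() {
    awk '
        $2 == "#" && $3 == "sectorsize" { logical = $1 }
        $2 == "#" && $3 == "stripesize" { physical = $1 }
        END {
            if (logical == 512 && physical == 4096)
                print "512e (4K physical, 512 logical)"
            else
                print "not 512e (logical=" logical ", physical=" physical ")"
        }
    '
}

# Example using the values from post #1 instead of a live disk:
printf '%s\n' \
    '512 # sectorsize' \
    '4096 # stripesize' \
    '0 # stripeoffset' | is_512e
```

On a live system the equivalent would be `diskinfo -v /dev/ada3 | is_512e`. If the drive reports a stripesize of 0, this check cannot tell you anything and the drive's data sheet is the better source.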
#5
Yes, they are, so I took the plunge.
bonnie++ benchmarks from before the rework (I have 10G of memory in the server): Code:
# bonnie++ -d /storage/dir -u 0:0 -s 20g
...
Version 1.96 -------Sequential Output------- --Sequential Input-- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
MAINSERVER 20G 140 99 186515 50 109949 29 369 98 295539 34 122.0 6
Latency 150ms 2414ms 1607ms 71019us 728ms 1267ms
Version 1.96 ------Sequential Create------ --------Random Create--------
MAINSERVER -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 25232 91 +++++ +++ 21512 97 19158 71 +++++ +++ 24000 85
Latency 9114us 194us 251us 22049us 107us 489us
And after the rework: Code:
# bonnie++ -d /storage/dir -u 0:0 -s 20g
...
Version 1.96 -------Sequential Output------- --Sequential Input-- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
MAINSERVER 20G 151 99 276372 74 151062 39 376 99 383933 45 122.4 6
Latency 88207us 746ms 1049ms 37431us 210ms 1147ms
Version 1.96 ------Sequential Create------ --------Random Create--------
MAINSERVER -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 18532 70 +++++ +++ 24457 86 26525 96 +++++ +++ 28279 97
Latency 6297us 122us 228us 18907us 82us 212us
Code:
               shift=9  shift=12  Gain
Block write    186M     276M      48% faster
Block rewrite  110M     151M      37% faster
Block read     295M     384M      30% faster
I'm just restoring the data from the single disk, and whilst I'm not able to read the data at 100% speed (I think rsync is causing a bottleneck there), the write speeds as stated by gstat seem to be much better. Before, I was only ever getting 40-50 Mbytes/sec per disk writing with the interface maxed out; now I'm getting 100+ Mbytes/sec and the interface isn't maxed out. Hopefully, real world performance will improve now too (most of my writing is files to/from SMB shares).
Last edited by DutchDaemon; May 2nd, 2012 at 16:53.
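The gain figures follow directly from the Block K/sec columns of the two bonnie++ runs. A minimal sketch of that arithmetic, using integer math only and rounding to the nearest whole percent; the function name gain_pct is made up for this illustration.

```shell
#!/bin/sh
# Percentage gain of new over old, rounded to the nearest whole percent.
# Integer arithmetic only, so it runs in any POSIX shell.
gain_pct() {
    old=$1
    new=$2
    echo $(( ( ($new - $old) * 100 + $old / 2 ) / $old ))
}

# K/sec figures taken from the two bonnie++ runs (shift=9, then shift=12).
echo "Block write:   $(gain_pct 186515 276372)%"
echo "Block rewrite: $(gain_pct 109949 151062)%"
echo "Block read:    $(gain_pct 295539 383933)%"
```

This prints 48%, 37% and 30%, matching the figures quoted in the post.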
#6
For completeness, these are the commands I used to rework the array:
# gpart create -s gpt ada1
# gpart create -s gpt ada2
# gpart create -s gpt ada4
# gpart create -s gpt ada6
# gpart add -t freebsd-zfs -l disk1 -b 2048 -a 4k ada1
# gpart add -t freebsd-zfs -l disk2 -b 2048 -a 4k ada2
# gpart add -t freebsd-zfs -l disk3 -b 2048 -a 4k ada4
# gpart add -t freebsd-zfs -l disk4 -b 2048 -a 4k ada6
# gnop create -S 4096 /dev/gpt/disk1
# gnop create -S 4096 /dev/gpt/disk2
# gnop create -S 4096 /dev/gpt/disk3
# gnop create -S 4096 /dev/gpt/disk4
# zpool create storage raidz /dev/gpt/disk1.nop /dev/gpt/disk2.nop /dev/gpt/disk3.nop /dev/gpt/disk4.nop
# zpool export storage
# gnop destroy /dev/gpt/disk1.nop
# gnop destroy /dev/gpt/disk2.nop
# gnop destroy /dev/gpt/disk3.nop
# gnop destroy /dev/gpt/disk4.nop
# zpool import storage
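The same sequence can be written as a loop. The sketch below only prints the commands (a dry run) rather than executing them, since running them destroys whatever is on the disks. The pool name, disk list, labels and flags are the ones used in this post; the function name print_rework_cmds is made up here, and the per-disk ordering differs from the post, which ran each step for all four disks before moving on.

```shell
#!/bin/sh
# Dry run only: prints the rework commands from this post, one per line.
# Nothing here touches a disk.
print_rework_cmds() {
    pool=$1
    shift
    n=0
    vdevs=""
    for disk in "$@"; do
        n=$((n + 1))
        echo "gpart create -s gpt $disk"
        echo "gpart add -t freebsd-zfs -l disk$n -b 2048 -a 4k $disk"
        echo "gnop create -S 4096 /dev/gpt/disk$n"
        vdevs="$vdevs /dev/gpt/disk$n.nop"
    done
    echo "zpool create $pool raidz$vdevs"
    echo "zpool export $pool"
    # Remove the temporary 4096-byte gnop providers again.
    i=1
    while [ "$i" -le "$n" ]; do
        echo "gnop destroy /dev/gpt/disk$i.nop"
        i=$((i + 1))
    done
    echo "zpool import $pool"
}

print_rework_cmds storage ada1 ada2 ada4 ada6
```

Reviewing the printed list before running anything by hand is the point of the dry run; the labels disk1 to disk4 are assigned in the order the disks are passed.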
#7
A nice and complete post containing a clean presentation of a problem, analysis and solution. I would encourage you to write up a short memo of what has been done, since many people will be affected by this or related issues. Keep up the good work.
#8
Post #6 says it all really - from bare disks to an array with 4096 aligned blocks. AFAICT, the commands do:
Running: Code:
# zdb storage | grep ashift
ashift: 12
This can be shown to be true by running gpart show on any drive: Code:
# gpart show ada1
=> 34 3907029101 ada1 GPT (1.8T)
34 2014 - free - (1M)
2048 3907027080 1 freebsd-zfs (1.8T)
3907029128 7 - free - (3.5k)
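The arithmetic behind both checks is small enough to sketch. ashift is the base-2 logarithm of the sector size ZFS uses, so 12 means 4096 bytes, and a partition is 4K-aligned when its starting byte offset is a multiple of 4096. The function names are made up for this illustration, and the 512-byte sector size is taken from the diskinfo output in post #1.

```shell
#!/bin/sh
# Sector size in bytes implied by a ZFS ashift value (2^ashift).
ashift_bytes() {
    echo $(( 1 << $1 ))
}

# Prints "aligned" if a partition starting at the given sector
# (counted in sectors of sector_size bytes) begins on a 4096-byte boundary.
check_4k_start() {
    start_sector=$1
    sector_size=$2
    if [ $(( $start_sector * $sector_size % 4096 )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

ashift_bytes 12          # ashift reported by zdb
check_4k_start 2048 512  # start of the freebsd-zfs partition
check_4k_start 34 512    # first usable GPT sector, for comparison
```

Starting at sector 2048 puts the partition at exactly 1 MiB, which is why the -b 2048 choice works; a partition starting at sector 34 would not land on a 4096-byte boundary.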
#9
PS. The dd command from /dev/random showed no improvement in performance but more realistic benchmarks (bonnie++) seemed to show significant improvements.
#10
@arad85
First of all, congratulations! It feels good conquering technology. One quick note about gnop though: it's only necessary on the first drive in every vdev. So in your case, with your raidz, you would only need disk1.nop. But. Quote:
This is your alignment, which is done perfectly: Quote:
4x2TB disk partition help
The procedure begins at post #12 and is for a bootable striped mirror pool, which you may change to a different pool layout to better suit your needs. Omit the first partition and bootcoding, plus zpool set bootfs, if you don't want to boot from it.
/Sebulon
#11
Quote:
Quote:
Code:
# zpool export mypool
# zpool import -d /dev/gpt mypool /* if you are using gpt labels */
or
# zpool import -d /dev/label mypool /* if you are using plain labels */
Last edited by DutchDaemon; May 2nd, 2012 at 16:54.
#12
Quote:
Quote:
Is there any way to get the system to see them as labels without having to rebuild the array (nearly finished transferring back the files, but if I must, I will...)? I'm guessing this may be important if I ever moved the disk array to a different controller (which I did a couple of months ago) as it doesn't then matter how they are physically connected up.
#13
For the pool to search the /dev/gpt directory for labels instead of using the device nodes directly:
Code:
# zpool export poolname
# zpool import -d /dev/gpt poolname
Last edited by phoenix; May 2nd, 2012 at 21:02. Reason: impot --> import
#14
Quote:
Got to love ZFS.....
#15
Ahh.. I was nearly finished. Yes, that works - thanks all:
Code:
# zpool export storage
# zpool import -d /dev/gpt storage
# zpool status storage
pool: storage
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
gpt/disk1 ONLINE 0 0 0
gpt/disk2 ONLINE 0 0 0
gpt/disk3 ONLINE 0 0 0
gpt/disk4 ONLINE 0 0 0
errors: No known data errors
#16
@phoenix
impot? Talk about a Freudian slip.
/Sebulon
#17
Doesn't everyone pay a ZFS tax?
Spelling fixed in original post.
#18
Just as an update to this, I tried running 3x reads from local disks and write to the array simultaneously (all 7 disks are across 2x Adaptec 1430SA controllers) and got this:
Code:
dd if=/dev/ada3 of=/storage/testing bs=1024000 count=10000 &
dd if=/dev/ada0 of=/storage/testing1 bs=1024000 count=10000 &
dd if=/dev/ada5 of=/storage/testing2 bs=1024000 count=10000 &
[1] 15331
[2] 15332
[3] 15333
#
10000+0 records in
10000+0 records out
10240000000 bytes transferred in 112.665802 secs (90888271 bytes/sec)
10000+0 records in
10000+0 records out
10240000000 bytes transferred in 120.971984 secs (84647698 bytes/sec)
10000+0 records in
10000+0 records out
10240000000 bytes transferred in 124.426937 secs (82297292 bytes/sec)
Last edited by DutchDaemon; May 3rd, 2012 at 01:44.
#19
A further update. I scrub this array on a weekly basis and have a cron job mail me 7 hours after it starts (at 11am). This is last week's mail:
Code:
pool: storage
state: ONLINE
scan: scrub in progress since Sun Apr 29 04:00:02 2012
2.74T scanned out of 3.27T at 114M/s, 1h21m to go
0 repaired, 83.73% done
And this week's: Code:
pool: storage
state: ONLINE
scan: scrub repaired 0 in 2h35m with 0 errors on Sun May 6 06:35:58 2012
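To put the two scrub mails side by side, here is a rough sketch of the comparison. It assumes last week's scrub would have finished at the estimate shown (7 hours elapsed plus 1h21m to go), and it ignores any difference in the amount of data on the pool after the restore, so treat the ratio as indicative only. The function name to_minutes is made up for this illustration.

```shell
#!/bin/sh
# Convert hours and minutes to minutes.
to_minutes() {
    echo $(( $1 * 60 + $2 ))
}

# Last week: mailed 7h into the scrub, with 1h21m estimated to go.
before=$(( $(to_minutes 7 0) + $(to_minutes 1 21) ))
# This week: finished in 2h35m.
after=$(to_minutes 2 35)

# Ratio scaled by 100 to keep two decimals with integer arithmetic.
ratio=$(( $before * 100 / $after ))
printf 'before: ~%d min, after: %d min, about %d.%02dx faster\n' \
    "$before" "$after" $(( $ratio / 100 )) $(( $ratio % 100 ))
```

That works out to roughly 501 minutes against 155, a little over three times quicker.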