High latency on disk 0

Hi,

One of my servers is experiencing extremely high latency on ada0. The problem occurs very often (almost every minute).

Code:
dT: 1.001s  w: 1.000s  filter: ada[0-9]$
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   10     23      1      4   1543     22    591  571.2  104.4| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0    0.0| ada2
    0      3      3     12    0.1      0      0    0.0    0.0| ada3
    0    125    125   1026    0.1      0      0    0.0    1.7| ada4

It is running FreeBSD 9.0 + ZFS v28.

I replaced the disk, but the problem was still there after it resilvered. I also moved the data to another identical machine, and that one is running fine.

diskinfo -cvt shows consistent performance on all four disks. The results are very similar, so I assume that the problem is not due to the SATA cable or interface.

Is there something wrong with ZFS or FreeBSD? Or is it possibly the SATA cable after all?

Below are my server specs and zpool status:
Code:
  pool: vol
 state: ONLINE
 scan: scrub repaired 0 in 7h34m with 0 errors on Wed Jun 13 20:00:30 2012
config:

        NAME        STATE     READ WRITE CKSUM
        vol         ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0
            ada3p3  ONLINE       0     0     0
        cache
          ada4      ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
 scan: scrub repaired 0 in 0h3m with 0 errors on Wed Jun 13 12:26:49 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0


# zdb | grep ashift
            ashift: 12
            ashift: 12
            ashift: 12
            ashift: 12

- Intel Xeon 5335 Quad Core
- 8GB RAM
- 4 X 1TB Seagate HDD
 
SirDice said:
What's on ada0p1?

My money is on the bootsector.

@belon_cfy

Maybe try watching top and zpool iostat -v at the same time to see what process is generating the IO?

/Sebulon
 
Sebulon said:
My money is on the bootsector.

@belon_cfy

Maybe try watching top and zpool iostat -v at the same time to see what process is generating the IO?

/Sebulon

Hi Sebulon,

It is only serving NFS. No suspicious processes in top either.

The latency increases at random and causes the server to freeze for a few seconds. Have you encountered similar cases before?
 
Try swapping cables/SATA ports with another drive, say ada2. Then monitor gstat(8) output to see if the latency stays with the drive (dead/dying drive?) or with the port (dead/dying port/controller?).
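
To make the before/after comparison less of an eyeball job, here is a minimal Python sketch that parses gstat text output and lists the devices whose read or write latency is above a threshold. The function names and the 100 ms threshold are my own assumptions, and the column layout is assumed to match the output posted at the top of the thread. Run it before and after the swap and see whether the slow name follows the drive or stays on the port.

```python
SAMPLE = """\
dT: 1.001s  w: 1.000s  filter: ada[0-9]$
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   10     23      1      4   1543     22    591  571.2  104.4| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0    0.0| ada2
    0      3      3     12    0.1      0      0    0.0    0.0| ada3
    0    125    125   1026    0.1      0      0    0.0    1.7| ada4
"""


def parse_gstat(text):
    """Parse gstat text output into a list of per-device dicts."""
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips the dT/filter line and the column header
        stats, name = line.rsplit("|", 1)
        fields = stats.split()
        if len(fields) != 9:
            continue  # not the nine-column layout assumed here
        rows.append({
            "name": name.strip(),
            "queue": int(fields[0]),
            "ms_r": float(fields[4]),   # ms per read
            "ms_w": float(fields[7]),   # ms per write
            "busy": float(fields[8]),   # %busy
        })
    return rows


def slow_devices(rows, max_ms=100.0):
    """Names of devices whose read or write latency exceeds max_ms (assumed threshold)."""
    return [r["name"] for r in rows if max(r["ms_r"], r["ms_w"]) > max_ms]


if __name__ == "__main__":
    print(slow_devices(parse_gstat(SAMPLE)))
```

On a live system the same functions could be fed from gstat's batch mode output instead of the pasted sample, assuming the batch output uses the same columns.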
 
belon_cfy said:
Hi Sebulon,

It is only serving NFS. No suspicious processes in top either.

The latency increases at random and causes the server to freeze for a few seconds. Have you encountered similar cases before?

How about:
# top -aS
Can you see anything more interesting?


What was it about NFS and 9.0... Memory, bug, something, dark side...

Maybe you are IO-bound. SATA II or SAS I is good for about 250-300 MB/s, which about two modern hard drives can saturate when ZFS flushes IO. Getting warm?
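
A rough way to check the IO-bound idea is to compare the kBps columns from the gstat sample in the first post against the link figure above. This is only a back-of-the-envelope sketch: it takes the lower 250 MB/s figure, treats 1 MB as 1000 kB, and the function name is made up for illustration. It also only covers the one-second sample that was posted, not a ZFS flush burst.

```python
LINK_MBPS = 250.0  # lower end of the 250-300 MB/s estimate (assumption)


def link_utilisation(read_kbps, write_kbps, link_mbps=LINK_MBPS):
    """Fraction of the link used by the given read + write throughput in kB/s."""
    return (read_kbps + write_kbps) / 1000.0 / link_mbps


if __name__ == "__main__":
    # Read and write kBps for ada0 and ada4, copied from the gstat output in the first post.
    for name, r_kbps, w_kbps in [("ada0", 4, 591), ("ada4", 1026, 0)]:
        print(f"{name}: {link_utilisation(r_kbps, w_kbps):.2%} of the link")
```

If that sample is typical, throughput on ada0 is well under one percent of the link, so the latency would have to come from something other than raw bandwidth.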

/Sebulon
 
t1066 said:
How full is the pool? Could this be a problem of fragmentation?

Only 34% in use.

Code:
zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
vol    1.77T   629G  1.15T    34%  1.00x  ONLINE  -
zroot  39.8G  2.62G  37.1G     6%  1.00x  ONLINE  -

Possibly caused by AHCI?
 