I've just deployed two FreeBSD 10.1-RELEASE servers (fully updated to -p5 as of this post), each using two Crucial M500 960GB SSDs (running the latest firmware, MU05) in a GMIRROR. Partitions are aligned to 1 MiB boundaries, and I'm using UFS with TRIM enabled.
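As a sanity check on the alignment claim: given a partition start sector from gpart show, 1 MiB alignment is just arithmetic. A quick sketch (the 2048 start sector and 512-byte sector size below are hypothetical example values, not output from these boxes):

```shell
#!/bin/sh
# Hypothetical example values - substitute what `gpart show ada0` reports.
start_sector=2048   # partition start, in sectors
sector_size=512     # bytes per sector

offset=$((start_sector * sector_size))
if [ $((offset % 1048576)) -eq 0 ]; then
    echo "start offset ${offset} is 1 MiB aligned"
else
    echo "start offset ${offset} is NOT 1 MiB aligned"
fi
```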
When I run bonnie++ on the volume, the write/read tests start out fast but degrade. By the time it hits the file creation phase I can see under gstat that I/O "busy" is maxed out, yet there's very little throughput. I would expect this to some degree, since the file creation phase is very random-read/write heavy, but this "busyness" can stretch a test that should take 5-10 minutes into hours. What's even more interesting is that the "busy" state can last past the bonnie++ run, sometimes for hours.

What's even odder: originally, before GMIRRORing these disks, there was an Adaptec 6504 RAID card, and with the SSDs in a hardware RAID 1 on it I saw the same issue. There are two servers configured identically and both show the problem.
Here is a typical example of gstat during this period:
		Code:
	
	dT: 1.020s  w: 1.000s  filter: ada[0-9]$
 L(q)  ops/s  r/s  kBps  ms/r  w/s  kBps  ms/w  %busy Name
  29  56  0  0  0.0  0  0  0.0  101.0| ada0
  25  53  0  0  0.0  0  0  0.0  99.2| ada1

Additionally, while this is happening the system can freeze when any other I/O is required, even causing the occasional console message like this:
		Code:
	
	swap_pager: indefinite wait buffer: bufobj: 0, blkno: 120335, size 28672

Load also climbs to 4.0-5.0 during this time (as I'd expect if I/O were backed up).
We've used bonnie++ for years, and it's standard for us to test the hardware on new deployments before going into production; I've never seen something like this.

Here is the bonnie++ output. For an SSD, write performance is very low (meaning this issue is very write-specific in my view) and read performance is on par with what I'd expect (roughly 2x a single SSD - yay GMIRROR!).

Read: ~960 MB/sec
Write: ~172 MB/sec
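For reference, those summary numbers are just the bonnie++ sequential block figures from the output below (reported in K/sec) knocked down by a factor of ~1000:

```shell
#!/bin/sh
# bonnie++ 1.97 reports sequential block throughput in K/sec.
seq_output_kps=172216   # "Sequential Output, Block" = the write path
seq_input_kps=960844    # "Sequential Input, Block"  = the read path

echo "write: ~$((seq_output_kps / 1000)) MB/sec"
echo "read:  ~$((seq_input_kps / 1000)) MB/sec"
```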
		Code:
	
	Version  1.97  ------Sequential Output------ --Sequential Input- --Random-
Concurrency  1  -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine  Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
db1.michigan.ls 64G  796  99 172216  10 99444  7  1313  99 960844  39  3428  27
Latency  31966us  1347ms  15310ms  18280us  9752us  1991ms
Version  1.97  ------Sequential Create------ --------Random Create--------
db1.michigan.ls.pri -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
  16  1  0  115  0 +++++ +++  2  0 +++++ +++ +++++ +++
Latency  9118s  141s  36us  5462s  9us  19us
1.97,1.97,db1.michigan.ls.privsub.net,1,1423046069,64G,,796,99,172216,10,99444,7,1313,99,960844,39,3428,27,16,,,,,1,0,115,0,+++++,+++,2,0,+++++,+++,+++++,+++,31966us,1347ms,15310ms,18280us,9752us,1991ms,9118s,141s,36us,5462s,9us,19us

System specs:
FreeBSD 10.1-RELEASE-p5 64-bit; Kernel GENERIC
Intel® Xeon® CPU E5-1650 v2 @ 3.50GHz; Hyper-Threaded; 64-bit; 6x Physical Cores
32.0 GiB RAM: 2.7 GiB used / 327.0 MiB cache / 29.0 GiB free
Intel Patsburg AHCI SATA controller: Channel 0 (ahci0:ahcich0)
Things I have tried to resolve the issue:
- Disabling power management entirely in BIOS
- Disabling powerd
- Disabling SATA aggressive link power management
- Switching SATA from AHCI to IDE mode
- Disabling VT-d and other virtualization options
If I do a dd test of throughput I can get 400 MiB/sec writes, and under "normal" use the system seems fine -- however, it's not in production yet, so it's not seeing real load. I worry that something is broken and that when it does start doing its job it's going to start freezing (these will become DB servers).
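For reference, the dd throughput check was along these lines (a sketch: the count is shrunk to 64 MiB here so it finishes quickly, the real test used a much larger file, and /tmp/ddtest is just a placeholder path):

```shell
#!/bin/sh
# Sequential-write smoke test with dd. bs is spelled out in bytes (1 MiB)
# so the same command works with both FreeBSD and GNU dd.
dd if=/dev/zero of=/tmp/ddtest bs=1048576 count=64 2>/dev/null
size=$(wc -c < /tmp/ddtest)
echo "wrote ${size} bytes"
rm -f /tmp/ddtest
```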
This may very well be some kind of artifact of bonnie++, but the state it puts the system in (sometimes for HOURS) is very worrying, and I feel like it should not be happening.

Any suggestions/questions/etc. welcome. I've got time to continue testing things easily for now.