I worked intensively with LSI SAS HBAs about 10 years ago. They are wonderful, but also very annoying. They can have both bandwidth and IOps limitations. The bandwidth ones are pretty obvious (the cards are usually capable of maxing out both the SAS and PCI buses when using large IOs and deep queues). We were using them to connect to several hundred disks (two servers with a few HBAs each), and we were able to max out the servers at about 10-12 GByte/s (including running RAID and checksum codes). That was on Linux, on both Intel and PowerPC platforms.
But: where the cards can easily fall apart is IOps, which is much harder to understand, tune, and debug. There is an enormous amount of driver work that needs to be done to get great performance out, and when I say "driver", I mean not just the OS driver for the card itself, but the block subsystem above it (which can hold long queues of IOs), memory management (after all, every pending IO in the queue has a memory buffer that is pinned for the duration, and latencies can get high when you have deep queues), and the firmware in the card. And without deep queueing (I used to aim for 5-10 IOs pending on each drive at all times; 20 or 50 is better), you won't get good random-IO performance.

One thing we discovered the hard way: the firmware in the HBA and the OS driver stack have a lot of error handling and recovery built in. If there are incompatibilities between disk and HBA, those can show up as low-level IO errors that occur very frequently. Every time an error happens, the HBA wipes its queue and aborts (or fails) many other IOs, and then some layer far above automatically retries them. Net result: no errors actually make it up to the application layer (because retries cure all ills), but IOps throughput is really bad, because in effect IO is being single-tracked.
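A cheap way to sanity-check the queueing part (are the drives actually being kept busy?) is to watch per-drive in-flight IO counts while the workload runs. Here is a minimal sketch for the kind of Linux setup described above, reading the in-flight field from /proc/diskstats; the "sd" device-name prefix is an assumption you would adjust for your hardware:

    #!/usr/bin/env python3
    # Minimal sketch (Linux): sample /proc/diskstats once a second and print the
    # number of IOs currently in flight on each whole disk, to check whether the
    # queues actually stay in the 5-10+ range mentioned above.
    # The "sd" name prefix is an assumption; adjust it for your device naming.
    import time

    def in_flight(prefix="sd"):
        counts = {}
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                name = fields[2]
                # skip partitions (names ending in a digit), keep whole disks
                if name.startswith(prefix) and not name[-1].isdigit():
                    # 9th per-device stat field = IOs currently in progress
                    counts[name] = int(fields[11])
        return counts

    if __name__ == "__main__":
        while True:
            snap = in_flight()
            print("  ".join(f"{dev}:{n}" for dev, n in sorted(snap.items())))
            time.sleep(1)

If those counts sit at 0 or 1 while the error counters climb, you are probably seeing the single-tracking effect described above.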
Our fix was to work closely with engineers from LSI, the disk manufacturers, and the SAS expander vendors. That included running special diagnostic firmware versions, collecting undocumented statistics, and occasionally having SAS analyzers on the bus. Took months.
Why am I telling this story? I would not be surprised if the IO pattern generated by ZFS resilvering can, in some cases, push the HBAs into this kind of performance-killing behavior by making queueing work less well. I wonder whether it would be possible to instrument ZFS on FreeBSD with per-IO performance metrics (traces, or averages) and see how many IOs are queued on each drive and what the IO latency is (as a function of IO size, distance from the previously completed IO, and queue depth). That would be days or weeks of work.
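Short of instrumenting ZFS itself, one could at least watch queue depth and average latency per drive from the outside while a resilver runs. A rough sketch, assuming FreeBSD's gstat(8) batch mode; the exact output layout varies between versions, so the column positions are taken from gstat's own header, and this is a starting point rather than the in-ZFS instrumentation described above:

    #!/usr/bin/env python3
    # Rough sketch (FreeBSD): run gstat in batch mode and log per-provider queue
    # length and average read/write latency, e.g. while a resilver is running.
    # Column positions are looked up from gstat's header line rather than
    # hard-coded, since the layout differs between FreeBSD versions.
    import subprocess, time

    WANT = ("L(q)", "ms/r", "ms/w", "%busy")

    def sample():
        out = subprocess.run(["gstat", "-b"], capture_output=True, text=True).stdout
        rows, cols = {}, None
        for line in out.splitlines():
            fields = line.split()
            if "L(q)" in fields:                      # header line
                cols = {name: i for i, name in enumerate(fields)}
            elif cols and len(fields) >= len(cols):
                dev = fields[-1].lstrip("|")          # provider name, last column
                rows[dev] = {k: fields[cols[k]].rstrip("|") for k in WANT if k in cols}
        return rows

    if __name__ == "__main__":
        while True:
            stamp = time.strftime("%H:%M:%S")
            for dev, stats in sorted(sample().items()):
                print(stamp, dev, " ".join(f"{k}={v}" for k, v in stats.items()))
            time.sleep(5)

Depending on the ZFS version, zpool iostat's queue and latency options may already expose some of this per vdev, which would be an even cheaper first look.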
There used to be a guy named Terry Kennedy who did a lot of large IO work on FreeBSD (disk and tape); I know he had a long history of doing the same thing on VAXes under VMS earlier. He might be able to shine some light on this, but I haven't heard from him in years.