Kernel panic on 11.1 with Solarflare SFN5162F

Hey,
I realize this may be a hardware issue but I figured I’d post here in case anyone else has come across the same thing.

I've been testing the SFN5162F in my NAS: a Supermicro X8STE motherboard with a Xeon X5650 CPU and 24GB of ECC RAM, running FreeNAS 11.1 pre/rc.

Via an AFP share of a ZFS dataset, I can copy many GB of data from my Mac to the NAS with no problems. Both systems are using the same model of card.

When I stop the copy and then copy the same data back from the NAS to the Mac, I always get a kernel panic and the system reboots. It happens at a random point, anywhere from 1GB to 16GB into the transfer.

It's always on the NAS -> Mac transfer where I get the kernel panic. So far I have not been able to reproduce it in the other direction.

I tried updating the firmware and boot ROMs to their current versions, and that did not help.
Code:
===
Copyright Solarflare Communications 2006-2015, Level 5 Networks 2002-2005

sfxge0 - MAC: 00-0F-53-08-59-68
    Firmware version:   v6.2.3
    Controller type:    Solarflare SFC9000 family
    Controller version: v3.3.2.1000
    Boot ROM version:   v5.0.0.1002

The Boot ROM firmware is up to date
The controller firmware is up to date

sfxge1 - MAC: 00-0F-53-08-59-69
    Firmware version:   v6.2.3
    Controller type:    Solarflare SFC9000 family
    Controller version: v3.3.2.1000
    Boot ROM version:   v5.0.0.1002

The Boot ROM firmware is up to date
The controller firmware is up to date
===

Here is the panic info log:

====
Dump header from device: /dev/da3p1
  Architecture: amd64
  Architecture Version: 1
  Dump Length: 556032
  Blocksize: 512
  Dumptime: Wed Nov 29 11:27:17 2017
  Hostname: xxxxxxxx.local
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 11.1-STABLE #0 r321665+815c6537f68(freenas/11-stable): Mon Oct 30 22:14:29 UTC 2017
    root@gauntlet:/freenas-11-releng-master/freenas/_BE/objs/freenas-11-releng-master/freenas/_BE/
  Panic String: P2ROUNDUP(addr + 1, etp->et_enp->en_nic_cfg.enc_tx_dma_desc_boundary) >= addr + size
  Dump Parity: 3903987567
  Bounds: 0
  Dump Status: good

===

sfboot output:
===
Solarflare boot configuration utility [v6.2.1]
Copyright Solarflare Communications 2006-2015, Level 5 Networks 2002-2005

sfxge0:
  Boot image                            Option ROM only
    Link speed                          Negotiated automatically
    Link-up delay time                  5 seconds
    Banner delay time                   2 seconds
    Boot skip delay time                5 seconds
    Boot type                           Disabled
  PF MSI-X interrupt limit              32
  SR-IOV                                Disabled
  Virtual Functions on each PF          127
  VF MSI-X interrupt limit              1

sfxge1:
  Boot image                            Option ROM only
    Link speed                          Negotiated automatically
    Link-up delay time                  5 seconds
    Banner delay time                   2 seconds
    Boot skip delay time                5 seconds
    Boot type                           Disabled
  PF MSI-X interrupt limit              32
  SR-IOV                                Disabled
  Virtual Functions on each PF          127
  VF MSI-X interrupt limit              1
====

The panic string / code reference is the same every time.
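
For anyone reading along: as far as I can tell, the panic string is an assertion that a TX DMA fragment covering [addr, addr + size) must not cross a boundary of enc_tx_dma_desc_boundary bytes. Below is a minimal sketch of that condition in plain C. ROUNDUP_POW2 and fragment_ok are hypothetical names of my own, not the driver's, and the sketch assumes the boundary is a power of two.

```c
#include <stdint.h>

/*
 * Round x up to the next multiple of align (align must be a power of two).
 * Hypothetical stand-in for the P2ROUNDUP seen in the panic string.
 */
#define ROUNDUP_POW2(x, align) \
    (((uint64_t)(x) + ((uint64_t)(align) - 1)) & ~((uint64_t)(align) - 1))

/*
 * Same shape as the asserted condition: returns 1 when the fragment
 * [addr, addr + size) ends at or before the next boundary after addr,
 * 0 when it would spill across that boundary.
 */
static int fragment_ok(uint64_t addr, uint64_t size, uint64_t boundary)
{
    return ROUNDUP_POW2(addr + 1, boundary) >= addr + size;
}
```

With an assumed boundary of 4096 (an illustrative value, not read from the driver), fragment_ok(0x1800, 0x800, 4096) passes because the fragment ends exactly on the boundary, while fragment_ok(0x1800, 0x801, 4096) fails because it runs one byte past it. The second case is the kind of fragment that would trip the panic.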

I also opened a support ticket with Solarflare and am waiting to hear back.

I have not tried older versions of FreeNAS just to see if there’s a difference. I have also not tried swapping the card into a different system.

iperf3 runs with no problems on default settings. I will try to push more data over it to see if I can reproduce it that way as well, but so far the AFP copy triggers it every time.

Repeatable only when the "use sendfile" option is enabled in afp.conf.
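
To check whether sendfile really is the trigger, the option can be turned off in netatalk. This is a sketch only, assuming netatalk 3.x, where "use sendfile" is a [Global] option that defaults to yes. On FreeNAS the file is generated by the system, so a hand edit may be overwritten and the setting may need to be entered through the AFP service settings instead.

```
[Global]
; assumption: netatalk 3.x syntax; disables the sendfile() path for file reads
use sendfile = no
```

With this set, NAS -> Mac reads should no longer go through the sendfile() call shown in the backtrace.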

Code:
panic: P2ROUNDUP(addr + 1, etp->et_enp->en_nic_cfg.enc_tx_dma_desc_boundary) >= addr + size
cpuid = 9
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0667e94f20
vpanic() at vpanic+0x186/frame 0xfffffe0667e94fa0
panic() at panic+0x43/frame 0xfffffe0667e95000
siena_tx_qdesc_dma_create() at siena_tx_qdesc_dma_create+0x95/frame 0xfffffe0667e95030
sfxge_tx_qdpl_service() at sfxge_tx_qdpl_service+0x708/frame 0xfffffe0667e95360
sfxge_if_transmit() at sfxge_if_transmit+0x22c/frame 0xfffffe0667e953b0
ether_output() at ether_output+0x6eb/frame 0xfffffe0667e95450
ip_output() at ip_output+0x1308/frame 0xfffffe0667e95580
tcp_output() at tcp_output+0x1a15/frame 0xfffffe0667e95730
tcp_usr_ready() at tcp_usr_ready+0x1e0/frame 0xfffffe0667e95780
sendfile_iodone() at sendfile_iodone+0xe2/frame 0xfffffe0667e957c0
vn_sendfile() at vn_sendfile+0xfdb/frame 0xfffffe0667e95a20
sendfile() at sendfile+0x145/frame 0xfffffe0667e95ac0
amd64_syscall() at amd64_syscall+0xa4a/frame 0xfffffe0667e95bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0667e95bf0
--- syscall (393, FreeBSD ELF64, sys_sendfile), rip = 0x80450807a, rsp = 0x7fffffffe828, rbp = 0x7fffffffe900 ---
KDB: enter: panic
 
I came across a similar panic when transferring a file on FreeBSD 11.1 over NFS. The NIC is a 'Solarflare Communications SFC9020 10G Ethernet Controller'. Here is a screenshot:

panic.jpg


Code:
# lspci -vs 82:00.0
82:00.0 Ethernet controller: Solarflare Communications SFC9020 10G Ethernet Controller
        Subsystem: Solarflare Communications SFN5162F-R7 SFP+ Server Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 64
        I/O ports at f100
        Memory at fa000000 (64-bit, non-prefetchable)
        Memory at fb050000 (64-bit, non-prefetchable)
        Expansion ROM at fb020000 [disabled]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-0f-53-ff-ff-0e-20-50
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)

 