Random system crash with USB Audio

Hi everyone. I have a FiiO Andes USB DAC that I use on my workstation, connected to a USB hub. The DAC is successfully recognized by the system, and I can play and listen to sound just fine.

Code:
[kjohnson@kyle_workstation01 ~]$ dmesg|tail -n20
uhub_attach: HUB at depth 6, exceeds maximum. HUB ignored
device_attach: uhub13 attach returned 6
ugen2.9: <vendor 0x0451 product 0x8440> at usbus2
uhub13 on uhub12
uhub13: <vendor 0x0451 product 0x8440, class 9/0, rev 3.10/1.00, addr 8> on usbus2
uhub_attach: HUB at depth 6, exceeds maximum. HUB ignored
device_attach: uhub13 attach returned 6
ugen2.16: <FiiO FiiO USB DAC-E07K> at usbus2
uhid0 on uhub6
uhid0: <FiiO FiiO USB DAC-E07K, class 0/0, rev 1.10/0.01, addr 15> on usbus2
uaudio0 on uhub6
uaudio0: <FiiO USB DAC-E07K> on usbus2
uaudio0: Play: 96000 Hz, 2 ch, 24-bit S-LE PCM format, 2x8ms buffer.
uaudio0: Play: 48000 Hz, 2 ch, 24-bit S-LE PCM format, 2x8ms buffer.
uaudio0: Play: 44100 Hz, 2 ch, 24-bit S-LE PCM format, 2x8ms buffer.
uaudio0: Play: 32000 Hz, 2 ch, 24-bit S-LE PCM format, 2x8ms buffer.
uaudio0: No recording.
uaudio0: No MIDI sequencer.
pcm3: <USB audio> on uaudio0
uaudio0: No HID volume keys found.

[kjohnson@kyle_workstation01 ~]$ sudo sysctl hw.snd.default_unit=3
hw.snd.default_unit: 0 -> 3

[kjohnson@kyle_workstation01 ~]$ cat /dev/sndstat
Installed devices:
pcm0: <NVIDIA (0x0080) (HDMI/DP 8ch)> (play)
pcm1: <NVIDIA (0x0080) (HDMI/DP 8ch)> (play)
pcm2: <NVIDIA (0x0080) (HDMI/DP 8ch)> (play)
pcm3: <USB audio> (play) default
No devices installed from userspace.

[kjohnson@kyle_workstation01 ~]$ kldstat 
Id Refs Address            Size     Name
 1   51 0xffffffff80200000 1f5c1e8  kernel
 2    1 0xffffffff8215e000 371e50   zfs.ko
 3    2 0xffffffff824d0000 c758     opensolaris.ko
 4    1 0xffffffff824dd000 1e078    snd_uaudio.ko
 5    1 0xffffffff824fc000 3ed0     amdtemp.ko
 6    1 0xffffffff82500000 16d5b8   nvidia-modeset.ko
 7    3 0xffffffff8266e000 b17f0    linux.ko
 8    3 0xffffffff82720000 e1f8     linux_common.ko
 9    2 0xffffffff8272f000 13413c8  nvidia.ko
10    1 0xffffffff83c21000 35ec     ums.ko
11    1 0xffffffff83c25000 2984     uhid.ko
12    1 0xffffffff83c28000 3e587    linux64.ko
13    1 0xffffffff83c67000 ee7f     iscsi.ko
14    1 0xffffffff83c76000 31fe     cpuctl.ko
15    1 0xffffffff83c7a000 bb73     tmpfs.ko


[kjohnson@kyle_workstation01 ~]$ pianobar
Welcome to pianobar (2016.06.02)! Press ? for a list of commands.
(i) Login... Ok.
(i) Get stations... Ok.
|>  Station "The Four Horsemen Radio" (186216222875251678)
(i) Receiving new playlist... Ok.
|>  "Sweating Bullets" by "Megadeth" on "Countdown To Extinction (Deluxe Edition)" <3
#   -04:55/05:02
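For anyone following along, the unit number does not have to be hard-coded. A minimal sketch, assuming the device keeps the `<USB audio>` description shown in the /dev/sndstat output above; the function name `usb_audio_unit` is made up for illustration:

```shell
# Print the pcm unit number of the first device described as "<USB audio>".
# Expects sndstat-formatted text on stdin; prints nothing if no match is found.
usb_audio_unit() {
    sed -n 's/^pcm\([0-9][0-9]*\): <USB audio>.*/\1/p' | head -n 1
}

# Hypothetical usage on the FreeBSD host (needs root for sysctl):
#   unit=$(usb_audio_unit < /dev/sndstat)
#   [ -n "$unit" ] && sysctl hw.snd.default_unit="$unit"
```

To keep the setting across reboots, the same `hw.snd.default_unit=3` line can also go in /etc/sysctl.conf.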

After a seemingly random amount of time, the system completely locks up - the screen goes blank, the network becomes unresponsive, USB is dead - nothing works. This has happened as soon as a few minutes after a reboot and as late as 7 days after one.

Reconnecting the USB keyboard and HDMI monitor doesn't help. The only recourse is to power-cycle the system.

Upon boot, there are absolutely no logs indicating pending or present issues - no kernel crash dumps, and nothing in syslog of interest (I also ship syslog to a remote machine and double-check there). This happens under minimal system load.

This issue was happening on 11.1-RELEASE, and is still happening on 11.1-STABLE.
System specs:
  • MSI X370-GAMING-PRO-CARBON (7A32v1C / 2018-01-29 BIOS)
  • AMD Ryzen 7 1700X
  • 2x Samsung 960 EVO 250GB V-NAND M.2 2280 PCIe Gen 3 x4 NVMe SSD in RAID-1 (zroot)
There are 3 USB chipsets on this motherboard - ASMedia ASM2142, AMD X370 and AMD CPU. The USB hub is currently connected to one of the AMD CPU USB 3.1 Gen1 ports, but I was also having this issue when it was connected to one of the ASM2142 USB 3.1 Gen2 ports. I haven't tried the X370 ports.

Any help is appreciated - at this point I've disconnected the DAC, as the fear of another system crash has spooked me.
 
Nothing is actually crashing; a crash typically results in a panic(9) and/or a dumped core(5). This appears to be a deadlock or a livelock.

When the system locks up, are you still able to access the machine over the network? And does it respond to the power button with a graceful shutdown, or do you need to force it off?

https://en.wikipedia.org/wiki/Deadlock
 
gnulnx, forgive me for momentarily putting aside the correlation between your DAC and the freezes (since you're a FreeBSD user, I trust that you have substantial reason to have noted this correlation), yet have you already run your computer through at least twenty-four hours of MemTest86+?

The Ryzen-ness of your computer makes me think it's somewhat new, which makes me wonder whether it has already been checked for defective RAM. I've also noticed MemTest86+ identifying busted motherboards and other components (the manifestations of which were RAM reliability problems, spontaneous resets or freezes). I think it's a pretty nice, basic computer stability measurement tool.

Days of Joy to you gnulnx!
 
Thank you for the clarification - I didn't know what to call it, but as there were no dumped cores, I knew it wasn't a normal 'crash'. (dead|live)lock fits much better.

Negative, the system is completely unreachable over the network, and it does not shut down gracefully - I need to force it off.
 
Embarrassingly, no, I haven't run MemTest86+. It has been 13 years since I was working in a brick-and-mortar computer shop doing memtests, and running one now didn't even cross my mind. Thinking about this issue, I would be surprised if faulty memory were not at play.

I'll run through a shorter one overnight, and then a longer one when I plan on being away from the computer for a while.
 
The CPU and the boards for it are still fairly new; have you checked for any BIOS/UEFI updates? You're connecting the DAC via a USB hub; does it also lock up if you connect the DAC directly (without the USB hub)?
 
USB 3.0 brings many different compatibility settings in the BIOS/UEFI, and it helps to play with those to get things working. Those ASMedia USB 3.0 chips are worth nothing in my experience; stick with the native USB 3.0 ports.

But the easiest way to go is to connect your DAC to a USB 2.0 port.
You don't need the USB 3.0 bandwidth.
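A rough back-of-the-envelope check of that, using the highest playback mode from the dmesg output at the top of the thread (96000 Hz, 2 ch, 24-bit). This counts raw PCM payload only and ignores USB protocol overhead; the function name `pcm_bitrate` is just for illustration:

```shell
# Raw PCM payload rate in bit/s: sample rate * channels * bits per sample.
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

pcm_bitrate 96000 2 24   # prints 4608000
```

That is about 4.6 Mbit/s, well under the 480 Mbit/s signalling rate of USB 2.0 high speed.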
 
There was a pretty long discussion regarding Ryzen issues on the freebsd-stable mailing list recently (perhaps like a month ago). I didn't follow too closely, as I don't have one myself, but it might be worth a look. I think it was, at least in part, about random lockups.
 
  1. Just got through a 44 hour MemTest86+ run with no errors.
  2. I've mostly been keeping the BIOS up to date, but haven't updated since January. Release notes never mention anything relevant FWIW.
  3. Just connected the DAC to another USB 3 port on the same chipset, without the hub. I will try to produce another crash to see whether the hub was contributing to the crash.
    1. I'm using a hub with USB 3 as my computer is rack mounted in a closet 25ft away. The DAC itself sure doesn't need the bandwidth, but I have other things on the hub which do.
  4. If 3 still crashes, then I'll move the DAC over to a USB 2.0 port.
Thanks for the help everyone - I'll keep this updated with my findings.
 
I just read through the freebsd-stable thread on Ryzen issues mentioned above, and it sounds very similar to the issues that I am having. I am going to explore a few of their workarounds as well (specifically https://reviews.freebsd.org/D14347).
 