msk0: watchdog timeout

Artefact2

Member


Messages: 23

Hello there,

Sometimes (probably once a week or so), my ethernet device (msk) will stop working and flood the first tty with "msk0: watchdog timeout"...

The only fix is to restart the machine, but that's kind of annoying because I can only access it via SSH... And, obviously, once the network is down, it is too late...

I tried disabling MSI (by putting hw.pci.enable_msix=0 and hw.pci.enable_msi=0 in /boot/loader.conf), but that didn't solve the problem. It happens under moderate/heavy network load.
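
For reference, a minimal sketch of the loader tunables described above as they would appear in /boot/loader.conf. They are read at boot, so a reboot is needed before they take effect. The commented-out last line is an assumption based on the driver-specific tunable documented in msk(4); nobody in this thread reports having tried it.

```sh
# /boot/loader.conf
# Disable MSI and MSI-X for all PCI devices.
hw.pci.enable_msi="0"
hw.pci.enable_msix="0"
# msk-only alternative (assumption, see msk(4)):
# hw.msk.msi_disable="1"
```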

Is anyone here having similar issues? How can I fix them? Thanks.

FYI: my network chip is a Marvell 88E8053, I use FreeBSD 8.0-RELEASE-p1.
 

multibyte

New Member


Messages: 9

If the problem is related to the network cable (as mentioned in the
handbook - network - troubleshooting), you should also check whether
your BIOS runs a "LAN CABLE STATUS" check, if it has such a setting.
If it does, try the "DISABLE" setting.
 

mky

Member

Reaction score: 4
Messages: 34

I had a similar problem, but I solved it by disabling MSI and MSI-X in /boot/loader.conf and disabling TSO (net.inet.tcp.tso is set to 0 in /etc/sysctl.conf). With TSO active, many connections were terminated whenever the network interface was under some load (~2 Mbit/s). I had the same issues on FreeBSD/i386 7.2-RELEASE (+ patches up to -p4) and currently on FreeBSD/amd64 8.0-RELEASE (+ patches up to the latest -p2).

I'm using an ASUS P5GDC PRO motherboard with an onboard Marvell Yukon 88E8053 Gigabit Ethernet controller. I still get an msk error in dmesg:

Code:
mskc0: Uncorrectable PCI Express error
but my network interface is very stable and works without errors.
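
A sketch of making the TSO setting persistent, assuming the interface is msk0 as elsewhere in this thread. The sysctl turns TSO off for the whole TCP stack; the commented ifconfig line is a per-interface alternative that was not tested in this thread.

```sh
# /etc/sysctl.conf
# Disable TCP segmentation offload globally.
net.inet.tcp.tso=0

# Per-interface alternative, run as root (not persistent across reboots):
#   ifconfig msk0 -tso
```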
 
Artefact2

Member


Messages: 23

I will try that. Thanks.
 

epopen

Active Member

Reaction score: 8
Messages: 130

Hi all.

I am seeing the same issue (msk0: watchdog timeout, and the NIC ends up in a corrupt state) on FreeBSD 10.0-RELEASE #0 amd64 with a Marvell Yukon 88E8040T PCI-E adapter. The issue first appeared in 8-RELEASE, where it was caused by a PCIB ACPI issue and patching the kernel source fixed it. It disappeared in 9-RELEASE, but it is back in 10.0-RELEASE! Disabling MSI and disabling TSO has no effect, and unplugging and replugging the cable makes no difference. Still searching for a solution...

Thanks a lot.
 

naegelejd

New Member


Messages: 1

I have the same issue with my Marvell Yukon 88E8057. I tried 9.2, 10.0-RELEASE, 10.0-STABLE, and 11.0-CURRENT. My NIC only works on 9.2. What is really confusing is that the few lines of code in /usr/src/sys/dev/msk/if_msk.c that changed between 9.2 and 10.0 are reverted in 11.0, but the problem still exists.
 

epopen

Active Member

Reaction score: 8
Messages: 130

Hi everyone. I have been testing the new 10.1-RELEASE. The problem is still there. Thanks a lot.
 

waywardnl

Member


Messages: 28

On FreeBSD 9.3 I did this:
Code:
root@BSD05:/home/roland # sysctl net.inet.tcp.tso=0
net.inet.tcp.tso: 1 -> 0
I still get the same error, but my RDP session to Windows 7 stays alive.

It happens when I copy files with cp over the network while also using RDP in VirtualBox.
 

epopen

Active Member

Reaction score: 8
Messages: 130

Hi everyone.
Bad news.
I have been testing the new 10.2-RELEASE AMD64. The problem is still there.
Thanks a lot.
 

jkhilmer

New Member

Reaction score: 1
Messages: 6

The patch by Yonghyeon Pyun worked for me with 10.2 AMD64 (https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082245.html)

Code:
Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h   (revision 281587)
+++ sys/dev/msk/if_mskreg.h   (working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)   ((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)   ((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN   4096
-#define   MSK_STAT_ALIGN   4096
+#define   MSK_RING_ALIGN   32768
+#define   MSK_STAT_ALIGN   32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
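
Since the change comes down to two constants, it can also be made without the diff. The following is a minimal sketch, not part of the original patch: bump_msk_align is a hypothetical helper name, and it assumes the header still defines both constants as 4096.

```shell
#!/bin/sh
# Hypothetical helper: raise MSK_RING_ALIGN and MSK_STAT_ALIGN from
# 4096 to 32768 in the header file given as $1. Other lines, including
# other defines that happen to be 4096, are left untouched.
bump_msk_align() {
    hdr=$1
    tmp="$hdr.tmp.$$"
    # \1 re-emits the "#define NAME<whitespace>" prefix; only the value changes.
    sed -E 's/^(#define[[:space:]]+MSK_(RING|STAT)_ALIGN[[:space:]]+)4096[[:space:]]*$/\132768/' \
        "$hdr" > "$tmp" && mv "$tmp" "$hdr"
}
```

Assuming the source tree lives in /usr/src, this would be called as bump_msk_align /usr/src/sys/dev/msk/if_mskreg.h, followed by the usual make buildkernel and make installkernel in /usr/src and a reboot.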
 

Jon

New Member


Messages: 1

I just wanted to add a 'worked for me too' for the above-mentioned patch, also on 10.2 AMD64.

Without patch, the NIC would stop working after some time, and ttyv0 would be 'spammed' with "msk0: watchdog timeout" messages.

With the patch, the NIC has been working fine for a couple of days, with pretty heavy traffic too.
 

herrbischoff

Active Member

Reaction score: 73
Messages: 177

Unfortunately I still have the same issue with a Marvell 88E8053 in an older Mac mini on 10.3-RELEASE, which includes the patch. Running Debian on this machine never produced a similar problem. I would very much like to use FreeBSD though.

I also tried creating a new driver via ndisgen from the Windows XP driver, but it does not seem to work. The kernel module gets built and I can load it, but the interface is still not recognized. Now I'm checking whether reverting to 9.3-RELEASE gets it to work.

Any further ideas or thoughts on this?
 

DonColeman

New Member


Messages: 1

Same problem here. I'm also running an older Mac mini with 10.3-RELEASE:

FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 09:52:35 UTC 2016
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386
CPU: Genuine Intel(R) CPU 1300 @ 1.66GHz (1666.67-MHz 686-class CPU)
msk0: <Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0

Depending on network use, I get the "watchdog" message about once a day.

The work-around I use is to "bounce" into suspend/resume -- I have some automated scripts to deal with this, but basically you just:

sysctl debug.acpi.suspend_bounce=1
/usr/sbin/acpiconf -s 3
sysctl debug.acpi.suspend_bounce=0

Make sure the sound system is not open when doing this.
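
The automated scripts mentioned above are not shown, so the following is only a sketch of how such automation might look. All function names and the state-file argument are hypothetical; the three commands in bounce_nic are taken unchanged from the post.

```shell
#!/bin/sh
# Sketch of an automated suspend/resume bounce. Function names and the
# state-file path are hypothetical, not from the original scripts.

# Count "msk0: watchdog timeout" lines in the text read from stdin.
# grep -c exits non-zero when it prints 0, hence the "|| true".
count_watchdog() {
    grep -c 'msk0: watchdog timeout' || true
}

# Succeed when the current count ($2) is larger than the previous one ($1).
needs_bounce() {
    [ "$2" -gt "$1" ]
}

# The workaround itself: suspend to S3 and resume immediately.
bounce_nic() {
    sysctl debug.acpi.suspend_bounce=1
    /usr/sbin/acpiconf -s 3
    sysctl debug.acpi.suspend_bounce=0
}

# One check cycle: compare the count in dmesg with the count saved in
# the state file ($1), save the new count, and bounce if it grew.
check_once() {
    state=$1
    prev=$(cat "$state" 2>/dev/null || echo 0)
    prev=${prev:-0}
    cur=$(dmesg | count_watchdog)
    echo "$cur" > "$state"
    if needs_bounce "$prev" "$cur"; then
        bounce_nic
    fi
}
```

Run as root from cron, for example by calling check_once with a state file under /var/run once a minute. Two limitations of this sketch: the kernel message buffer wraps, so the count can shrink and a timeout may be missed, and the first run after the state file is created bounces once if timeouts are already logged.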
 

darmokandjalad

New Member


Messages: 1

For the record, also seeing this problem on a newly re-purposed MacMini2,1 running 11.2-RELEASE-p4 (i386). I'd been running it on 11.2-RELEASE for a week, and only noticed the problem after applying the -p4 update, while I was running portsnap extract.

Thanks, DonColeman; the work-around solved the problem.
 