[Solved] Yet another Atheros upgrade horror story

rambetter

Member

Reaction score: 8
Messages: 88

I'm posting this here because after hours of research online I have not found a solution to my problem.

I was running FreeBSD 7.1, but since it is now EOL I upgraded to 7.3 and then to 7.4 (the second upgrade was an attempt to solve the problem below).

From dmesg:

Code:
ath0: <Atheros 5212> mem 0xff8f0000-0xff8fffff irq 21 at device 0.0 on pci1
ath0: [ITHREAD]
ath0: WARNING: using obsoleted if_watchdog interface
ath0: Ethernet address: 00:02:6f:61:e6:7d
ath0: mac 7.9 phy 4.5 radio 5.6

Running GENERIC kernel in 7.4 i386.

Getting these every few milliseconds:

Code:
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)

Here is my /etc/rc.conf:

Code:
gateway_enable="YES"
hostname="speedy.i"
ifconfig_fxp2="DHCP"
cloned_interfaces="bridge0"
ifconfig_bridge0="addm re0 addm ath0 addm fxp0 addm fxp1 up"
ifconfig_re0="up"
ifconfig_ath0="ssid speedy.i mode 11g mediaopt hostap channel 11 up"
ifconfig_fxp0="up"
ifconfig_fxp1="up"
ipv4_addrs_bridge0="192.168.0.254/24"
ipnat_enable="YES"
hostapd_enable="YES"
sshd_enable="YES"
named_enable="YES"
ntpdate_enable="YES"
ntpd_enable="YES"
dhcpd_enable="YES"
dhcpd_ifaces="bridge0"

My /boot/loader.conf:

Code:
autoboot_delay="6"
loader_logo="beastie"
password="xxxxx"
snd_ich_load="YES"
wlan_xauth_load="YES"

My wireless NIC is acting as an access point, and it has been behaving horribly since the upgrade to 7.3 (and to 7.4).

I read this in /usr/src/UPDATING:

Code:
20090312:
        The open-source Atheros HAL has been merged from HEAD
        to STABLE.
        The kernel compile-time option AH_SUPPORT_AR5416 has been
	added to support certain newer Atheros parts, particularly
        PCI-Express chipsets.
        The following modules are no longer available, and should be
        removed from MODULES_OVERRIDE and/or loader.conf:-
         ath_hal ath_rate_amrr ath_rate_onoe ath_rate_sample

I'm now fiddling with the kernel config file, trying to disable AH_SUPPORT_AR5416, ath_hal, and ath_rate_sample (leaving ath itself). I wonder why the UPDATING entry above says "should be removed" when I still see the devices ath_hal and ath_rate_sample in the GENERIC kernel config.

Am I barking up the right tree? Is there any simple way to get the ath driver from 7.1 back? Any other solutions to this problem?

I'm a bit surprised that _so_many_people_ have had this problem and there is no definite solution, even after 1+ years.

EDIT: Here is one more file on my computer, /etc/hostapd.conf:

Code:
/etc/hostapd.conf
speedy# less /etc/hostapd.conf 
interface=ath0
debug=1
ctrl_interface=/var/run/hostapd
ctrl_interface_group=wheel
ssid=speedy.i
hw_mode=g
wpa=3
wpa_passphrase=yyyyyyyy
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP CCMP

I'll be very happy to post any other info that is needed. I'm willing to try almost anything since I've already spent many hours trying to figure out this problem.
 
OP
rambetter

Member

Reaction score: 8
Messages: 88

I want to thank whoever formatted my post.
I actually have the formatting instructions bookmarked but I was in a hurry to make this post. I'll start using the formatting in my next post probably.

Also, I'll post the solution once I find it. I'm now trying various things.
 
OP
rambetter

Member

Reaction score: 8
Messages: 88

Oh wow, I fixed this problem. But there is still a bug in the underlying driver in my opinion.

My fix was to change a single line in /etc/rc.conf. The original line:

Code:
ifconfig_ath0="ssid speedy.i mode 11g mediaopt hostap channel 11 up"

The new line:

Code:
ifconfig_ath0="ssid speedy.i mode 11g mediaopt hostap channel 2 -bgscan up"

Actually, the "-bgscan" has nothing to do with the fix. The fix is simply changing the channel from 11 to 2 [in my case].

Let me give you some details. Now I have come to some conclusions and they may be totally wrong. I formulated some theories to try to explain my problems with the ath driver.

1. I live in an affluent neighborhood, surrounded by houses in the front, back, and sides. There are many houses in close proximity and they probably all have wireless access points.

2. When I tested my changes, I turned hostapd off. I tested my access point completely unsecured, in order to rule out any extra things that may be interfering with my tests.

3. I tried using a chanlist of "1-5,7-11" instead of using a hardcoded channel such as 2. However, when using the chanlist the channel is chosen in some manner, and half the time when I reboot my router the channel that is chosen will cause the "stuck beacon" problem mentioned above.

4. I feel very dirty inside that I had to choose a channel such as 2. Would it not be best for the driver to choose a channel based on which channel is the most "free"?
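
As a rough illustration of the "most free channel" idea in point 4, here is a minimal Python sketch. It is not what the ath driver does. The scan list is made up, the function names are hypothetical, and the overlap weight is only a simple approximation of how adjacent 2.4 GHz channels interfere with each other.

```python
from collections import Counter

# Hypothetical survey results as (ssid, channel) pairs. The values are
# invented for illustration and do not come from this thread.
NEIGHBOURS = [
    ("net-a", 11), ("net-b", 11), ("net-c", 6),
    ("net-d", 11), ("net-e", 1), ("net-f", 6),
]

def overlap_weight(a, b):
    """Crude interference weight between 2.4 GHz channels a and b.

    Assumption: channels fewer than 5 apart overlap, and closer
    channels overlap more. Real interference is more complicated.
    """
    return max(0, 5 - abs(a - b))

def least_busy_channel(neighbours, candidates=range(1, 12)):
    """Return the candidate channel with the lowest summed overlap weight."""
    load = Counter(channel for _, channel in neighbours)

    def score(candidate):
        return sum(count * overlap_weight(candidate, channel)
                   for channel, count in load.items())

    # min() keeps the first candidate on ties, so lower channels win ties.
    return min(candidates, key=score)

print(least_busy_channel(NEIGHBOURS))
```

The result could then be hardcoded in the `channel` argument of the `ifconfig_ath0` line. Whether a less crowded channel avoids the "stuck beacon" messages on a given card is something only testing can show.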


I see 2 bugs here in the ath driver:

A. If the channel has a lot of noise from neighbors, the "stuck beacon" kernel logs will start to flood the system. These slow down the entire system and cause all sorts of network problems to happen, not only in the wireless interfaces.

B. Seems that auto selecting an appropriate channel (and maybe even changing the channel during runtime) can be improved. I don't really know the 802 protocols, and I definitely don't know kernel driver programming. So I'm just blowing smoke out of my ass.
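
To put a number on the log flooding described in A, a small Python sketch like the following could count the "stuck beacon" lines per interface in saved dmesg output. The function name is hypothetical, and the sample mixes lines quoted earlier in this thread with one made-up `fxp0` line that is there only to show that unrelated messages are ignored.

```python
import re

# Matches the kernel message quoted in this thread, for any athN interface.
STUCK = re.compile(r"^(ath\d+): stuck beacon; resetting \(bmiss count (\d+)\)$")

def count_stuck_beacons(lines):
    """Return {interface: number of 'stuck beacon' messages}."""
    counts = {}
    for line in lines:
        match = STUCK.match(line.strip())
        if match:
            iface = match.group(1)
            counts[iface] = counts.get(iface, 0) + 1
    return counts

sample = """ath0: mac 7.9 phy 4.5 radio 5.6
ath0: stuck beacon; resetting (bmiss count 4)
ath0: stuck beacon; resetting (bmiss count 4)
fxp0: link state changed to UP
ath0: stuck beacon; resetting (bmiss count 4)""".splitlines()

print(count_stuck_beacons(sample))
```

Running it against the log before and after a channel change gives a simple way to compare channels without watching the console.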

In any case I'm appalled that such a buggy driver, one used by so many wireless NICs, has made it into a FreeBSD release. This is FreeBSD 7.4 I'm talking about, but I'm sure other versions are affected. This used to not be a problem in 7.1, like I mentioned earlier.
 
OP
rambetter

Member

Reaction score: 8
Messages: 88

I'd like to get to the bottom of this issue w/o changing the channel I'm using every time this problem occurs. I tried looking at sysctl variables for ath.

Code:
sysctl -a | grep 'hw\.ath'

Code:
hw.ath.txbuf: 200
hw.ath.rxbuf: 40
hw.ath.regdomain: 0
hw.ath.countrycode: 0
hw.ath.xchanmode: 1
hw.ath.outdoor: 1
hw.ath.calibrate: 30
hw.ath.hal.swba_backoff: 0
hw.ath.hal.sw_brt: 10
hw.ath.hal.dma_brt: 2

I cannot find any information on the internet regarding what these do, short of reading the driver source code. Do you think there's a chance that I can tweak these settings to avoid the "stuck beacon" problem on a busy channel?
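
If these settings are changed experimentally, it helps to record exactly what changed between runs. The Python sketch below parses saved `sysctl` output and reports the differences. The function names are hypothetical, and the changed value in `AFTER` is invented purely to show the output format. It is not a recommended setting, and nothing here says which of these variables are writable or what they do.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines that are not in 'name: value' form
            values[name.strip()] = value.strip()
    return values

def changed(before, after):
    """Return {name: (old, new)} for every variable whose value differs."""
    names = sorted(set(before) | set(after))
    return {n: (before.get(n), after.get(n))
            for n in names if before.get(n) != after.get(n)}

BEFORE = """hw.ath.txbuf: 200
hw.ath.rxbuf: 40
hw.ath.hal.swba_backoff: 0"""

# Hypothetical edit for illustration only, not a suggested value.
AFTER = BEFORE.replace("swba_backoff: 0", "swba_backoff: 1")

print(changed(parse_sysctl(BEFORE), parse_sysctl(AFTER)))
```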

It seems that the real solution here is to get my hands dirty in the driver source code. Unfortunately I'm already helping in too many other open source projects and cannot start another one.
 

wblock@

Beastie Himself
Developer

Reaction score: 3,645
Messages: 13,850

Finding the problem is half the battle. Submit a PR so it's documented and others can work on it.
 

bartgrefte

Member


Messages: 20

Got that error as well with my Atheros card in combination with pfSense 2 beta (early Feb. snapshot).
I don't know which specific setting I changed (I changed a lot), but I'm no longer getting the error, and my wifi card (D-Link DWA-552) actually worked in access point mode, without N features that is.

Was planning on trying a newer version of the ath(4) driver, but I haven't been able to compile it.
See http://forums.freebsd.org/showthread.php?t=215

On the pfSense forum there is another user with this problem: http://forum.pfsense.org/index.php/topic,32041.0.html
Topic is still active.
 
OP
rambetter

Member

Reaction score: 8
Messages: 88

I had a look at those threads. You say you don't know what you changed that fixed your problem. I'll bet it's the channel you're using. Do you live on a farm away from civilization or in the middle of a crowded street? Would be nice to know if you can confirm that your problem depends on the channel selected (assuming you live on the busy street).
 

bartgrefte

Member


Messages: 20

Hmm, okay.
Well, have a look at this:

Think this will tell you enough ;)

I live in the middle of a residential district. In this area: Google Maps. Not sure if that qualifies as crowded, but all channels are busy...

Plus I can't seem to get hostap working anymore:
Code:
Mar 1 15:06:35 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: deassociated 
Mar 1 15:06:35 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: deauthenticated due to local deauth request 
Mar 1 15:06:32 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: associated 
Mar 1 15:06:32 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: deassociated 
Mar 1 15:06:32 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: deauthenticated due to local deauth request 
Mar 1 15:06:31 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter 
Mar 1 15:06:31 hostapd: ath0_wlan0: STA 00:27:10:ce:03:6c IEEE 802.11: associated

Think the upgrade to RC1 broke it. Clean install did not help.
 
OP
rambetter

Member

Reaction score: 8
Messages: 88

What kind of structure is required to block waves at these frequencies? Like a fine cage made of metal? I would be interested to see how your hostap behaves with and without interference from other wireless devices.

BTW I'm not a developer at FreeBSD. So I really don't know what's going on. I am just speculating.
 

bartgrefte

Member


Messages: 20

Don't know:\
That might be hard to test, don't know a place where I can test without interference. My hostap is not in there btw, the Senao 3220 I'm still using atm is (Raven's Lair). With that one I can barely make 250-300kBps, thanks to the interference.

I can have a look tomorrow to see how it displays in inSSIDer.

edit: Here are two screenshots made on my laptop which has an Intel 6200AGN wificard.
One is with the standard "hidden behind the computercase"(as I like to call them)-antennas connected to the D-Link DWA-552, the other with this one from Ebay.



Note: Looks less crowded, but it's not ;). The wifi NIC in my laptop does not pick up that much unlike the (in this country illegal) NIC I used to make the screenshot few posts earlier. Plus I held that one against the window.
Note2: The screen got flooded with the "stuck beacon"-errors again....
 

gnoma

Active Member

Reaction score: 17
Messages: 182

Hello,

I got this issue too on my FreeBSD 9.0 and 9.1:
Code:
ath0: stuck beacon; resetting (bmiss count 4)

I wanted to ask if anybody knows if this is fixed in FreeBSD 10. If it is, this is a good reason to update :)

Thank you.
 

ondra_knezour

Aspiring Daemon

Reaction score: 198
Messages: 764

From the Atheros wiki page, "fixed issues" section:
"Frames shouldn't be flushed from the TX queue on an interface reset (eg stuck beacon)!"
So an upgrade may help?

Also if you face this problem quite often, you can boot from the live CD, connect and try to send/receive some data.
 

gnoma

Active Member

Reaction score: 17
Messages: 182

Hello,

Thank you very much, now the driver works great:
Code:
ath0@pci0:3:0:0:	class=0x028000 card=0x30a4168c chip=0x002e168c rev=0x01 hdr=0x00
    vendor     = 'Atheros Communications Inc.'
    device     = 'AR9287 Wireless Network Adapter (PCI-Express)'
    class      = network

Configuration:
Code:
wlan0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 64:70:02:6e:fc:9d
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: IEEE 802.11 Wireless Ethernet autoselect mode 11ng <hostap>
	status: running
	ssid UnixFBSD channel 6 (2437 MHz 11g ht/40+) bssid 64:70:02:6e:fc:9d
	regdomain 32924 country CN indoor ecm authmode WPA2/802.11i
	privacy MIXED deftxkey 2 AES-CCM 2:128-bit txpower 20 scanvalid 60
	protmode CTS ampdulimit 64k ampdudensity 8 shortgi wme burst
	dtimperiod 1 -dfs

And the results:
Code:
bash-4.2$ scp user@192.168.1.1:/usr/home/user/random.file ./
Password for gnoma@sentinel.mynet.com:
random.file                                   100% 1024MB   10.9MB/s   01:44

I would say that the driver is fixed and now working great!
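
For reference, the size and elapsed time in the scp line can be turned into a whole-transfer average with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes scp's "MB" means 2^20 bytes. The average it computes comes out somewhat lower than the 10.9MB/s figure scp printed, and the sketch does not try to explain that difference.

```python
def average_rate(size_mb, elapsed):
    """Return (seconds, MB/s, Mbit/s) for a transfer.

    size_mb : transfer size in MB (assumed to be 2**20 bytes each)
    elapsed : elapsed time as 'MM:SS', as shown in the scp progress line
    """
    minutes, seconds = (int(part) for part in elapsed.split(":"))
    total_seconds = minutes * 60 + seconds
    mb_per_s = size_mb / total_seconds
    mbit_per_s = mb_per_s * 1024 * 1024 * 8 / 1_000_000
    return total_seconds, mb_per_s, mbit_per_s

secs, mb_s, mbit_s = average_rate(1024, "01:44")
print(f"{secs} s, {mb_s:.2f} MB/s, {mbit_s:.1f} Mbit/s")
```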
 