Resolved
Connectivity failures (Aruba Support Advisory ARUBA-SA-20210901-PLVL04)
- Description: Clients have association failures.
This case morphed into the Linux client issue: Linux clients would occasionally stop passing traffic. The device would still be associated, but it could not even ping the UAC. It was mostly observed on Intel AX200 and AX210 cards, but has also been seen on Intel's AC cards and the MediaTek MT7921K. The problem looked like a driver or kernel issue, but its disappearance correlates more closely with the upgrade to ArubaOS 8.10.
- Symptoms:
- Clients experience association failures during high bursts of client roaming events.
- High CPU utilization by the Station Management process (stm) in the MDs.
- show papi kernel-socket-stats | include 8345,8222,8419,Drops: the Drops value on port 8419 (STM Low Priority) rapidly increases in 100+ increments within seconds, AND CurRxQLen and Drops on port 8345 (STM) show sustained large values.
- show cpuload current: the stm process stays over 100%.
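The counter pattern in the symptoms above can be checked mechanically once the values are in hand. The following is a minimal sketch, not a tested tool: it assumes the counters have already been pulled out of the CLI output into a small `SocketStats` container (a hypothetical name), that STM is port 8345 as in the include filter, and that "large" means the placeholder thresholds shown, since these notes give no numbers.

```python
from dataclasses import dataclass

STM_PORT = 8345            # STM, per the include filter in the symptom above
STM_LOW_PRIO_PORT = 8419   # STM Low Priority


@dataclass
class SocketStats:
    """One already-parsed counter row (hypothetical container, not CLI output)."""
    cur_rx_q_len: int
    drops: int


def looks_like_stm_overload(samples, min_drop_jump=100,
                            large_queue=1000, large_drops=1000):
    """Return True if consecutive samples match the symptom pattern.

    samples: list of {port: SocketStats} dicts taken a few seconds apart.
    large_queue and large_drops are placeholder thresholds (assumptions).
    """
    if len(samples) < 2:
        return False
    low_prio = [s[STM_LOW_PRIO_PORT].drops for s in samples]
    # Drops on 8419 must grow by 100+ between every pair of samples.
    spiking = all(b - a >= min_drop_jump
                  for a, b in zip(low_prio, low_prio[1:]))
    # CurRxQLen and Drops on the STM port must stay large in every sample.
    sustained = all(
        s[STM_PORT].cur_rx_q_len >= large_queue
        and s[STM_PORT].drops >= large_drops
        for s in samples
    )
    return spiking and sustained
```

Both conditions must hold at once, mirroring the AND in the symptom description; a single sample is never enough to call it.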
- TAC cases:
- Notable versions:
- 8.7.1.4: observed
- 8.7.1.5: observed
- 8.7.1.6: Sanjay claims a fix
- 8.7.1.6: observed
- 8.10.0.6: presumed fixed
- Debug: Logs requested by Rodger. Make sure user debug is enabled:
logging user-debug <client-mac> level debug
Currently enabled for waldrep's laptop (46:96:f1:03:32:98).
no paging
show cli-timestamp
show clock
show ap association client-mac <client-mac>
show station-table | include <client-mac>
show auth-tracebuf mac <client-mac>
show ap client trail-info <client-mac>
show datapath session table | include <ip address of client>
show log user-debug 50 | include <client-mac>
show log security 50 | include <client-mac>
show log system 50 | include <Affected_AP_Name>
tar log tech-support
Collect the following at the time of the issue, along with tech support logs:
clock
cli-timestamp
show dot1x watermark history
show papi kernel-socket-stats
show ap debug client-mgmt-counters
show ap debug sta-msg-stats
show ap debug cluster-counters
show ap debug gsm-counters
show ap debug client-deauth-reason-counters
show cpuload current
show datapath bwm table
show datapath utilization
show datapath papi counters
show datapath debug opcode
show datapath network ingress
show datapath maintenance counters
show datapath debug dma counters
show datapath message-queue counters
show auth-tracebuf
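Since the per-client commands above differ only in the MAC, IP, and AP name, the list can be generated rather than hand-edited. This is a rough sketch: the function name and template list are illustrative, the IP and AP name in the test values are made up, and it only produces strings to paste into a session; it does not talk to a controller.

```python
# Per-client commands from the list above, with placeholders for the
# three values that change from case to case.
PER_CLIENT_COMMANDS = [
    "show ap association client-mac {mac}",
    "show station-table | include {mac}",
    "show auth-tracebuf mac {mac}",
    "show ap client trail-info {mac}",
    "show datapath session table | include {ip}",
    "show log user-debug 50 | include {mac}",
    "show log security 50 | include {mac}",
    "show log system 50 | include {ap_name}",
]


def build_client_debug_commands(mac: str, ip: str, ap_name: str) -> list[str]:
    """Fill in the templates for one affected client (hypothetical helper)."""
    return [
        template.format(mac=mac, ip=ip, ap_name=ap_name)
        for template in PER_CLIENT_COMMANDS
    ]
```

Keeping the templates in one list means a command added by TAC later only needs to be added in one place.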
Kernel panics
- Description: MD crashes with a kernel panic
- Symptoms
- MD reboots
- Kernel panic
- TAC asked for kernel core dumps. This option has been enabled for a while, but doesn't seem to be giving what they are asking for.
- Intent:cause:registers:
12:86:b0:2
12:86:b0:4
12:86:e0:2
12:86:e0:4
12:86:e0:8
78:86:50:2 (logs lost)
- Bug IDs
- AOS-216744
- TAC cases:
5353024418, 5357725459, 5358877836 (12:86:b0:4)
- JIRA tasks:
- Notable versions:
- 8.5.0.11: observed 12:86:e0:2
- 8.7.1.3: TAC asserts fixed 12:86:e0:2
- 8.7.1.4: observed 12:86:e0:2
- 8.7.1.5: TAC asserts fixed 12:86:e0:2, 12:86:e0:4, 12:86:b0:4; observed 12:86:b0:2, 12:86:b0:4, 12:86:e0:8
- 8.7.1.5_81619: observed 12:86:b0:4
- 8.7.1.6: TAC asserts fixed 12:86:b0:2
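The version history above is easier to reason about when the "TAC asserts fixed" claims are cross-checked against later observations. The sketch below encodes the entries from this section as two dictionaries and lists every code that was seen in the same or a later version than its asserted fix. The version ordering is an assumption: a build suffix such as `_81619` is treated as newer than the plain release.

```python
def version_key(version: str) -> tuple[int, ...]:
    """Sortable key for '8.7.1.5' or '8.7.1.5_81619' (no suffix = build 0)."""
    release, _, build = version.partition("_")
    parts = tuple(int(p) for p in release.split("."))
    return parts + (int(build) if build else 0,)


# Data copied from the notable-versions list in this section.
OBSERVED = {
    "8.5.0.11": ["12:86:e0:2"],
    "8.7.1.4": ["12:86:e0:2"],
    "8.7.1.5": ["12:86:b0:2", "12:86:b0:4", "12:86:e0:8"],
    "8.7.1.5_81619": ["12:86:b0:4"],
}
ASSERTED_FIXED = {
    "8.7.1.3": ["12:86:e0:2"],
    "8.7.1.5": ["12:86:e0:2", "12:86:e0:4", "12:86:b0:4"],
    "8.7.1.6": ["12:86:b0:2"],
}


def observed_after_fix():
    """Return (code, fix_version, observed_version) for each contradiction."""
    hits = []
    for fix_ver, codes in ASSERTED_FIXED.items():
        for code in codes:
            for obs_ver, obs_codes in OBSERVED.items():
                same_or_later = version_key(obs_ver) >= version_key(fix_ver)
                if code in obs_codes and same_or_later:
                    hits.append((code, fix_ver, obs_ver))
    return sorted(hits)


for code, fix_ver, obs_ver in observed_after_fix():
    print(f"{code}: asserted fixed in {fix_ver}, observed in {obs_ver}")
```

With the data as recorded here, this flags 12:86:e0:2 (asserted fixed in 8.7.1.3, seen in 8.7.1.4) and 12:86:b0:4 (asserted fixed in 8.7.1.5, seen in 8.7.1.5 and 8.7.1.5_81619).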
res-md-1 refuses clients
- Description: Any client trying to use res-md-1 as a UAC cannot associate.
- Symptoms:
- show lc-cluster load distribution client shows 0 active and 0 standby clients for res-md-1.
- Started with res-md-1 crashing.
- Persisted across a reboot and code upgrade.
- TAC cases
- Notable version:
- 8.7.1.4: crash that initiated the problem
- 8.7.1.5: observed
Holy amon logs, Batman!
- Description: A debug trace on amon_sender_proc and amon_recvr_proc is logged and cannot be disabled. Collectively, the controllers sent over 20,000 logs/s. The problem only showed up on some boots.
- Bug IDs:
- AOS-210452
- TAC cases:
- Notable versions:
- 8.7.0.0: bug introduced
- 8.7.1.4: fixed
- JIRA task:
No state attribute in RADIUS request
- Description
- The RADIUS request packets do not contain the State attribute value, so clients face connectivity issues.
- Bug IDs
- AOS-207701
- AOS-218006
- Notable versions:
- 8.4.0.0: introduced
- 8.7.1.3: fixed
Too many pending changes
- Description
- If the expected output of show configuration unsaved-nodes was over 1024 characters, then it displayed nothing.
- This also impacted API output.
- Bug IDs
- AOS-210404
- Notable versions:
- 8.5.0.10: observed broken
- 8.5.0.12: fixed
- 8.7.0.3: fixed