An issue to perplex you.

Posted: Wed May 09, 2018 3:08 pm
by mlow
I'm facing a rather perplexing issue since Tuesday morning, when our entire site rebooted due to an assumed power anomaly (during a storm with lightning).

Since then, devices on the switch's (a WS-12-DC) management VLAN have been showing intermittent connectivity: there's between 20 and 30% packet loss, depending on how long a ping is run for. Pings will be coming in normally and then stop at random for 5 to 10 seconds. However, if those devices get reconfigured with an IP on a different VLAN of the switch, the issue goes away. So, I get packet loss pinging the management IP of an access point, but not the management IPs of clients connected to it, as they are on a separate VLAN.
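To put numbers on the loss percentage and the length of each dropout, a ping log can be summarized along these lines. This is only a rough sketch: `summarize_pings` is a hypothetical helper, and it assumes you already have a list of (send time in seconds, reply received) pairs from whatever ping tool you use.

```python
from typing import List, Tuple


def summarize_pings(samples: List[Tuple[float, bool]]) -> Tuple[float, List[float]]:
    """Summarize a ping run.

    samples: (timestamp_seconds, reply_received) pairs in send order.
    Returns (loss_percent, outage_durations_seconds).
    An outage runs from the first lost probe to the next successful reply.
    """
    if not samples:
        return 0.0, []

    lost = sum(1 for _, ok in samples if not ok)
    loss_pct = 100.0 * lost / len(samples)

    outages: List[float] = []
    start = None  # timestamp of the first lost probe in the current outage
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            outages.append(ts - start)
            start = None

    # The run ended while replies were still missing: close the outage
    # at the last probe we sent.
    if start is not None:
        outages.append(samples[-1][0] - start)

    return loss_pct, outages
```

Running the same summary against a device on the management VLAN and one on another VLAN makes the difference easy to compare side by side.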

I've done as much testing as I can to determine whether it's the site's router, the switch, or the radios, and everything is pointing to the switch at this point. The management VLAN houses only the APs, the switch management, and tech laptop ports. It's carried to the site's router as VLAN 1000 on LACP-A (along with every other VLAN).

I know it's not the router because the issue exists between two devices on the same VLAN with direct L2 connectivity.

I *think* it's not the radios because the issue is only present on this single VLAN going to the radios; the others are unaffected. After I reconfigured the management VLAN of a backhaul radio (which was showing absolutely no issues) to the switch's management VLAN, that backhaul radio started showing the same issue.

None of the radios are having issues forming 1G links to the switch, and there are no errors showing on any of the switch ports or the radios.

Weird? :roll: I haven't gone as far as changing the management VLAN's ID to see if the issue is specifically related to the VLAN ID in use; that is something I can wait to test when the switch is out of production. Really stumped how a single VLAN could be affected. I should note that pinging the switch ITSELF results in no issues at all, only with traffic passing through the management VLAN.

We're not in a panic since only the management VLAN is affected, so there's no impact to client traffic. Will be replacing this switch when the weather clears up and will let everyone know what happens...

Re: An issue to perplex you.

Posted: Wed May 09, 2018 4:49 pm
by sirhc
I have no idea what version you are on, so just for shits and giggles please upgrade to v1.5.0rc4 and see what happens.

When you run into a problem, we always ask that you first upgrade to the latest version, as that way we are working with the current code.

Re: An issue to perplex you.

Posted: Fri Jun 29, 2018 8:03 am
by mlow
Hey Chris,
At the time of posting, I was running the latest version (1.4.9). I've just tried updating to 1.5.0 and am still seeing the exact same behaviour.
Not sure what to do next besides replacing the switch.

Re: An issue to perplex you.

Posted: Fri Jun 29, 2018 11:28 am
by sirhc
mlow wrote:
Hey Chris,
At the time of posting, I was running the latest version (1.4.9). I've just tried updating to 1.5.0 and am still seeing the exact same behaviour.
Not sure what to do next besides replacing the switch.


Well, swapping out the switch with a spare would tell you whether the switch is damaged or not.

After you remove it, if the problem goes away, you can bench test it as described in detail here: viewtopic.php?f=6&t=2780#p19221

Did you look at all the interface statistics on the switch/router/devices to see if you're seeing errors on the interfaces?

Did you look at the Status TAB to see if all the current sensors are working properly? (showing proper watts)
Did you look on the Device/Status TAB to see if all telemetry looks correct?

I assume all ports are set to AUTO?

You can also try disabling Flow Control on all ports and see what happens. When you looked at the port statistics, did you see excessive Pause Frames on any ports?
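If you note down the port counters twice, a few minutes apart, comparing the two readings shows which ports are actually accumulating errors or Pause Frames. Here is a minimal sketch of that comparison. The counter names (`rx_errors`, `tx_errors`, `rx_pause`, `tx_pause`) and the helper `ports_with_growth` are made up for illustration and are not the switch's actual field names; you would fill the dictionaries in by hand from the statistics pages.

```python
from typing import Dict, Tuple

# port name -> {counter name -> value}; all names here are hypothetical
Counters = Dict[str, Dict[str, int]]


def ports_with_growth(
    before: Counters,
    after: Counters,
    fields: Tuple[str, ...] = ("rx_errors", "tx_errors", "rx_pause", "tx_pause"),
) -> Counters:
    """Return, per port, only the counters that increased between readings."""
    flagged: Counters = {}
    for port, now in after.items():
        prev = before.get(port, {})
        # Missing counters are treated as zero in either reading.
        grown = {
            f: now.get(f, 0) - prev.get(f, 0)
            for f in fields
            if now.get(f, 0) - prev.get(f, 0) > 0
        }
        if grown:
            flagged[port] = grown
    return flagged
```

A port that shows up here with a fast-growing pause counter would be the first one to try with Flow Control disabled.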

Lots of things you can do to figure this out.