An issue to perplex you.
Posted: Wed May 09, 2018 3:08 pm
I'm facing a rather perplexing issue since Tuesday morning, when our entire site rebooted due to an assumed power anomaly (during a storm with lightning).
Since then, devices on the management VLAN of the switch (a WS-12-DC) have been showing intermittent connectivity, with between 20% and 30% packet loss depending on how long a ping is run. Pings come in normally, then stop at random for 5 to 10 seconds. However, if those devices are reconfigured with an IP on a different VLAN of the switch, the issue goes away. So I get packet loss pinging the management IP of an access point, but not the management IPs of the clients connected to it, as they are on a separate VLAN.
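For anyone who wants to compare numbers on a similar problem, here is a rough sketch of how the loss percentage and the length of each dropout could be tallied from a ping run. It is only an illustration: the function name `summarize_ping_log`, the one-probe-per-second interval, and the sample data are all assumptions, not output from this site.

```python
from itertools import groupby


def summarize_ping_log(replies, interval_s=1.0):
    """Summarize a ping run.

    replies: sequence of truthy/falsy values, one per probe sent
             (truthy = reply received, falsy = probe lost).
    interval_s: assumed time between probes, in seconds.
    """
    total = len(replies)
    if total == 0:
        return {"loss_pct": 0.0, "outages_s": []}

    lost = sum(1 for r in replies if not r)

    # Each run of consecutive lost probes is treated as one outage;
    # its duration is approximated as (probes lost) * (probe interval).
    outages = [
        sum(1 for _ in run) * interval_s
        for ok, run in groupby(replies, key=bool)
        if not ok
    ]

    return {"loss_pct": 100.0 * lost / total, "outages_s": outages}


if __name__ == "__main__":
    # Hypothetical 40-second run: two dropouts of 7 and 3 probes.
    sample = [True] * 20 + [False] * 7 + [True] * 10 + [False] * 3
    print(summarize_ping_log(sample))
```

Running the same tally against a device on the management VLAN and one on another VLAN makes the difference easy to see side by side.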
I've done as much testing as I can to determine whether it's the site's router, the switch, or the radios, and everything is pointing to the switch at this point. The management VLAN houses only the APs, the switch management, and tech laptop ports. It's sent to the site's router over a VLAN 1000 on an LACP-A (along with every other VLAN).
I know it's not the router because the issue exists between two devices on the same VLAN with direct L2 connectivity.
I *think* it's not the radios because the issue is only present on this single VLAN going to the radios - others are unaffected. After I reconfigured the management VLAN of a backhaul radio (which had been showing absolutely no issues) to the switch's management VLAN, that backhaul radio started showing the same issue.
None of the radios are having issues forming 1G links to the switch, and there are no errors showing on any of the switch ports or the radios.
Weird? I haven't gone as far as changing the management VLAN's ID to see if the issue is specifically related to the VLAN ID in use; that is something I can wait to test when the switch is out of production. I'm really stumped as to how a single VLAN could be affected. I should note that pinging the switch ITSELF results in no issues at all; the loss only shows up with traffic passing through the management VLAN.
We're not in a panic since only the management VLAN is affected, so there's no impact to client traffic. Will be replacing this switch when the weather clears up and will let everyone know what happens...