WS-12-250DC breaks after configuration update
Posted: Thu Apr 19, 2018 11:28 am
We've got a WS-12-250DC switch powering a tower.
We regularly update the switch to add a new VLAN. That is the only change we make: add a new entry to the VLAN table, apply, then back up the configuration.
The same issue has now happened twice: one or two minutes after the update, communication with the switch is lost. Here is what we found after a truck roll to the tower:
#1
- The switch appeared fine: lights were on, PoE devices were still powered and had not rebooted, but no traffic was going through the switch. The fan seemed to be running constantly, whereas it usually only runs from time to time. Hard to say if that was really unusual; maybe it was just warmer that day.
- We unplugged power and plugged it back in. The switch rebooted, but this time all port lights stayed off and the PoE devices were no longer powered. Still no admin access to the device.
- We swapped the switch for a spare and loaded the backup configuration, and everything came back online.
- In the lab, we reset the switch. Sadly I don't remember whether I tried the soft reset first, but the switch came back after I reset it while powering it up. I tested it and everything was operating as expected.
On the syslog server side, the last entry from the switch before the outage was "!Reverting to last known good configuration".
I then ran many VLAN configuration updates in succession and the switch never had any issue with that. Could not reproduce.
#2 - happened again on the spare (brand new) switch, running up-to-date 1.4.9 firmware, after several similar updates that went fine
- Same symptoms: PoE still on, fan active, no traffic, no admin access (did not try the console port - we don't have anything to plug into that thing).
- This time I tried to reset it directly instead of unplugging it: I held the reset button for a while with the switch still powered on, but nothing happened. No light show.
- I then unplugged the switch's power line and kept the reset button pressed while turning the power back on. The switch came back online, I applied the backup configuration and everything went back to normal.
Same last syslog entry before the outage: !Reverting to last known good configuration.
We had the auto backup feature enabled this time, and the switch did back up its configuration successfully beforehand (that's the one we put back in place after resetting). The "Reverting to last..." log entry appeared about 60 seconds after the auto backup, which matches the Revert Timer.
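In case it helps anyone check the same timing on their own syslog server, here is a rough Python sketch of how the gap between a backup entry and the following revert entry could be measured. The revert text is the one quoted above; everything else is an assumption: the line layout (an ISO timestamp as the first field), the `BACKUP_MARKER` string, and the sample lines are made up for illustration and will need adjusting to whatever your syslog server actually records.

```python
from datetime import datetime

# Text of the revert entry as quoted in this post.
REVERT_MARKER = "Reverting to last known good configuration"
# Hypothetical: replace with whatever your switch logs on auto backup.
BACKUP_MARKER = "backup"


def revert_gaps(lines, backup_marker=BACKUP_MARKER):
    """Return the seconds between each revert entry and the latest backup entry before it.

    Assumes each line starts with an ISO 8601 timestamp followed by a space.
    Lines that do not parse are skipped.
    """
    gaps = []
    last_backup = None
    for line in lines:
        stamp, _, message = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # not a line in the assumed format
        if REVERT_MARKER in message:
            if last_backup is not None:
                gaps.append((when - last_backup).total_seconds())
        elif backup_marker.lower() in message.lower():
            last_backup = when
    return gaps


# Made-up sample lines, only to show the expected input shape.
sample = [
    "2018-04-19T11:20:00 switch Configuration backup complete",
    "2018-04-19T11:21:00 switch !Reverting to last known good configuration",
]
print(revert_gaps(sample))
```

A gap that consistently equals the configured Revert Timer would suggest the revert is timer-driven rather than triggered by something else.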
In both cases I'm pretty sure the configuration was applied, so the switch should not have reverted it. And even if it had to revert, it should just go back to the previous version and not lock itself up like that...
Has anyone experienced similar behavior? I don't know how to troubleshoot this further and I can't reproduce it in the lab (note that the switch is not under the same PoE and traffic load there). Anything I should pay attention to?