Eric Stern wrote:KBrownConsulting wrote:@Eric
Unless I totally misunderstand what the "Startup" option does, the logs you posted appear to confirm the issue that was reported...
The log you posted shows that after bouncing the port the switch is not waiting the 300 seconds specified as the startup period before pinging the device again. The whole point of the Startup interval (at least my understanding of it) is to give the attached device enough time to reboot after a bounce in order to prevent a boot loop!
Actually the Startup value is only used when the switch first boots, to allow time for the switch to establish full network connectivity before ping monitoring begins.
If the bounced device needs time to reboot etc you need to set the Interval high enough to allow for that. Actually as long as the device comes back up before (Interval * Failures) seconds have passed it should be ok.
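As a rough illustration of that rule, here is a minimal Python sketch of the arithmetic. The function and parameter names are my own and are not taken from the switch firmware or its configuration; it only assumes that a port is bounced once `failures` consecutive pings, spaced `interval` seconds apart, have been missed.

```python
def reboot_budget_seconds(interval: int, failures: int) -> int:
    """Seconds a bounced device has to start answering pings again
    before the watchdog would bounce the port a second time.
    (Hypothetical helper; models the Interval * Failures rule above.)"""
    return interval * failures


def survives_bounce(interval: int, failures: int, reboot_time: int) -> bool:
    """True if the device is back up before the budget runs out."""
    return reboot_time < reboot_budget_seconds(interval, failures)


# A device that needs 45 seconds to reboot:
print(survives_bounce(interval=10, failures=3, reboot_time=45))  # False (budget is 30 s)
print(survives_bounce(interval=30, failures=3, reboot_time=45))  # True  (budget is 90 s)
```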
Ah, Ok... Are you open to feedback on that implementation? (If this is the wrong place for that kind of discussion let me know & I can move this to a separate topic.)
I guess if I'm honest I'm actually curious if you'd be open to considering changing how the Startup option works?
I'll admit that my expectation of how the Startup option worked was based primarily on my experience with several other manufacturers' devices that have some form of ping watchdog feature, which I know is a case of
but it seems like I was not alone in my assumption of how it "should work" as jschroeter (who originally reported the issue earlier in this thread) clearly was under the same impression & based on Julian's comment
here it would seem like he was too.
More importantly, while I know duplicating another manufacturer's implementation simply to be the same is not a good reason to do anything, hopefully you'd agree that if the Startup option were applied both when the switch boots AND when it bounces a port, it would make the Watchdog tool a lot more flexible. (For example, I've had to go back and change my Watchdog configuration on some switches to ensure I don't get boot loops. I have a number of devices that I'd like to test every 10 seconds and reboot after 3 failures, so that they are rebooted within 30 seconds of a problem, but 30 seconds is not long enough for them to finish rebooting! Shoot, even 60 seconds is not enough time for some devices to reboot. Since the devices being pinged are directly attached to the switch, though, 30 seconds is more than enough time to establish that there's a problem: I've never had a device drop multiple pings in less than 30 seconds, much less 60 seconds, unless it's locked up.)
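To put rough numbers on that, here is a small Python sketch of a simplified watchdog timeline. It is only a model under my own assumptions, not how the switch actually implements anything: the device starts out locked up, each bounce takes `reboot_time` seconds to recover from, and `post_bounce_grace` is a made-up name for the proposed "wait after a bounce" behavior.

```python
def count_bounces(interval, failures, reboot_time,
                  post_bounce_grace=0, horizon=600):
    """Count port bounces within `horizon` seconds (simplified model).

    Assumptions (mine, not the firmware's): one ping every `interval`
    seconds, a bounce after `failures` consecutive misses, and the
    device answers pings again `reboot_time` seconds after a bounce.
    """
    bounces = 0
    missed = 0
    device_up_at = float("inf")  # locked up until the first bounce
    t = interval                 # time of the first ping
    while t <= horizon:
        if t >= device_up_at:
            missed = 0           # ping answered
        else:
            missed += 1
            if missed >= failures:
                bounces += 1
                missed = 0
                device_up_at = t + reboot_time
                t += post_bounce_grace  # proposed: pause pings after a bounce
        t += interval
    return bounces


# Device needs 45 s to reboot, checked every 10 s, bounced after 3 misses.
print(count_bounces(10, 3, 45))                         # 20 (boot loop)
print(count_bounces(10, 3, 45, post_bounce_grace=300))  # 1
print(count_bounces(30, 3, 45))                         # 1, but detection takes 90 s
```

In this model the 10-second/3-failure setup only works with a post-bounce wait; without one, the only way out of the loop is to raise the Interval and accept slower detection.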
Thoughts? Would you be willing to consider this? Is it even possible to make the Startup option do what I've proposed? (I'd assume it's technically possible, but... yeah.)
If you got this far, thanks for hearing me out.