Netonix Forums

Posted: **Sat Sep 22, 2018 8:05 am**

A couple of days ago we saw a suspected momentary power outage / brownout / spike across a big part of our network where around a third of our UBNT radios received fresh uptimes. We have 68 Netonix switches on the network and about 10 of them rebooted immediately. Another 30 went offline according to Netonix Manager and PRTG but actually they were still passing traffic albeit with a slow web interface and little or no SNMP.

Using SSH was hopeless but in the end I managed to reboot most of the switches using the web GUI and then they were fine. It took about 4 hours of watching browser tabs doing very little before I could eventually login and click the reboot button. Some switches showed CPU at 100% whilst I was struggling to access them, others failed to do full page loads and there were missing style elements etc.

I had to 'truck roll' to three switches which I couldn't access via web GUI due to constant timeouts and the last one I eventually rebooted remotely this morning after a further three hours of using a browser and having to retry the browser connection many times.
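The manual retrying described above could be partly automated. Below is a minimal sketch, assuming only that the switch's web GUI listens on an ordinary TCP port; the function name `wait_for_port` and its parameters are hypothetical, and nothing here uses a Netonix-specific API. It only reports when a TCP connection finally succeeds. The login and reboot click are still left to the operator.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, delay=1.0, timeout=3.0):
    """Try a TCP connect up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed (timeout, refused, unreachable).
    """
    for attempt in range(1, attempts + 1):
        try:
            # create_connection raises OSError (including timeouts) on failure
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError:
            if attempt < attempts:
                time.sleep(delay)  # pause before the next try
    return None
```

For example, `wait_for_port("192.0.2.10", 443, attempts=120, delay=30)` would keep probing a placeholder address for roughly an hour. A successful connect does not guarantee the GUI will load fully, since the switches in this thread accepted connections while serving pages very slowly.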

This is our model breakdown:
49 x WS-6-MINI
9 x WS-12-250-DC
4 x WS-8-150-DC
4 x WS-8-250-DC
1 x WS-12-250-AC
1 x WS-26-400-AC

Anyway I was just wondering if anything could be done to prevent this type of situation happening in future? I'm pretty sure the problem was power-related rather than network-related (packet storm?) due to the fresh uptimes. The switches are mostly running 1.4.9 but a few are on 1.5.0 and the problem affected them too. Maybe a future firmware release could make the CPU more tolerant to 'blips'? Might it be worth trying some experiments in the lab to test resilience against short power outages?

I'm not complaining, just sharing experience. And if it helps make the Netonix already-great products even better then that's a bonus!

Keep up the good work,

Thanks
Glenn

Posted: **Sat Sep 22, 2018 9:15 am**

With power blips, the problem is that the switch starts to shut down but power is restored before the unit has shut down completely. This leaves the electronics in a random state.

Nothing can be done in software to fix this. The best thing is to have a proper UPS system with a fast enough switching time.

This is not unique to Netonix; all embedded electronics are susceptible to this issue. Our POE switches cannot ride through blips as well as, say, a low-power system, because we are pushing through many watts of power and the onboard capacitors cannot hold that much energy. Compared with, say, a calculator, our capacitors would have to be HUGE, like coffee-can size.
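To put rough numbers on the capacitor argument, here is a minimal sketch of the standard hold-up calculation: the usable energy in a capacitor discharging from `v_start` to `v_min` is ½·C·(v_start² − v_min²), and that energy must cover load power × hold-up time. The function name and all example values (250 W, 20 ms, 48 V sagging to 40 V) are assumptions for illustration, not Netonix design figures, and converter losses are ignored.

```python
def holdup_capacitance_farads(load_watts, holdup_seconds, v_start, v_min):
    """Capacitance needed to carry `load_watts` for `holdup_seconds`
    while the rail sags from v_start to v_min (ideal, lossless case)."""
    if v_min >= v_start:
        raise ValueError("v_min must be below v_start")
    # Usable energy per farad: 0.5 * (v_start^2 - v_min^2)
    usable_joules_per_farad = 0.5 * (v_start ** 2 - v_min ** 2)
    # Energy the load consumes during the blip
    required_joules = load_watts * holdup_seconds
    return required_joules / usable_joules_per_farad


# Assumed example: a 250 W POE load versus a 0.5 W low-power device
poe = holdup_capacitance_farads(250, 0.020, 48.0, 40.0)
small = holdup_capacitance_farads(0.5, 0.020, 48.0, 40.0)
print(f"POE switch: {poe * 1e6:.0f} uF, low-power device: {small * 1e6:.1f} uF")
```

Under these assumptions the 250 W load needs roughly 0.014 F (about 14,000 µF) for a 20 ms blip, 500 times more than the 0.5 W device, and the figure grows linearly with the length of the outage.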

Posted: **Sat Sep 22, 2018 10:44 am**

Thanks, that's very interesting and makes sense. Having a UPS at every location is a little impractical for us so I guess we will just live with the risk of it happening again. It's only happened once so far that I know of!

Posted: **Thu Sep 27, 2018 2:21 am**

I would assume that most of the issues occurred on WS-6-MINIs, since most of your larger switches are DC? If that's the case, are you using Netonix power adapters or Ubiquiti AF adapters to power the switches? We've had far fewer issues with the Netonix adapters.

I'm not sure if it makes logistical sense, but you could use some LPC7-PRO power strips so you can power cycle remotely ;)

Posted: **Thu Sep 27, 2018 2:53 am**

Thanks for the reply LRL. Strangely, the problem affected some DC switches as well as Minis. I can't explain that, since the DC units are connected to a 12V battery which is being trickle charged from a vehicle smart charger. So maybe it was a network problem rather than a power problem, but then why all the fresh UBNT uptimes? I'm puzzled. I guess if it doesn't happen frequently then it's no big deal.

We're using 50V (48VH) airFiber adapters to power the DC switches. This seems a reliable way of doing it on the whole.

A product like the LPC7-PRO would certainly have saved us having to visit some towers! We'd probably need a UK-socket version and also we'd probably need a lot of them! But as the issue has so far been only a one-off, I suppose we'll take no action for now.

Posted: **Thu Sep 27, 2018 8:56 am**

As you describe it, the issue seems more like a bridge loop. The loop could still be the result of power issues, but it is highly unlikely that a power issue would affect DC switches connected to batteries.

A lot of devices do strange things when faced with a bridge loop.

Posted: **Tue Oct 09, 2018 4:16 pm**

LRL wrote: As you describe, the issue seems more like a bridge loop kind of issue. The loop could still be the result of power issues, but highly unlikely a power issue would affect DC switches connected to batteries.

A lot of devices do strange things when faced with a bridge loop.

Thanks, I have come to the conclusion you're right. I don't think the issue was power-related, because it keeps recurring across many different parts of the network and includes battery-powered switches. I've made a post on this thread to hopefully continue the discussion about finding a resolution.

Power blip causes frozen web / ssh / snmp
