Switch reboots - High mem usage

Posted: Wed Oct 24, 2018 3:45 pm
by Kingsley
Hi there,

We have a continuing issue of switches rebooting and it would seem it's because they run out of memory. We run a 'headend' router setup so there is a lot of L2 but it is managed extremely well and it works great except these constant reboots. It is happening on multiple F/W versions. We use STP, we also poll the switches via SNMP with LibreNMS and ping them with ICMP with Smokeping. I have tried disabling the discover options and this hasn't helped.

Could it be a memory leak?

fh-dud-device-status.PNG

fh-dud-ports.PNG

fh-dud-status.PNG

Netonix-mem-graph.png


Regards
Kingsley

Re: Switch reboots - High mem usage

Posted: Wed Oct 24, 2018 4:22 pm
by Banana Jack
This is a great question. I'm getting regular cold (watchdog) reboots too and am still trying to trace the cause. I'm suspecting high memory usage, or too much broadcast traffic (storms/loops) or one being caused by the other, or something else I haven't thought of yet.

Can anyone give any info about the 'Memory Usage' shown on the Status page? Presumably low/medium is good and high (red) is bad? What causes it to rise? What can I do to stop it rising? What are the effects of it being too high? Should I lie awake at night worrying about it?

Glenn

Re: Switch reboots - High mem usage

Posted: Wed Oct 24, 2018 4:24 pm
by sirhc
To confirm, you are disabling the Discovery tab?

How often are you polling the switch with SNMP? I would not poll it more than once every 30 seconds.

How many UI logins are you doing at the same time? Each UI/CLI login increases memory usage.

Go to the console/CLI, either with PuTTY or in the web UI tab, drop to Linux, and run top.

See where the memory is going and post a screen grab.

Re: Switch reboots - High mem usage

Posted: Wed Oct 24, 2018 4:41 pm
by Banana Jack
Hi sirhc, I think you were replying to Kingsley, and I don't mean to hijack the thread, but in case you can help me too, here is a screen grab from my 'top' :

top.PNG


I'm polling with SNMP every 60 secs from PRTG. I'm using Netonix Manager 1.0.10. I'm logging in only once or twice simultaneously with the Web GUI and only on ad-hoc switches when I want to check on them. Every switch on my network is suffering from reboots (I have 69 switches).

Thanks
Glenn

Re: Switch reboots - High mem usage

Posted: Thu Oct 25, 2018 4:05 pm
by Kingsley
As I stated in my first post, discovery is disabled. We very rarely log into the UI at all; all our monitoring is from LibreNMS, but if we have to log in, it is generally only one user at a time. We don't have storms and/or loops.

Here is the TOP output, any ideas?
top.PNG

Re: Switch reboots - High mem usage

Posted: Fri Oct 26, 2018 10:49 am
by Eric Stern
Kingsley wrote:As I stated in my first post, discovery is disabled. We very rarely log into the UI at all; all our monitoring is from LibreNMS, but if we have to log in, it is generally only one user at a time. We don't have storms and/or loops.

Here is the TOP output, any ideas?


The memory usage looks completely normal to me. The usage will increase over time as the OS utilizes free RAM for caching until it reaches some peak. You can see this on your graph as it peaks around week 36 then goes flat. Your reboot doesn't occur until 4 weeks later.

Everything in your screenshots looks fine. I'd suggest swapping in a new switch to see if it's a hardware issue. Maybe also review your grounding, as that is a never-ending source of problems.
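
One way to tell cache growth apart from a real leak is to compare free memory against buffers and cache rather than looking at a single "usage" figure. The following is a minimal sketch, assuming the switch's Linux shell exposes a standard /proc/meminfo (not confirmed anywhere in this thread). The function names and the sample values are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_breakdown(info):
    """Split total memory into free, cache, and everything else (all in kB)."""
    total = info["MemTotal"]
    free = info["MemFree"]
    # Buffers and Cached are reclaimable by the kernel under memory pressure.
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "cache_kb": cache,
        "non_cache_used_kb": total - free - cache,
    }


# Hypothetical sample; on a real device, paste the output of `cat /proc/meminfo`.
SAMPLE = """MemTotal:  126000 kB
MemFree:    20000 kB
Buffers:     6000 kB
Cached:     40000 kB
"""

if __name__ == "__main__":
    print(memory_breakdown(parse_meminfo(SAMPLE)))
```

If non_cache_used_kb stays roughly flat while cache_kb grows, that matches ordinary caching behaviour. If non_cache_used_kb itself keeps climbing, a leak becomes more plausible.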

Re: Switch reboots - High mem usage

Posted: Fri Oct 26, 2018 10:54 am
by Eric Stern
Banana Jack wrote:Hi sirhc, I think you were replying to Kingsley, and I don't mean to hijack the thread, but in case you can help me too, here is a screen grab from my 'top' :

top.PNG


I'm polling with SNMP every 60 secs from PRTG. I'm using Netonix Manager 1.0.10. I'm logging in only once or twice simultaneously with the Web GUI and only on ad-hoc switches when I want to check on them. Every switch on my network is suffering from reboots (I have 69 switches).

Thanks
Glenn


The Discovery process is taking up an awful lot of CPU time here; it really should never be more than about 5%. That would indicate either that you are running a large flat L2 network (which is a bad idea) or that something else is causing excessive discovery traffic on the network. You could try disabling discovery on the switch temporarily to see if the reboots stop.

Re: Switch reboots - High mem usage

Posted: Fri Oct 26, 2018 3:37 pm
by Banana Jack
Eric Stern wrote:The Discovery process is taking up an awful lot of CPU time here; it really should never be more than about 5%. That would indicate either that you are running a large flat L2 network (which is a bad idea) or that something else is causing excessive discovery traffic on the network. You could try disabling discovery on the switch temporarily to see if the reboots stop.


Thanks for getting back to me, Eric. I think you may be onto something here! OK, I admit I'm guilty of running a large flat L2 network... would you call 1200 devices 'large'?! I'm guessing yes :P

I think that memory use seems to creep up by 5-10 MB per day and then when it is around 128 MB, the switch is at risk of rebooting. I've tried to set up memory SNMP charting in PRTG but I don't think it's present in the Netonix MIB. Am I right in thinking it's therefore not possible?
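
As a rough worked example of that creep, the time left before a switch reaches the risky level can be estimated from the current usage and the daily growth. This is only a sketch: it assumes linear growth, takes the 5-10 MB per day and roughly 128 MB figures from the observation above rather than from any documented limit, and uses hypothetical function names.

```python
def days_until_threshold(current_mb, threshold_mb, growth_mb_per_day):
    """Days until usage reaches threshold_mb, assuming linear growth."""
    if growth_mb_per_day <= 0:
        raise ValueError("growth_mb_per_day must be positive")
    remaining = threshold_mb - current_mb
    if remaining <= 0:
        return 0.0  # already at or past the threshold
    return remaining / growth_mb_per_day


def estimate_window(current_mb, threshold_mb=128.0, slow=5.0, fast=10.0):
    """Return (earliest, latest) days to the threshold for the fast and slow growth rates."""
    return (
        days_until_threshold(current_mb, threshold_mb, fast),
        days_until_threshold(current_mb, threshold_mb, slow),
    )


if __name__ == "__main__":
    # A switch currently at 88 MB has 40 MB of headroom before 128 MB.
    print(estimate_window(88.0))  # (4.0, 8.0)
```

If usage plateaus once caching peaks, as suggested earlier in the thread, the linear assumption overstates the risk, so treat the result as a pessimistic bound.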

In the Discovery section of the Netonix configuration screen we normally have checked: Ubiquiti, Cisco and Discovery Tab. Would you recommend unchecking all of them?

The UBNT Discovery tool seems very flaky these days so it's probably no big loss to lose the Discovery feature. Will 'Add Devices' in Netonix Manager stop finding new devices though?

Thanks
Glenn

Re: Switch reboots - High mem usage

Posted: Sun Oct 28, 2018 3:16 am
by Kingsley
Eric, did you even read the original posts? I clearly stated discovery was disabled. I also stated we run a bit of L2, well managed, and I don't need advice on network topology. Suggesting it's my network topology causing the fault is ridiculous. I am a huge advocate for Netonix and have deployed literally hundreds into networks we have designed/built, and we run 50-odd in our own network.

You also state that the memory usage is normal, then state discovery usage is high; which is it?

We have only started encountering this issue recently. Historically we have seen 2 years of uptime on these same switches.

There are about 6 we are having random reboots on, all displaying similar results of maxing memory for a couple of weeks, then rebooting. We do see the same memory usage creeping up on a number of other switches, so I suspect more are going to be rebooting.

Re: Switch reboots - High mem usage

Posted: Sun Oct 28, 2018 8:08 am
by Banana Jack
Kingsley wrote:Eric, did you even read the original posts?


Hi Kingsley - I'm pretty sure Eric's first reply was to you and his second to me. Sorry for causing confusion by jumping in, but I thought our issues were similar so it made sense to keep to the same thread. I'm now thinking that although we're both suffering from reboots possibly caused by high memory usage, the root causes might not be the same.

For me, I'm going to try upgrading to 1.5.1rc7 as suggested by sirhc in the Option to disable Watchdog Reboots thread ("There are some enhancements to prevent some types of watchdog reboots such as from a LOOP (packet storm) in v1.5.1rc7, you should try v1.5.1rc7 if you think your switches are rebooting possibly from a LOOP.").

If that doesn't work then I'll disable CDP discovery and the Discovery Tab on all switches.

If that doesn't work then I'll try downgrading to 1.4.9, since the problem didn't occur on that version, but that may be an unrelated coincidence.

If that doesn't work then I'll disable UBNT Discovery but it would be a shame because it's nice to be able to easily keep track of all Netonix devices on the network.

Glenn