Unexpected switch reboots
- Beanhammer
- Member
- Posts: 10
- Joined: Tue Aug 16, 2016 10:38 am
- Has thanked: 0 time
- Been thanked: 0 time
- craig.moscardini
- Member
- Posts: 14
- Joined: Tue Nov 05, 2019 6:14 am
- Has thanked: 0 time
- Been thanked: 1 time
Re: Unexpected switch reboots
Is there any news on this? I haven't heard from anyone in a while.
- craig.moscardini
- Member
- Posts: 14
- Joined: Tue Nov 05, 2019 6:14 am
- Has thanked: 0 time
- Been thanked: 1 time
Re: Unexpected switch reboots
What's happening with this?
I've tried to be helpful and can provide any diagnostic info you need but I'm not getting any response now either publicly or via PM.
There is a clear memory leak on 1.5.5 related to one of the processes below. We're working through turning off discovery, SNMP, Syslog, etc. to find which one is the cause, but so far have failed. I'm coming to the conclusion that it's irrelevant which services are running.
PID USER  PR NI  VIRT RES SHR S %CPU %MEM     TIME+ COMMAND
857 admin 20  0 92412 84m 984 S  0.3 68.4 113:28.24 RX thread
859 admin 20  0 92412 84m 984 S  0.0 68.4   0:00.87 event thread
860 admin 20  0 92412 84m 984 S  1.6 68.4 216:46.75 discovery threa
862 admin -2  0 92412 84m 984 S  0.0 68.4   0:00.30 ubnt discovery
867 admin 20  0 92412 84m 984 S  0.0 68.4   0:54.63 erps
868 admin 20  0 92412 84m 984 S  0.0 68.4   0:53.54 mstp_thread
881 admin 20  0 92412 84m 984 S  0.0 68.4  56:13.11 vtss_appl
895 admin 20  0 92412 84m 984 S  0.0 68.4  98:14.83 port thread
871 admin 20  0 92412 84m 984 S  0.0 68.4   0:00.03 lacp_thread
880 admin 20  0 92412 84m 984 S  0.0 68.4   0:00.00 vtss_appl
896 admin 20  0 92412 84m 984 S  3.9 68.4   1646:32 status_thread
855 admin -3  0 92412 84m 984 S  0.0 68.4   0:00.05 vtss_appl
848 admin 20  0 92412 84m 984 S  0.0 68.4   0:06.10 vtss_appl
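One thing worth checking in the listing above: every row shows the same VIRT, RES and SHR values, which usually means the rows are threads of a single process sharing one address space, so the 68.4% is counted once rather than per row. The following is a minimal Python sketch of that check. The function names are hypothetical, the sample rows are copied from the listing, and the "same footprint means same process" reading is an assumption, not something confirmed by Netonix.

```python
from collections import defaultdict

# Three rows copied from the top listing in this post (thread view).
TOP_ROWS = """\
857 admin 20 0 92412 84m 984 S 0.3 68.4 113:28.24 RX thread
860 admin 20 0 92412 84m 984 S 1.6 68.4 216:46.75 discovery threa
896 admin 20 0 92412 84m 984 S 3.9 68.4 1646:32 status_thread
"""


def parse_row(line):
    """Split one top row into fields; COMMAND may contain spaces."""
    parts = line.split(None, 11)  # 11 splits -> 12 fields, last is COMMAND
    return {
        "pid": int(parts[0]),
        "virt": parts[4],
        "res": parts[5],
        "shr": parts[6],
        "mem_pct": float(parts[9]),
        "command": parts[11].strip(),
    }


def group_by_footprint(text):
    """Group commands by their (VIRT, RES, SHR) triple.

    Assumption: rows with an identical triple are threads of one process,
    so their memory should be counted once, not summed.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            row = parse_row(line)
            groups[(row["virt"], row["res"], row["shr"])].append(row["command"])
    return dict(groups)


if __name__ == "__main__":
    for footprint, commands in group_by_footprint(TOP_ROWS).items():
        print(footprint, "->", ", ".join(commands))
```

If all rows fall into one group, as they do here, the listing alone cannot say which thread is leaking; it only shows that the process as a whole holds about 84 MB.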
- mike99
- Associate
- Posts: 837
- Joined: Tue Nov 25, 2014 10:53 am
- Location: Quebec, Canada
- Has thanked: 95 times
- Been thanked: 245 times
Re: Unexpected switch reboots
I don't see any proof of a memory leak in your posts. You should install monitoring software to track CPU and memory before claiming a memory issue. Anyway, a memory leak would have a good chance of affecting both AC and DC. This seems more like hardware. Could you post pics of your setup? A diagram of your setup could help.
- craig.moscardini
- Member
- Posts: 14
- Joined: Tue Nov 05, 2019 6:14 am
- Has thanked: 0 time
- Been thanked: 1 time
Re: Unexpected switch reboots
Thanks @mike99, but I've had extensive direct communication with Stephen and he is now ignoring me. I can provide graphs from SNMP data for 100+ switches showing the same issue. We do actually track memory usage, so I do know what I'm talking about. I've provided numerous log extracts and config files. The top output I posted earlier clearly shows 84 MB of memory used by one process. Overall free memory at that time was under 10 MB on a switch with 128 MB. I can assure you 100% that this is a memory issue: at some point close to 0% free memory the switch will reboot itself, or, if you try to log in to the GUI shortly before then, it will hang for 5-10 minutes and then reboot itself.
I'm trying my best to be helpful here but it appears no one at Netonix actually wants our help.
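For anyone wanting to turn SNMP memory graphs like the ones described above into a number, here is a minimal Python sketch that fits a straight line to free-memory samples and estimates how long until free memory reaches zero. The sample values are hypothetical and are not taken from any switch in this thread; the function names are also made up for illustration, and a linear trend is an assumption that a real leak may not follow.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_exhausted(samples, floor_mb=0.0):
    """Estimate hours from the last sample until free memory hits floor_mb.

    samples: list of (hours_since_boot, free_memory_mb) pairs.
    Returns None when free memory is flat or growing (no leak trend).
    """
    xs = [t for t, _ in samples]
    ys = [m for _, m in samples]
    slope, intercept = linear_fit(xs, ys)
    if slope >= 0:
        return None
    t_floor = (floor_mb - intercept) / slope
    return t_floor - xs[-1]


if __name__ == "__main__":
    # Hypothetical samples: free memory dropping 10 MB per day.
    samples = [(0, 60.0), (24, 50.0), (48, 40.0), (72, 30.0)]
    print(hours_until_exhausted(samples))
```

A steadily negative slope across many switches is stronger evidence of a leak than a single snapshot, and the estimate gives a rough idea of when a reboot might be expected.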
- Stephen
- Employee
- Posts: 1033
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 85 times
- Been thanked: 181 times
Re: Unexpected switch reboots
Hi craig,
Just went through my PMs and unfortunately I missed the last one you sent me. I had gotten an influx of PMs and other requests from higher up at the same time, and yours got lost in the void. I apologize.
I still have not been able to replicate this, but after seeing your graph I'm more convinced you're being plagued by the same problem I've been trying to hunt down with Ludvik for some time now; for reference, check here.
If you look at the other thread, which I think covers the same issue, my most recent guess was that SNMP itself is the source of the trouble. It doesn't affect everyone, though, so I think it may have something to do with the interaction between SNMP and the management software being used, or with the specific OID involving memory (there was already a confirmed fix for a bug there that had to do with an incompatibility between net-snmp and our OS on the switch; more about that here).
Something that might help me help you now: what software are you using to monitor SNMP on the affected switches, and how is it configured?
Ludvik, when you see this, that is another detail you could post here that might help verify whether these two issues are one and the same. If they are, having another point of reference might also help isolate whatever has been causing this.
- Ludvik
- Experienced Member
- Posts: 105
- Joined: Tue Nov 08, 2016 1:50 pm
- Has thanked: 15 times
- Been thanked: 15 times
Re: Unexpected switch reboots
I have SNMP turned off on one switch and I monitor the used memory by eye.
viewtopic.php?f=17&t=4788&start=70#p32298
IMHO the trouble is the discovery protocols.
- mike99
- Associate
- Posts: 837
- Joined: Tue Nov 25, 2014 10:53 am
- Location: Quebec, Canada
- Has thanked: 95 times
- Been thanked: 245 times
Re: Unexpected switch reboots
Discovery protocols are a possibility. We disabled them on all switches because of a performance issue. It was not on every site: I remember only one site for certain, with doubts about some others, but we now disable discovery on every site and only enable it when needed, not all the time.
Also, Ubnt discovery is vulnerable to amplification attacks. If an antenna or EdgeRouter has a public IP, it could be related.
- craig.moscardini
- Member
- Posts: 14
- Joined: Tue Nov 05, 2019 6:14 am
- Has thanked: 0 time
- Been thanked: 1 time
Re: Unexpected switch reboots
I've had the chance over the past few days to play with this issue again. It appears that disabling all discovery options resolves the memory issue, but doing that after the issue has started has no effect. We're still monitoring at the moment to ensure this fully prevents the reboot. From what I see, though, more work needs to be done to find out why those discovery options cause the issue, and whether there is a way to prevent it getting so bad that the switch has to reboot itself to recover. Can there not be some detection built in so that the switch kills the process before the reboot happens? Even logging the watchdog reboot as being due to a lack of memory would be a good start.
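The kind of detection asked for above could look roughly like the following Python sketch. It is only an illustration of the idea, not how the Netonix firmware works: it assumes a Linux-style /proc/meminfo text is available, the thresholds are arbitrary (the critical one borrows the under-10 MB figure mentioned earlier in the thread), and the action names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def choose_action(meminfo, warn_kb=20 * 1024, critical_kb=10 * 1024):
    """Pick a (hypothetical) action from the current free memory.

    "restart-service": free memory is below the critical threshold, so the
        leaking service would be restarted and the reason logged before a
        watchdog reboot becomes necessary.
    "log-warning": free memory is low but not yet critical.
    "ok": nothing to do.
    """
    free_kb = meminfo["MemFree"]
    if free_kb < critical_kb:
        return "restart-service"
    if free_kb < warn_kb:
        return "log-warning"
    return "ok"


if __name__ == "__main__":
    # Hypothetical snapshot of a 128 MB switch with about 9 MB free.
    sample = "MemTotal: 131072 kB\nMemFree: 9216 kB\n"
    print(choose_action(parse_meminfo(sample)))
```

Even if restarting the service is not practical, writing the low-memory reason to the log at the warning stage would make a later watchdog reboot much easier to diagnose.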