Well, when the OS is reloaded it has to first remove all processes from memory and reload them just like during a normal boot sequence.
Although we do not control the details of the actual protocols in depth on the switch core, because a lot of it is built directly into its hardware, one of the processes interfaces with the switch core and configures all the protocols to either their default or user-configured values so that it can operate properly.
From my point of view (as a developer) the switch core is essentially a slave device that is totally dependent on the actions of the CPU side of this architecture. Metaphorically, it would be like restarting your computer and seeing the monitor go blank for a second. The monitor itself doesn't have enough intelligence to do anything on its own, so it will just go blue until the computer can talk to it again. More accurately, the switch core is even more dependent than that, as it has no internal programmable memory or processing that can be cross-referenced without involvement from the process I mentioned earlier. The process itself was actually constructed by the manufacturer we use and has been edited by us to mesh more easily with the GUI, so you guys don't have to worry about where every bit in the system is when you're trying to do something like setting up a VLAN. (This is in reality pretty much how all SoC devices work these days, with variations on the architecture.)
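As a rough illustration of the arrangement described above (not the actual process or the manufacturer's code), the CPU-side process can be pictured as layering user-configured values over defaults and writing the result into a switch core that keeps no state of its own across a reload. Every name here (`DEFAULTS`, `SwitchCore`, `configure_core`, and the setting keys and values) is hypothetical.

```python
# Hypothetical sketch: a CPU-side process re-programs a stateless switch core
# after every reload. None of these names or values come from real firmware.

DEFAULTS = {"stp_enabled": True, "igmp_snooping": False, "mtu": 1500}


class SwitchCore:
    """Stand-in for the hardware: it only knows what the CPU writes to it."""

    def __init__(self):
        self.registers = {}  # empty after every reset

    def write(self, key, value):
        self.registers[key] = value


def configure_core(core, user_config):
    """Apply defaults first, then any user-configured overrides."""
    effective = {**DEFAULTS, **user_config}
    for key, value in effective.items():
        core.write(key, value)
    return effective


core = SwitchCore()
effective = configure_core(core, {"mtu": 9000})
print(effective)
```

Until `configure_core` has run, the stand-in core holds nothing at all, which is the point of the monitor analogy.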
Anyway, since it is happening at a semi-regular interval, it seems logical that this is some sort of memory leak that builds up and breaks a buffer or a stack somewhere, which narrows things down a bit. But the trace provided in the crash details of the kernel panic points to a NIC driver as the point of failure, which is built into the underlying OS and not something we've ever touched. The point being that it doesn't really point me in the right direction, so I have to be able to re-create the issue or get a more detailed report in order to trace it back to something I can touch.
It's still possible this is a hardware issue, but I kind of doubt it. Maybe an SFP module that is just slightly short of being compatible or something, but I really think it's probably a leak in the underlying code somewhere.
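For anyone who wants to check the leak theory on a Linux system they have shell access to, a minimal sketch follows. It assumes you can sample a process's resident memory periodically; `read_rss_kb` reads the standard `VmRSS` field from `/proc/<pid>/status`, and `growth_per_hour` fits a straight line to the samples. Both function names and the sample data are made up for illustration, and this is not a tool that ships with the switch.

```python
# Minimal sketch: detect steady memory growth from periodic samples.
# The reader is Linux-only; the sample data below is synthetic.


def read_rss_kb(pid):
    """Return the resident set size of a process in kB, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def growth_per_hour(samples):
    """Least-squares slope of (hours, kB) samples, in kB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


# Synthetic example: 4 kB of growth every hour on top of a 2000 kB baseline.
samples = [(hour, 2000 + 4 * hour) for hour in range(24)]
print(growth_per_hour(samples))
```

A slope that stays positive over days of samples would be consistent with a leak; a flat or sawtooth pattern would point elsewhere.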
Netonix Keeps Rebooting on 1.5.0
-
mhoppes - Associate
- Posts: 664
- Joined: Thu Apr 10, 2014 9:14 pm
- Location: Pennsylvania
- Has thanked: 10 times
- Been thanked: 125 times
Re: Netonix Keeps Rebooting on 1.5.0
Hi Stephen,
Just curious if you've made any headway on this? Oddly enough, our switch has not rebooted since I last captured that core dump for you.
Re: Netonix Keeps Rebooting on 1.5.0
Following thread as I'm seeing a similar issue
- justo
- Member
- Posts: 2
- Joined: Fri Dec 30, 2016 5:41 am
- Has thanked: 0 time
- Been thanked: 0 time
Re: Netonix Keeps Rebooting on 1.5.0
Has there been any movement on this one? It seems with 1.5.2 we are getting the same kernel panic from almost every Netonix switch we have. This seems to happen anywhere between 20 and 180 days of uptime.
Let me know if I can supply any info.
Re: Netonix Keeps Rebooting on 1.5.0
I'm here because we've got a switch now on Netonix 1.5.3rc1 that's rebooting. Seven events in 59 days.
-
mhoppes - Associate
- Posts: 664
- Joined: Thu Apr 10, 2014 9:14 pm
- Location: Pennsylvania
- Has thanked: 10 times
- Been thanked: 125 times
Re: Netonix Keeps Rebooting on 1.5.0
We just had the kernel panic happen again last night on the same core switch. Thankfully it happened at 4, but I am very close to ripping it out of the network and replacing it with a different vendor.
Any progress?
-
sirhc - Employee
- Posts: 7416
- Joined: Tue Apr 08, 2014 3:48 pm
- Location: Lancaster, PA
- Has thanked: 1608 times
- Been thanked: 1325 times
Re: Netonix Keeps Rebooting on 1.5.0
The only thing I can suggest, Matt, is to make the switch somehow accessible to our programmers (Stephen and Eric, and possibly me), making sure to use the Access Control list so that only you and they can access it.
Might be a good idea to implement the Access Control list anyway, as maybe someone is maliciously attacking this unit? Even if it is a non-routable IP, they could be springboarding to it from within your network.
If you have tried everything you can, such as disabling every service possible (STP and such) including SNMP and the Manager, and still see it, I am not sure what else to say. We cannot seem to replicate this and have taken several stabs in the dark with apparently no success.
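To make the access-list idea concrete, here is a small sketch of the check such a list performs: a source address is accepted only if it falls inside one of the permitted networks. It uses Python's standard `ipaddress` module; the `ALLOWED` entries and the `is_allowed` helper are hypothetical examples, not the switch's actual Access Control implementation.

```python
import ipaddress

# Hypothetical allow-list: one management subnet plus one single remote host.
ALLOWED = [
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("192.0.2.10/32"),
]


def is_allowed(source_ip, allowed=ALLOWED):
    """Return True if source_ip falls inside any permitted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in allowed)


print(is_allowed("10.0.0.5"))   # inside the management subnet
print(is_allowed("10.0.1.5"))   # a neighbouring subnet, not listed
```

Note that a host inside an allowed subnet still passes this check, which is why a compromised machine on the internal network could be used as a springboard.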
Support is handled on the Forums not in Emails and PMs.
Before you ask a question use the Search function to see if it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.
-
Stephen - Employee
- Posts: 1033
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 85 times
- Been thanked: 181 times
Re: Netonix Keeps Rebooting on 1.5.0
Matt, it has been difficult to penetrate this issue because a kernel panic can mean almost anything, but there has been progress in other areas that might be related. One of our other clients has an issue with memory growth that seems to be specific to his environment, and it has been partially solved. I can send you this test firmware if you'd like, to see if it makes a difference on your side. Honestly, any changes or information would help right now. I know we've only got so much to work with, though.
And like sirhc said, this could be the work of a hacker. Kernel panics can result from a failed attempt at injecting shellcode, which is used to obtain root on a target system.
Let me know if you'd like me to send you the test firmware.
-
mhoppes - Associate
- Posts: 664
- Joined: Thu Apr 10, 2014 9:14 pm
- Location: Pennsylvania
- Has thanked: 10 times
- Been thanked: 125 times
Re: Netonix Keeps Rebooting on 1.5.0
We are using private IPs and access lists. A kernel panic would only happen if someone is attempting to hijack code that is improperly written or not closing something (e.g. a memory leak). Otherwise it should be handling it gracefully, shouldn't it?
-
Stephen - Employee
- Posts: 1033
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 85 times
- Been thanked: 181 times
Re: Netonix Keeps Rebooting on 1.5.0
Correct, when they hijack code, the next thing they do is inject the shellcode to obtain control. If it fails, for example by jumping to a memory location it's not supposed to, it can result in a panic. But in general, a panic just means that something tried to access a memory space it wasn't supposed to, like a runaway memory buffer. This is why I suggest testing the firmware built for the other client. His issue appears to be related to a memory leak which existed in net-snmp. (I'm still working on this, but it appears the issue in net-snmp is at least partially solved.) It's not unreasonable that they are related, and the difference in throughput on your systems might be why the issue is manifesting differently.