Netonix Keeps Rebooting on 1.5.0
-
mhoppes - Associate
- Posts: 664
- Joined: Thu Apr 10, 2014 9:14 pm
- Location: Pennsylvania
- Has thanked: 10 times
- Been thanked: 125 times
Re: Netonix Keeps Rebooting on 1.5.0
Ok. So the memory exhaustion issue is something other than just a log file growing. I’ll dig further.
-
mhoppes - Associate
Re: Netonix Keeps Rebooting on 1.5.0
GOT IT! After how many months? I finally managed to capture the reboot. Here is what showed on the console. This one is above my pay grade, but it looks like the kernel is going into some kind of panic. Keep in mind this is the second switch installed at this location, so the chance of this being some kind of hardware issue is pretty low.
- Code:
Apr 18 08:27:13 kernel: skb_over_panic: text:c105eee0 len:1908 put:120 head:84e6e000 data:84e6e008 tail:0x84e6e77c end:0x84e6e760 dev:eth0
Apr 18 08:27:13 kernel: skb_over_panic: text:c105eee0 len:1908 put:120 head:84f72000 data:84f72008 tail:0x84f7277c end:0x84f72760 dev:eth0
Apr 18 08:27:13 kernel: Unhandled kernel unaligned access[#1]:
Apr 18 08:27:13 kernel: Cpu 0
Apr 18 08:27:13 kernel: $ 0 : 00000000 00000000 00010000 84e6e760
Apr 18 08:27:13 kernel: $ 4 : 00010101 00010000 80365120 00010000
Apr 18 08:27:13 kernel: $ 8 : 80365120 84fca86c 00000000 00000000
Apr 18 08:27:13 kernel: $12 : 00000000 00000000 00000000 00000000
Apr 18 08:27:13 kernel: $16 : 87e51780 84fca000 00000780 84fca000
Apr 18 08:27:13 kernel: $20 : 00000000 00000000 00000000 000000d0
Apr 18 08:27:13 kernel: $24 : 00000000 2ac425b0
Apr 18 08:27:13 kernel: $28 : 87c2e000 87c2fec8 00000000 80233aac
Apr 18 08:27:13 kernel: Hi : 0009606d
Apr 18 08:27:13 kernel: Lo : abc65700
Apr 18 08:27:13 kernel: epc : 80231954 skb_clone_fraglist+0x28/0x80 Tainted: P
Apr 18 08:27:13 kernel: ra : 80233aac pskb_expand_head+0x13c/0x1e8
Apr 18 08:27:13 kernel: Status: 11008403 KERNEL EXL IE
Apr 18 08:27:13 kernel: Cause : 00800010
Apr 18 08:27:13 kernel: BadVA : 000101a5
Apr 18 08:27:13 kernel: PrId : 02019654 (MIPS 24K)
Apr 18 08:27:13 kernel: Modules linked in: i2c_vcoreiii i2c_dev i2c_core ipt_MASQUERADE iptable_nat nf_nat xt_state nf_conntrack_ipv4 nf_conntrack xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle vtss_wdt vtss_ethdrv(P) vtss_port(P) vtss_
Apr 18 08:27:13 kernel: Process events/0 (pid: 5, threadinfo=87c2e000, task=87c21100, tls=00000000)
Apr 18 08:27:13 kernel: Stack : 00000000 8011c998 87c21128 00000001 87cfdb60 87e51780 ffffffe4 00000001
Apr 18 08:27:13 kernel: 00000000 8024dbfc 87c219a8 8036b888 80119c18 80119c18 80118798 802dcea0
Apr 18 08:27:13 kernel: 87cfdb60 c105e7e4 00000000 00000000 00000000 00000000 00000000 00000000
Apr 18 08:27:13 kernel: 00000000 c105e83c 87d26800 029e76a5 00000000 802dcecc 000000d0 c1065348
Apr 18 08:27:13 kernel: 87c18900 8013060c 87c18908 87c18900 00000000 00000000 87c18908 87c18900
Apr 18 08:27:13 kernel: ...
Apr 18 08:27:13 kernel: Call Trace:
Apr 18 08:27:13 kernel: [<80231954>] skb_clone_fraglist+0x28/0x80
Apr 18 08:27:13 kernel: [<80233aac>] pskb_expand_head+0x13c/0x1e8
Apr 18 08:27:13 kernel: [<8024dbfc>] netlink_broadcast+0xb4/0x58c
Apr 18 08:27:13 kernel: [<c105e83c>] fdma_ccm_init+0x83c/0x105c [vtss_ethdrv]
Apr 18 08:27:13 kernel: Code: 00451024 10400008 00000000 <c08200a4> 24420001 e08200a4 10400fd9 00000000 0808c667
So... words of wisdom? This happens on a regular basis at this site. I've upgraded firmware, swapped switches, turned services on and off, and added rules restricting who can manage the switch. No other switch on my network has this issue, only the ones at this site, so I now submit the kernel information to the powers that be for resolution.
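For anyone reading the trace, here is a rough sketch of two sanity checks on the numbers in the capture above. It is only one reading of the log, not a diagnosis: the helper names `parse_skb_over_panic` and `decode_mips_itype` are made up for illustration, and the inputs are copied from the first `skb_over_panic` line, the `$ 4` register row, and the bracketed word on the `Code:` line.

```python
import re

# First skb_over_panic line from the console capture.
PANIC_LINE = ("skb_over_panic: text:c105eee0 len:1908 put:120 head:84e6e000 "
              "data:84e6e008 tail:0x84e6e77c end:0x84e6e760 dev:eth0")

# Faulting instruction word (the one in <...> on the "Code:" line)
# and the value of register $4 from the register dump.
FAULT_WORD = 0xC08200A4
REG_4 = 0x00010101


def parse_skb_over_panic(line):
    """Pull the numeric fields out of an skb_over_panic line (hypothetical helper)."""
    out = {}
    for key in ("len", "put"):  # printed in decimal
        out[key] = int(re.search(rf"\b{key}:(\d+)", line).group(1))
    for key in ("head", "data", "tail", "end"):  # printed in hex
        match = re.search(rf"\b{key}:(?:0x)?([0-9a-fA-F]+)", line)
        out[key] = int(match.group(1), 16)
    return out


def decode_mips_itype(word):
    """Split a MIPS I-type instruction into (opcode, base reg, target reg, offset)."""
    opcode = word >> 26
    base = (word >> 21) & 0x1F
    target = (word >> 16) & 0x1F
    offset = word & 0xFFFF
    if offset & 0x8000:  # sign-extend the 16-bit immediate
        offset -= 0x10000
    return opcode, base, target, offset


skb = parse_skb_over_panic(PANIC_LINE)
overrun = skb["tail"] - skb["end"]    # bytes written past the end of the buffer
capacity = skb["end"] - skb["head"]   # total buffer size
opcode, base, target, offset = decode_mips_itype(FAULT_WORD)
fault_address = REG_4 + offset        # address the faulting load tried to read

print(f"buffer {capacity} bytes, overrun {overrun} bytes")
print(f"opcode {opcode:#x}, base ${base}, fault address {fault_address:#010x}")
```

With the values from the log, the overrun is 28 bytes past a 1888-byte buffer (the second `skb_over_panic` line gives the same 28), and `tail - data` equals the reported `len` of 1908. The faulting word decodes to opcode 0x30 (`ll`, load-linked) using `$4` as the base with offset 0xa4, which gives 0x000101a5, the same value as `BadVA`. `$4` holding 0x00010101 does not look like a kernel pointer, which would fit memory being overwritten before the crash, but that last part is a guess.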
-
Stephen - Employee
- Posts: 1033
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 85 times
- Been thanked: 181 times
Re: Netonix Keeps Rebooting on 1.5.0
Will start investigating the code now via the stack trace in the kernel panic. Was this on 1.5.2?
-
mhoppes - Associate
Re: Netonix Keeps Rebooting on 1.5.0
This was on 1.4.9 (no trace from that one, but it had the reboots) and now on 1.5.0. I'm not opposed to upgrading to 1.5.2, but only if there's assurance that whatever caused this is actually fixed. Otherwise I'm just upgrading because "it might fix it" and we're left wondering.
Thanks!
-
Stephen - Employee
Re: Netonix Keeps Rebooting on 1.5.0
I've never seen a kernel panic on the system before, so I can't say that 1.5.2 fixes this specifically. There was a memory leak in the discovery feature that 1.5.2 attempts to correct. Honestly, though, this looks like something different. What upgrading would do is this: if you see the panic again on 1.5.2, it would help confirm that we're not chasing ghosts on the new codebase.
Either way I'll see if I can hunt it down.
-
Stephen - Employee
Re: Netonix Keeps Rebooting on 1.5.0
Not yet, working on different ways to see if I can cause the same result.
So far no luck, it's doing its best to stay remarkably stable when I'm watching it.
I have some automated tests running as I type this that will alert me when this particular event occurs, but while that's going on I have to work on other issues on the old switches, Netonix Manager, and the newer switches coming out as well.
Any and all details you can provide are always helpful: logs up to the point of failure (possibly collected from a syslog server in this case), topology details, active services, etc.
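If no syslog server is available at the site, a throwaway receiver is enough to keep the last lines before a reboot. The following is a minimal sketch, assuming the switch can be pointed at a remote syslog host over UDP. The port `5514`, the file name `switch-syslog.log`, and the names `format_record`, `SyslogHandler`, and `main` are placeholders chosen for this example, not Netonix settings.

```python
import socketserver
from datetime import datetime, timezone

LOG_PATH = "switch-syslog.log"  # hypothetical output file
LISTEN = ("0.0.0.0", 5514)      # hypothetical port; standard syslog is UDP 514 (usually needs root)


def format_record(sender_ip, payload, now=None):
    """Build one log line: receive time, sender address, raw syslog text."""
    now = now or datetime.now(timezone.utc)
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{now.isoformat()} {sender_ip} {text}\n"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        # Open, append, and close per message so each line reaches the file
        # even if this receiver is stopped abruptly.
        with open(LOG_PATH, "a", encoding="utf-8") as fh:
            fh.write(format_record(self.client_address[0], data))


def main():
    """Listen until interrupted; call this to start the receiver."""
    with socketserver.UDPServer(LISTEN, SyslogHandler) as server:
        server.serve_forever()
```

Calling `main()` starts the listener. Stamping each line with the receive time on the collecting host makes it easier to line the last messages up against the moment the switch rebooted, even if the switch clock is off.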
-
mhoppes - Associate
Re: Netonix Keeps Rebooting on 1.5.0
That's the problem -- there's literally nothing in the log other than memory going to 100% and the kernel panicking. There are no other errors or other information.
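Since memory climbing to 100% is the only symptom that shows up, one option is to record memory usage at intervals and fit a straight line to it, which gives a rough idea of how long the switch has before it runs out. A minimal sketch, assuming readings are available as (hours since first sample, percent used) pairs collected however is convenient. The function name `estimate_hours_to_full` and the sample numbers are invented for illustration and are not measurements from this switch.

```python
def estimate_hours_to_full(samples):
    """Least-squares line through (hours, used_percent) samples.

    Returns the estimated hours from the last sample until usage reaches
    100%, or None if there are too few samples or usage is not rising.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # flat or falling usage, nothing to extrapolate
    intercept = mean_u - slope * mean_t
    hours_at_full = (100.0 - intercept) / slope
    return max(0.0, hours_at_full - samples[-1][0])


# Invented readings: usage rising 5 points per day.
example = [(0, 40.0), (24, 45.0), (48, 50.0)]
print(estimate_hours_to_full(example))
```

A steady slope would point toward a leak, while a sudden jump shortly before the reboot would point toward a single event such as a burst of traffic. The straight-line fit is only a first approximation either way.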
-
Stephen - Employee
Re: Netonix Keeps Rebooting on 1.5.0
If my tests keep failing to re-create this, I'll see if I can add something that will help log this error better.
-
mhoppes - Associate
Re: Netonix Keeps Rebooting on 1.5.0
Thanks. Keep in mind this kernel panic happens about once every 14 to 60 days. I can't even tell you how to re-create the error, as I have no idea what is causing it.
I thought the switch core ran separately from the software GUI? Why would a kernel panic cause the core to also reboot?