Netonix Keeps Rebooting on 1.5.0

mhoppes
Associate
Posts: 664
Joined: Thu Apr 10, 2014 9:14 pm
Location: Pennsylvania
Has thanked: 10 times
Been thanked: 125 times

Re: Netonix Keeps Rebooting on 1.5.0

Tue Feb 05, 2019 4:26 pm

Ok. So the memory exhaustion issue is something other than just a log file growing. I’ll dig further.

mhoppes

Re: Netonix Keeps Rebooting on 1.5.0

Thu Apr 18, 2019 8:46 am

GOT IT! After how many months? I finally managed to capture the reboot. Here is what showed on the console. This one is above my pay grade, but it looks like the kernel is going into some kind of panic? Keep in mind this is the second switch installed at this location, so the chance of this being some kind of hardware issue is pretty low.

Code:
Apr 18 08:27:13 kernel: skb_over_panic: text:c105eee0 len:1908 put:120 head:84e6e000 data:84e6e008 tail:0x84e6e77c end:0x84e6e760 dev:eth0
Apr 18 08:27:13 kernel: skb_over_panic: text:c105eee0 len:1908 put:120 head:84f72000 data:84f72008 tail:0x84f7277c end:0x84f72760 dev:eth0
Apr 18 08:27:13 kernel: Unhandled kernel unaligned access[#1]:
Apr 18 08:27:13 kernel: Cpu 0
Apr 18 08:27:13 kernel: $ 0   : 00000000 00000000 00010000 84e6e760
Apr 18 08:27:13 kernel: $ 4   : 00010101 00010000 80365120 00010000
Apr 18 08:27:13 kernel: $ 8   : 80365120 84fca86c 00000000 00000000
Apr 18 08:27:13 kernel: $12   : 00000000 00000000 00000000 00000000
Apr 18 08:27:13 kernel: $16   : 87e51780 84fca000 00000780 84fca000
Apr 18 08:27:13 kernel: $20   : 00000000 00000000 00000000 000000d0
Apr 18 08:27:13 kernel: $24   : 00000000 2ac425b0                 
Apr 18 08:27:13 kernel: $28   : 87c2e000 87c2fec8 00000000 80233aac
Apr 18 08:27:13 kernel: Hi    : 0009606d
Apr 18 08:27:13 kernel: Lo    : abc65700
Apr 18 08:27:13 kernel: epc   : 80231954 skb_clone_fraglist+0x28/0x80     Tainted: P         
Apr 18 08:27:13 kernel: ra    : 80233aac pskb_expand_head+0x13c/0x1e8
Apr 18 08:27:13 kernel: Status: 11008403    KERNEL EXL IE
Apr 18 08:27:13 kernel: Cause : 00800010
Apr 18 08:27:13 kernel: BadVA : 000101a5
Apr 18 08:27:13 kernel: PrId  : 02019654 (MIPS 24K)
Apr 18 08:27:13 kernel: Modules linked in: i2c_vcoreiii i2c_dev i2c_core ipt_MASQUERADE iptable_nat nf_nat xt_state nf_conntrack_ipv4 nf_conntrack xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle vtss_wdt vtss_ethdrv(P) vtss_port(P) vtss_
Apr 18 08:27:13 kernel: Process events/0 (pid: 5, threadinfo=87c2e000, task=87c21100, tls=00000000)
Apr 18 08:27:13 kernel: Stack : 00000000 8011c998 87c21128 00000001 87cfdb60 87e51780 ffffffe4 00000001
Apr 18 08:27:13 kernel:         00000000 8024dbfc 87c219a8 8036b888 80119c18 80119c18 80118798 802dcea0
Apr 18 08:27:13 kernel:         87cfdb60 c105e7e4 00000000 00000000 00000000 00000000 00000000 00000000
Apr 18 08:27:13 kernel:         00000000 c105e83c 87d26800 029e76a5 00000000 802dcecc 000000d0 c1065348
Apr 18 08:27:13 kernel:         87c18900 8013060c 87c18908 87c18900 00000000 00000000 87c18908 87c18900
Apr 18 08:27:13 kernel:         ...
Apr 18 08:27:13 kernel: Call Trace:
Apr 18 08:27:13 kernel: [<80231954>] skb_clone_fraglist+0x28/0x80
Apr 18 08:27:13 kernel: [<80233aac>] pskb_expand_head+0x13c/0x1e8
Apr 18 08:27:13 kernel: [<8024dbfc>] netlink_broadcast+0xb4/0x58c
Apr 18 08:27:13 kernel: [<c105e83c>] fdma_ccm_init+0x83c/0x105c [vtss_ethdrv]
Apr 18 08:27:13 kernel: Code: 00451024  10400008  00000000 <c08200a4> 24420001  e08200a4  10400fd9  00000000  0808c667
 




So... words of wisdom? This happens on a regular basis at this site. I've upgraded firmware, swapped switches, and turned services on and off. I've added rules restricting who can manage the switch. No other switch on my network has this issue except at this site, so I now submit the kernel information to the powers that be for resolution.
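For anyone reading the capture, the skb_over_panic line can be checked with a little arithmetic. The sketch below is only an illustration: the function name `parse_skb_over_panic` and the derived keys `capacity` and `overrun` are made up here, and it assumes the usual Linux meaning of the fields (`head`/`end` bound the buffer, `tail` is where the data ends after the `put`).

```python
import re

# First skb_over_panic line, copied from the console capture above.
LINE = ("Apr 18 08:27:13 kernel: skb_over_panic: text:c105eee0 len:1908 put:120 "
        "head:84e6e000 data:84e6e008 tail:0x84e6e77c end:0x84e6e760 dev:eth0")

# key:value pairs as printed by the kernel message.
FIELD_RE = re.compile(r"\b(text|len|put|head|data|tail|end|dev):(\S+)")


def parse_skb_over_panic(line):
    """Parse one skb_over_panic line and derive buffer size and overrun.

    'capacity' and 'overrun' are names invented for this sketch.
    """
    fields = dict(FIELD_RE.findall(line))
    out = {"dev": fields["dev"]}
    for key in ("len", "put"):            # printed in decimal
        out[key] = int(fields[key])
    for key in ("text", "head", "data", "tail", "end"):   # printed in hex
        out[key] = int(fields[key], 16)   # base 16 accepts an optional 0x prefix
    out["capacity"] = out["end"] - out["head"]   # bytes between head and end
    out["overrun"] = out["tail"] - out["end"]    # bytes written past end
    return out


info = parse_skb_over_panic(LINE)
print(f"{info['dev']}: capacity {info['capacity']} bytes, "
      f"len {info['len']}, put {info['put']}, overrun {info['overrun']} bytes")
```

On the captured line this works out to a 1888-byte buffer whose tail ended up 28 bytes past `end` after a 120-byte put, and `data + len` equals `tail`. The `text:` address c105eee0 also sits in the same address range as the vtss_ethdrv frame in the call trace, which suggests, but does not prove, that the put came from that driver.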

Stephen
Employee
Posts: 1033
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 85 times
Been thanked: 181 times

Re: Netonix Keeps Rebooting on 1.5.0

Thu Apr 18, 2019 3:16 pm

Will start investigating the code now via the stack trace in the kernel panic. Was this on 1.5.2?
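As a starting point for reading that trace, here is a rough sketch that decodes the MIPS register dump from the capture. It assumes the standard MIPS32 encodings (ExcCode in Cause bits 6:2, I-type instruction layout); the constant and function names are made up for this example, and the values are copied by hand from the dump.

```python
# Values copied from the register dump in the console capture.
CAUSE = 0x00800010       # "Cause :" line
BADVA = 0x000101A5       # "BadVA :" line
REG_4 = 0x00010101       # first column of the "$ 4" row
FAULT_INSN = 0xC08200A4  # word shown in <...> on the "Code:" line

# Only the two address-error codes are listed here; the real table is longer.
EXC_CODES = {
    4: "AdEL (address error on load or instruction fetch)",
    5: "AdES (address error on store)",
}
OPCODE_LL = 0x30  # MIPS32 "load linked word"


def exc_code(cause):
    """Extract the ExcCode field (bits 6:2) from the Cause register."""
    return (cause >> 2) & 0x1F


def decode_itype(insn):
    """Split an I-type instruction into (opcode, rs, rt, signed offset)."""
    opcode = insn >> 26
    rs = (insn >> 21) & 0x1F
    rt = (insn >> 16) & 0x1F
    imm = insn & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm


code = exc_code(CAUSE)
opcode, rs, rt, offset = decode_itype(FAULT_INSN)
print("exception:", EXC_CODES.get(code, code))
print(f"opcode {opcode:#x}, base ${rs}, target ${rt}, offset {offset:#x}")
print(f"effective address {REG_4 + offset:#010x}, BadVA {BADVA:#010x}")
```

Under those assumptions the faulting instruction is a load-linked through $4 with offset 0xa4, and $4 + 0xa4 matches BadVA exactly. The base value 0x00010101 does not look like a kernel pointer, which would be consistent with a structure having been overwritten by the earlier buffer overrun, though that is a guess from the dump alone.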

mhoppes

Re: Netonix Keeps Rebooting on 1.5.0

Thu Apr 18, 2019 3:17 pm

This was on 1.4.9 (no trace on that, but it had the reboots) and now on 1.5.0. I'm not opposed to upgrading to 1.5.2, but only if there's assurance that whatever caused this is actually fixed. Otherwise I'm just upgrading because "it might fix it" and we're left wondering.

Thanks!

Stephen

Re: Netonix Keeps Rebooting on 1.5.0

Thu Apr 18, 2019 3:24 pm

I've never seen a kernel panic on the system before, so I can't say that 1.5.2 fixes this specifically. There was a memory leak in the discovery feature that 1.5.2 attempts to correct. However, honestly, this looks like something different. What upgrading would do is this: if you see the panic again on 1.5.2, it would help confirm that we're not chasing ghosts on the new codebase.

Either way I'll see if I can hunt it down.

mhoppes

Re: Netonix Keeps Rebooting on 1.5.0

Tue Apr 23, 2019 9:38 am

Hi Stephen,
Any update on this?

Stephen

Re: Netonix Keeps Rebooting on 1.5.0

Tue Apr 23, 2019 10:34 am

Not yet. I'm working on different ways to see if I can cause the same result.
So far no luck; it's doing its best to stay remarkably stable whenever I'm watching it.
I have some automated tests running as I type this that will alert me when this particular event occurs, but while that's going on I also have to work on other issues on the old switches, Netonix Manager, and the newer switches coming out.

Any and all details you can provide are always helpful: logs up to the point of failure (possibly collected from a syslog server in this case), topology details, active services, etc.
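If the switch is already logging to a syslog server, something like the following could pull out the lines leading up to the failure. This is a minimal sketch, not a tool from Netonix: the function name, the `keep` parameter, and the idea of matching on the two marker strings (taken from the console capture earlier in the thread) are all assumptions.

```python
from collections import deque

# Strings that appeared at the start of the captured panic.
PANIC_MARKERS = ("skb_over_panic", "Unhandled kernel unaligned access")


def context_before_first_panic(lines, keep=50):
    """Return (context, trigger) for the first line containing a panic marker.

    context is a list of up to `keep` lines seen just before the trigger.
    trigger is None if no marker was found; context then holds the last
    `keep` lines of the input.
    """
    recent = deque(maxlen=keep)   # ring buffer of the most recent lines
    for raw in lines:
        line = raw.rstrip("\n")
        if any(marker in line for marker in PANIC_MARKERS):
            return list(recent), line
        recent.append(line)
    return list(recent), None
```

It takes any iterable of lines, so an open file object for the collected log can be passed in directly, and the returned context can be pasted into a post along with the trigger line.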

mhoppes

Re: Netonix Keeps Rebooting on 1.5.0

Tue Apr 23, 2019 10:42 am

That's the problem: there's literally nothing in the log other than memory going to 100% and the kernel panicking. There are no other errors or other information.

Stephen

Re: Netonix Keeps Rebooting on 1.5.0

Tue Apr 23, 2019 12:19 pm

If my tests keep failing to re-create this I'll see if I can add in something that will help log this error better.

mhoppes

Re: Netonix Keeps Rebooting on 1.5.0

Tue Apr 23, 2019 12:21 pm

Thanks. Keep in mind this kernel panic happens about once every 14 to 60 days. I can't even tell you how to re-create the error, as I have no idea what is causing it.

I thought the switch core ran separately from the software GUI? Why would a kernel panic cause the core to also reboot?
