Potential bugs in WS3 firmware causing strange behaviour

Stephen
Employee
 
Posts: 1033
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 85 times
Been thanked: 181 times

Re: Potential bugs in WS3 firmware causing strange behaviour

Thu Jun 03, 2021 12:46 pm

Garet wrote: On every boot now the switch locks up for approximately 10 minutes. I call this a lockup because, unlike the normal boot process, the switch consistently errors after the lockup clears. The error is always the previously mentioned PHP error repeated many times, after which the CLI presents as if nothing happened. This first occurred after I rebooted the switch via the RS232 interface.


OK, that is definitely not correct behavior. The boot cycle does take a few minutes, but not 10, so that's anomalous.

Garet wrote: 1. Switch arrives from RMA
2. Switch boots normally, all ports negotiate 1G, unable to access management console while connected directly to the switch with an adapter that has a static IP on the same network as the switch
3. Enabled DHCP, can now connect to the switch from its IP but only on a Windows machine
4. Performed bench test (see previous posts)
5. Noticed switch would not acquire an address via DHCP
5.1. Manually set address to a static IP via RS232. Switch did not take the IP and did not revert to its default static IP
5.2. Rebooted switch from CLI
5.3. Switch sat locked up for several minutes, however character echo back over RS232 was still functional
5.4. Switch finally booted, this was the first time the PHP error occurred
5.5. Switch still did not acquire a DHCP address or revert to default static IP
5.6. Attempted reboot via DEF button (green circle), same issue as 5.3 to 5.4


That's helpful, I will try this order of events and see what happens.

My guess is that around 5.1, when the switch refused to take the static IP change via the CLI, something got corrupted, which is why it failed to revert and had continuous problems afterwards. Like you say, it might have been something like a set mutex waiting for something else to finish that prevented the entered IP value from making it to the config file, which may have nulled the IP in the config and caused this havoc. However, if that's correct, then the defaulting process is still a separate problem, as it should have fixed it.

Garet wrote:...I hope it's clear why I would have to be extraordinarily unlucky to corrupt something...


Regardless, this has not happened to others as far as I know. Please don't forget that 99% of the operation on the switch is done automatically. From the moment it is plugged in, booted, and made aware of your specific network, there are dozens of processes already working and making decisions about what action the switch should take to pass information as optimally as possible. Any one of these processes could have locked the config via a mutex, spinlock, or (most likely) semaphore, and caused the corruption to occur when the IP was manually entered at step 5.1.

Right now, my guess is that it was the dhcpcd process. It was having trouble getting an IP address for a presently unknown reason, which may have locked the entries where IP addresses are supposed to go, and the manual attempt made while they were locked caused the issue.
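To illustrate the kind of failure being guessed at here, below is a minimal Python sketch of a config update that either fully succeeds or leaves the old value untouched when the lock is busy. This is not the firmware's code: `config_lock` and `set_static_ip` are hypothetical names, and only the `IPv4_Address` field name is taken from the config excerpt in this thread.

```python
import json
import os
import tempfile
import threading

# Hypothetical lock guarding the config file (stands in for whatever
# mutex/semaphore the real firmware might use).
config_lock = threading.Lock()

def set_static_ip(path, new_ip, timeout=2.0):
    """Write new_ip into the JSON config at path.

    Returns False, without modifying the file, if the lock cannot be
    acquired within `timeout` seconds.
    """
    if not config_lock.acquire(timeout=timeout):
        return False  # lock busy: report failure instead of writing a partial value
    try:
        with open(path) as f:
            config = json.load(f)
        config["IPv4_Address"] = new_ip
        # Write to a temp file in the same directory, then swap it in atomically,
        # so a reader never sees a half-written or empty config.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
        with os.fdopen(fd, "w") as f:
            json.dump(config, f)
        os.replace(tmp_path, path)
        return True
    finally:
        config_lock.release()
```

The point of the sketch is that a caller who gets `False` back knows the change was not applied, rather than ending up with a nulled address in the file.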

Another point is that I've done this almost the same way you have many times and never seen something like this happen before. Hence I think it must be a scenario like the one above, where dhcpcd or a similar process is conflicting with the entries.

Either way I'm looking into it, we are pretty backlogged right now but I'd like to get to the bottom of this one.

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Thu Jun 03, 2021 1:36 pm

Thank you for all the input Stephen. I eagerly await your test results. For the moment I'm going to leave the switch as it is in case you need any logs I have not already provided.

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Tue Jun 08, 2021 3:09 pm

Update on this. I decided to start reconfiguring the switch (i.e. factory default and reconfigure) and I managed to recreate my issue. During configuration everything was normal up until the point I changed the DNS, default gateway and static IP settings. After saving the config I lost access to the management interface (the IP was changed from 192.168.1.20 to 192.168.1.24). I dropped into the CLI via RS232 and ran ipaddr; the switch reported an IP of 0.0.0.0 (so no address). I then rebooted via the CLI (reload cold), and the switch once again locked up for around 10 minutes before displaying the same PHP error I originally saw.

It appears there is a bug in the firmware that prevents the switch from changing its static IP when other configuration changes have been applied in the past. This would be in line with what I saw in the logs sent to Stephen. Judging by the switch behaviour, there does not appear to be error handling for this case.
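As a rough illustration of the error handling meant here, the Python sketch below rejects an unusable configured address (such as the 0.0.0.0 reported by ipaddr) and falls back to the stock static IP. It is only a sketch under assumptions: `effective_ip` and `DEFAULT_IP` are hypothetical names, and nothing in this thread confirms how the firmware actually validates the address.

```python
import ipaddress

# Stock static address mentioned earlier in this thread.
DEFAULT_IP = "192.168.1.20"

def effective_ip(configured):
    """Return `configured` if it is a usable IPv4 host address, else DEFAULT_IP."""
    try:
        addr = ipaddress.IPv4Address(configured)
    except ValueError:
        # Covers empty strings, None, and malformed addresses.
        return DEFAULT_IP
    if addr.is_unspecified or addr.is_multicast or addr.is_loopback:
        # 0.0.0.0 and similar values cannot be used as a management address.
        return DEFAULT_IP
    return str(addr)
```

With a check like this, a nulled or missing address would lead to the default being used rather than the switch coming up with no address at all.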

While I continue trying to determine the root of the problem, here are my scratch notes on what was done after the steps described above. Points are in chronological order.

- Rebooted from CLI (reload cold)
- Switch booted fine
- Restored config backup (note: GUI presented but CLI was unavailable (i.e. still booting?); scratch this, switch rebooted)
- PHP issue returned following restore of config via GUI (config file upload). Lost access to management console and rebooted via CLI (PHP error after reboot)
- Factory defaulted (cold boot holding DEF button)
- Confirmed GUI was accessible
- rebooted via CLI (reload cold)
- Switch rebooted GUI accessible
- Rebooted again
- Switch rebooted GUI accessible
- changed static IP to 192.168.1.24
- successful
- Restored config
- Switch shows message "Device Address Changed"
- No GUI access
- Switch rebooted and locked up

Please note that the config file uploaded did not include any changes to DNS or IP settings (i.e IPv4 static is the stock static).

Stephen
Employee
 
Posts: 1033
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 85 times
Been thanked: 181 times

Re: Potential bugs in WS3 firmware causing strange behaviour

Tue Jun 08, 2021 6:08 pm

Noted. I've actually had some success seeing errors similar to the ones you were describing by making some changes to my network and performing the steps from your previous post. It looks to me like it could be similar to a problem we have seen in the past, where changing the switch name would wreak havoc much like what we're seeing now. That being said, it is now definitely on my list of things to fix. Just as an FYI, I currently have 3 or 4 major items I'm working on (including a fix for the cable diagnostics) for software to be released soon. My goal is to have it out by this week, but we have more items showing up on the forums, so I may be pushed back a bit longer. Regardless of what comes in after that, I don't want to wait more than 2 weeks to release the next firmware, so whatever bugs/features I get in by then will be out, and I do intend to have this issue addressed in it.

Please keep submitting information if you find out more about the problem, though.

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Wed Jun 09, 2021 10:01 am

Hi Stephen,

Your point about changing the switch name interested me, so I did some more tests. Once again here are my scratch notes; points are in chronological order. The config file used is a backup of the switch config that I manually set up during the previous set of tests. I can PM it to you if needed (I have the ncb2 file this time).

1. Factory Defaulted

2. Changed "switch name" in backup config to "Netonix Switch"

3. Restored modified backup config
3.1 Switch rebooted, settings did not change

4. Restored modified backup config again
4.1 Switch rebooted, settings did not change

5. Restored unmodified backup config
5.1 Switch locked up, after about 10 minutes it had the same PHP error

The only difference between the 'modified' and 'unmodified' config files is the "Switch_Name" field: in the modified file it is "Switch_Name": "Netonix Switch", and in the unmodified file it is "Switch_Name": "HT-WS3-14". You may have found part of the problem.

It's worth noting specifically that when restoring the modified config file, although the switch booted properly, it did not restore any of the settings in the config file.
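For anyone wanting to double-check that two config.json files really differ in only one field, here is a small Python sketch. The two dictionaries are trimmed-down stand-ins using field names and values quoted in this thread, not the full configs, and `diff_configs` is a hypothetical helper name.

```python
import json

def diff_configs(a, b, prefix=""):
    """Return the dotted key paths whose values differ between two config dicts."""
    changed = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        if key not in a or key not in b:
            changed.append(path)  # key present in only one file
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            # Recurse into nested sections.
            changed.extend(diff_configs(a[key], b[key], prefix=path + "."))
        elif a[key] != b[key]:
            changed.append(path)
    return changed

unmodified = json.loads('{"Config_Version": "42", "Switch_Name": "HT-WS3-14"}')
modified = json.loads('{"Config_Version": "42", "Switch_Name": "Netonix Switch"}')
print(diff_configs(unmodified, modified))  # prints ['Switch_Name']
```

Running the same function on the two real files (after loading each with `json.load`) would list every field that differs, which rules out accidental edits elsewhere in the file.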

Stephen
Employee
 
Posts: 1033
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 85 times
Been thanked: 181 times

Re: Potential bugs in WS3 firmware causing strange behaviour

Wed Jun 09, 2021 11:55 am

Yeah, I was also sorta figuring it could be the same or a similar problem rearing its head again. Thanks for the further notes.

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Tue Jun 15, 2021 10:37 am

Another update/question. This WS3 has so far never succeeded in restoring a config file; all the config restore seems to do is reboot the switch. Is it normal for the WS3 config file to have a bunch of garbage characters when viewed in Notepad?

Excerpt from the config:

line vty 14
no auth
!
line vty 15
no auth
!
!
end
switch/startup-config-created 0000644 0000000 0000000 00000000000 14037344307 015452 0 ustar admin root switch/random-seed 0001000 0000000 0000000 00000001000 14037344272 013263 0 ustar admin root ñâ‰
4Ëi™’jb×ÉD=;üýÖÙk= ’zP’ý»¶”끴Ìhy=œµ¦D;ÇQ7×¼ö‹4ƒvsŽ¬¨ˆ‰ÅRzç8Oÿ¥±út®›ÜÐï¼^-|êaÕã"É¿(×H•MBO _ìQÖ_z÷¹‡Mf]Ä”`SaÐË:¸ú8¤Ä+M2Y°JÍ^ƒñnÚw¾Öµ þÝ°;üFœ÷S4~‚¸é9¹ •Sc˜é(»Ì3
ø—/‰&EL'U#h Ò¤›^Š
¶$`ÚæÏ°­ý´Ë_\d°}ì­ÆRæDr‡m êиXn…ìÏ«*0M‡`ýçéµ’æÈbsÔ®´/t1ñB /x8«w©=q.C®«
••‹õ U!+kw4äëXÔFH<çócšÔ€B#SãO²]‚‡ ÇJ+ ƺû’ÿf$ ¿é»‰7›µáv¢]®´—¾&BGèI:yÛ‘ÕɦIÅ4óÖº:k·Ãª&bÎ)QZ/£&»KÀ¯œÆñj C qak‡Ÿœ§¥%nR¨Ý g™½9ù|¦Nö–qZ´A2bYÝk¡í8Ò ‘ûBúÚI¾:‡=7õ(Û+b™å ñqZb£Ø]À|ôñuö} iLÍ/»bÄswitch/config.json 0000644 0000000 0000041 00000074103 14037344371 014112 0 ustar admin www-data {
"Config_Version": "42",
"IPv4_Mode": "Static",
"IPv4_Address": "192.168.1.20",


I had assumed this was some binary encoded data between two sections of config.
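The readable `ustar` strings and the `switch/...` paths in the excerpt look like tar member headers, which would explain binary data (for example the `switch/random-seed` member) sitting between the text sections. Below is a minimal Python sketch for inspecting such a file, assuming the backup is a plain tar archive, which is not confirmed anywhere in this thread; `read_config_json` is a hypothetical helper name.

```python
import io
import json
import tarfile

def read_config_json(archive_bytes, member="switch/config.json"):
    """List the archive members and return (names, parsed JSON of `member`).

    Assumes `archive_bytes` is a tar archive; tarfile raises an error otherwise.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        names = tar.getnames()
        extracted = tar.extractfile(member)  # raises KeyError if the member is missing
        if extracted is None:
            raise ValueError(f"{member} is not a regular file in the archive")
        return names, json.loads(extracted.read().decode("utf-8"))
```

To try it on a real backup, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. If tarfile rejects the file, the assumption about the format is wrong and the sketch does not apply.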

Stephen
Employee
 
Posts: 1033
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 85 times
Been thanked: 181 times

Re: Potential bugs in WS3 firmware causing strange behaviour

Tue Jun 15, 2021 11:40 am

Since there were already issues during the defaulting process, it's not surprising that the config restore is having issues in this case.
The data we see here is actually relatively normal for a config backup, but for me to be certain, could you send me a copy of the full .ncb2 file?

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Tue Jun 15, 2021 4:12 pm

I've PM'd you the config file. We have successfully factory reset the switch (as far as is visible to me) a few times now (holding DEF during boot). Assuming there is no physical damage to the flash memory holding the defaults there should be no issue with config restores unless there is a bug in the firmware, correct?

Garnet
Member
 
Posts: 71
Joined: Wed Jun 02, 2021 9:29 am
Has thanked: 2 times
Been thanked: 1 time

Re: Potential bugs in WS3 firmware causing strange behaviour

Wed Jun 30, 2021 2:51 pm

Can I get an update on this? We've been holding off on network upgrades for weeks now because the firmware is unstable.
