The 8/15 issue, take 2
Posted: Thu Jun 09, 2016 8:35 pm
This post is meant to focus on a similarity among 3 other threads that, as I see it, has not received the attention it deserves. These 3 threads are:
1) http://forum.netonix.com/viewtopic.php?f=17&t=1022#p7949
Admittedly, this thread was flawed from the start because I had first seen the 8/15 issue between two Netonix switches connected by a LAG, and so I thought the LAG was the cause. The link points to a later post where I report on a lab setup in which the 8/15 stream occurred between two Netonix switches connected by a single cable, with no Airfiber or other devices connected.
2) http://forum.netonix.com/viewtopic.php?f=17&t=1390
User swang2002 reports on an issue between "the 8 port AC model with a Mini 6 port". Admittedly, he says it is 15 Mbps (instead of 8), but this could be due to looking at total throughput or to swapping Mbps and Kpps. Other than that, his findings are quite similar to what I had seen, and so I left a comment in his thread. The thread stopped then, but it would be interesting to hear from swang2002 what he did to solve the issue long term.
3) http://forum.netonix.com/viewtopic.php?f=17&t=1654
and http://forum.netonix.com/viewtopic.php?f=17&t=1654&start=10#p12374
TheHox reported problems which, in his first post, didn't specifically point to the 8/15 issue. Actually, his post was about ports flapping. In his second post, he mentioned the 8/15 thing as a second issue he had observed, and he showed in a screenshot what the 8/15 issue is about: a stream of 8 Mbps with 15 Kpps of something. At that point Wistech and Adairw became aware that they had seen the 8/15 issue too, and an Airfiber was mentioned for the first time.
Later, thread #3 zeroed in on the AF as the source of FC (flow control) pause frames. It was confirmed that turning off FC prevents the 8/15 issue, and (therefore) FC storm protection was added to the Netonix firmware.
What I feel bad about is that, while thread #3 was growing, an important observation got lost: The 8/15 issue can happen without any AF, and it can happen between two Netonix switches (as reported in threads #1 and #2). If I remember correctly, TheHox emphasized once more that no AF was present in his setup, but that didn't help.
Ignoring this important fact has been a serious mistake - that's what I think and why I opened this new thread.
Furthermore, it was quite unfortunate that, at about the same time, I uncovered a bug on the AF5X that occurs if it is run in 1/4x SISO modulation: In that condition, the AF5X sends 1.25 million(!) pps of FC pause frames to a Netonix switch. This finding was then taken to be the ultimate proof that the AF is the cause behind the issues discussed in thread #3.
However, the AF5X bug is rather different: It is about 640 Mbps and 1.25 Mpps, not 8 Mbps and 15 Kpps. It also tells us that a Netonix will survive this maximum rate of pause frames (as demonstrated in my video) without becoming inaccessible, while still passing packets to the storm source. This raises the question of why an 8 Mbps, 15 Kpps stream of something (assumed to be pause frames) renders the switch inaccessible and blocks traffic going through it when a true 1.25 Mpps FC storm does not.
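As a quick back-of-the-envelope check on the two rate pairs, dividing the bit rate by the packet rate gives the average frame size. This is only a sketch: the function name is made up for illustration, and I'm assuming the displayed Mbps/Kpps figures are decimal and count the frame itself, without preamble or inter-frame gap.

```python
def avg_frame_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Average frame size in bytes implied by a bit rate and a packet rate."""
    return bits_per_second / packets_per_second / 8


# The 8/15 stream: 8 Mbps at 15 Kpps (rounded figures from the port statistics)
small = avg_frame_bytes(8e6, 15e3)

# The AF5X bug: 640 Mbps at 1.25 Mpps
large = avg_frame_bytes(640e6, 1.25e6)

print(f"8/15 stream: {small:.1f} bytes per frame")
print(f"AF5X storm:  {large:.1f} bytes per frame")
```

Both come out at roughly 64 bytes per frame (15 Kpps of 64-byte frames would be 7.68 Mbps, which rounds to 8). That is the minimum Ethernet frame size and also the size of a pause frame, so the numbers fit the assumption that both streams consist of minimum-size frames and differ only in rate. It does not prove that the 8/15 stream is pause frames.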
There has been no answer to this question. Instead, as it appears to me, FC storm protection has become a self-fulfilling prophecy: It will trigger at 10 Kpps (over 5 seconds) and if it triggers for an AF24 in bad weather, the AF is blamed for abnormal behavior. But are we sure 10 Kpps of FC pause frames are abnormal in that situation?
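To make the threshold concrete, here is a minimal sketch of a detector that triggers at 10 Kpps over 5 seconds. It is not the Netonix implementation, which I have not seen: the class name, the once-per-second sampling, and the reading of "over 5 seconds" as "every one of the last 5 one-second samples is at or above the threshold" are all my assumptions.

```python
from collections import deque


class PauseStormDetector:
    """Hypothetical detector: trips when the pause-frame rate stays at or
    above threshold_pps for window_s consecutive one-second samples."""

    def __init__(self, threshold_pps: int = 10_000, window_s: int = 5):
        self.threshold_pps = threshold_pps
        self.samples = deque(maxlen=window_s)  # keeps only the last window_s samples

    def update(self, pause_frames_last_second: int) -> bool:
        """Record one per-second counter reading; return True if tripped."""
        self.samples.append(pause_frames_last_second)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s >= self.threshold_pps for s in self.samples)


detector = PauseStormDetector()
readings = [15_000] * 5  # five seconds of the 8/15 stream
print([detector.update(r) for r in readings])
```

A detector of this shape only sees a rate. It cannot tell a misbehaving device from one that legitimately sends more than 10 Kpps of pause frames for 5 seconds, which is exactly the question raised above.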
In short, I'd like us to step back and think this through again. We have evidence for the 8/15 issue without an AF. Turning off FC prevents it. A Netonix will survive a 1.25 Mpps FC storm, so how can there be a problem at 10 or 15 Kpps? And are we sure that 10 Kpps of FC pause frames is abnormal behavior?