I wish I was smarter (LAG vs STP)

cwachs
Experienced Member
 
Posts: 115
Joined: Fri Nov 06, 2015 9:04 pm
Location: Colorado
Has thanked: 2 times
Been thanked: 10 times

Wed Apr 05, 2017 8:34 pm

After having a switch go bonkers on me today (self-inflicted, I'm sure), I need to get a better understanding of why. I say this humbly, since it is my own doing.

I have a WS-8-150 DC at the top of our tower powering 3 ePMP APs. It is connected to the bottom of the tower (a WS-12-250 AC) via 2 fiber pairs (for redundancy).

At the top of the tower, the fibers are plugged into SFPs in ports 7 & 8. At the bottom, they are in SFPs in ports 13 & 14.

I thought I had STP set up where if one fiber line failed for whatever reason, the other would take over. Today, I got a bunch of STP errors in my log and suddenly no traffic was being routed up the tower. Again, this is my doing.

I have attached screen shots of LAG and STP of both switches. It's not right but I don't know why. And, should I use LAG over STP for simple double pair redundancy up a tower??

Top of the tower:

Screen Shot 2017-04-05 at 5.50.41 PM.png


Screen Shot 2017-04-05 at 6.31.49 PM.png


Bottom of the tower:

Screen Shot 2017-04-05 at 5.58.52 PM.png


Screen Shot 2017-04-05 at 5.59.02 PM.png


I'm much better at reverse engineering a correct setup...

mike99
Associate
 
Posts: 837
Joined: Tue Nov 25, 2014 10:53 am
Location: Quebec, Canada
Has thanked: 95 times
Been thanked: 245 times

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 10:16 am

LAG = load balancing + failover between 2 devices

STP = redundant links over multiple devices

If it's for failover between 2 devices, I would use LAG. I can't check your config right now since I'm on my phone, but maybe this afternoon if no other answer comes before then.
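
To illustrate the "load balancing + failover" part of a LAG, here is a minimal Python sketch. It assumes flows are spread over member links by hashing the source/destination MAC pair; the CRC32 hash, the `pick_link` helper, and the MAC addresses are all hypothetical, and a real switch uses its own hash.

```python
import zlib

def pick_link(src_mac, dst_mac, active_links):
    """Pick a LAG member port for a flow by hashing its MAC pair.

    Illustrative only: real hardware uses its own hash function.
    """
    if not active_links:
        raise RuntimeError("no active links in the LAG")
    flow_hash = zlib.crc32(f"{src_mac}-{dst_mac}".encode())
    return active_links[flow_hash % len(active_links)]

# Eight made-up flows toward one made-up destination.
flows = [("02:00:00:00:00:%02x" % i, "02:00:00:00:01:00") for i in range(8)]

# Both fibers up: each flow sticks to one of ports 7 and 8.
both_up = [pick_link(s, d, [7, 8]) for s, d in flows]

# Fiber on port 8 down: every flow falls back to port 7.
one_down = [pick_link(s, d, [7]) for s, d in flows]

print("both up :", both_up)
print("one down:", one_down)
```

The point of the sketch is that failover is just the same selection run over a shorter list of active links, so no spanning-tree reconvergence is involved.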

cwachs

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 10:22 am

I have moved that redundant fiber connection to a LAG. Working fine right now. It looks like my problem yesterday started with this:

Apr 5 15:02:53 STP: MSTI0: New root on port 3, root path cost is 40000

Port 3 is an AP and it suddenly became root and right after that, all h*ll broke loose on that switch:

Apr 5 15:08:05 STP: MSTI0: New root on port 3, root path cost is 20000
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 80000
Apr 5 15:08:05 STP: set port 7 to discarding
Apr 5 15:08:05 STP: set port 2 to discarding
Apr 5 15:08:05 STP: set port 3 to discarding
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 20000
Apr 5 15:08:05 STP: MSTI0: New root on port 25, root path cost is 20000
Apr 5 15:08:05 STP: set port 8 to discarding
Apr 5 15:08:05 STP: set port 7 to learning
Apr 5 15:08:05 STP: set port 7 to forwarding
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 120000
Apr 5 15:08:05 STP: set port 8 to learning
Apr 5 15:08:05 STP: set port 8 to forwarding
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 20000
Apr 5 15:08:05 STP: set port 7 to discarding
Apr 5 15:08:05 STP: MSTI0: New root on port 25, root path cost is 20000

This is an 8 port switch so I am guessing port 25 and 26 mean something else. Port 7 and 8 are the fiber lines feeding the switch from the bottom of the tower (from another Netonix). They were not in a LAG yesterday. As soon as I shut down one of those fiber lines from the bottom switch, I regained control of the top switch. Though port 3 (the AP) is still root - which seems not right.
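
For anyone wanting to summarize a log like the one above, here is a rough Python sketch that counts root changes and port state changes per port. The sample lines are copied from this post; the regular expressions only assume the two line shapes shown here, and `summarize` is a hypothetical helper, not a Netonix tool.

```python
import re
from collections import Counter

# Sample lines copied from the log in this post.
LOG = """\
Apr 5 15:08:05 STP: MSTI0: New root on port 3, root path cost is 20000
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 80000
Apr 5 15:08:05 STP: set port 7 to discarding
Apr 5 15:08:05 STP: set port 2 to discarding
Apr 5 15:08:05 STP: set port 3 to discarding
Apr 5 15:08:05 STP: MSTI0: New root on port 26, root path cost is 20000
Apr 5 15:08:05 STP: MSTI0: New root on port 25, root path cost is 20000
Apr 5 15:08:05 STP: set port 8 to discarding
Apr 5 15:08:05 STP: set port 7 to learning
Apr 5 15:08:05 STP: set port 7 to forwarding
"""

ROOT_RE = re.compile(r"STP: MSTI\d+: New root on port (\d+), root path cost is (\d+)")
STATE_RE = re.compile(r"STP: set port (\d+) to (\w+)")

def summarize(log_text):
    """Return (root changes per port, state changes per port)."""
    roots = Counter()
    states = Counter()
    for line in log_text.splitlines():
        match = ROOT_RE.search(line)
        if match:
            roots[int(match.group(1))] += 1
            continue
        match = STATE_RE.search(line)
        if match:
            states[int(match.group(1))] += 1
    return roots, states

roots, states = summarize(LOG)
print("root changes by port:", dict(roots))
print("state changes by port:", dict(states))
```

Several root changes within the same second, as in this sample, is the kind of flapping that points at a loop or a competing root rather than a single link failure.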

cwachs

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 10:50 am

Well, as I dig deeper, at the same exact time of day (15:02), a few of my switches at different points in the network all got a new root... We are a hub and spoke network so the only connection point between these different switches is our NOC.

mike99

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 3:06 pm

25 and 26 are the SFP ports. It could be that the core always uses 26 ports, or maybe it makes programming easier (SFPs are 25 and 26 on every model, so fewer conditions in the code).

Is the AP on port 3 there to connect customers, or is it a backhaul? If it is for customers, disable STP on that port, otherwise a customer device could mess with STP on your network. STP should be enabled only on trusted ports where you control every connected device. Also disable STP on wireless links, like in ubnt radios; it will probably only cause you pain.
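
A small Python sketch of why an untrusted port is a risk: in standard STP the bridge with the lowest bridge ID (priority first, then MAC address) becomes root, so any device that sends BPDUs with a lower ID can take over. The bridge names and MAC addresses below are made up for illustration; 32768 is the 802.1D default priority.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bridge:
    name: str
    priority: int  # 802.1D default is 32768; lower wins
    mac: str       # tie-breaker; lower wins

    @property
    def bridge_id(self):
        # Compare priority first, then the MAC address as an integer.
        return (self.priority, int(self.mac.replace(":", ""), 16))

def elect_root(bridges):
    """Return the bridge with the lowest bridge ID."""
    return min(bridges, key=lambda b: b.bridge_id)

# Hypothetical devices, all left at the default priority.
tower_top = Bridge("tower-top", 32768, "02:00:00:00:10:00")
tower_bottom = Bridge("tower-bottom", 32768, "02:00:00:00:20:00")
customer_cpe = Bridge("customer-cpe", 32768, "02:00:00:00:00:01")

print(elect_root([tower_top, tower_bottom]).name)
print(elect_root([tower_top, tower_bottom, customer_cpe]).name)

# Giving the intended root a lower priority keeps it root.
pinned_bottom = Bridge("tower-bottom", 4096, "02:00:00:00:20:00")
print(elect_root([tower_top, pinned_bottom, customer_cpe]).name)
```

With every bridge at the default priority, the election comes down to MAC addresses, which is why setting an explicit low priority on the switch you want as root is a common precaution.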

cwachs

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 3:35 pm

Port 3 is an AP with customers on the other end of it.

Life just got worse for me. I turned off the POE power to the AP that was on port 3 (which was set as root). I needed to change out the AP due to a bad GPS chip. As soon as I powered down that AP, I lost all access to the switch. It vanished off the network. I have power cycled it 3 times and it does not come back up on the network. The switch at the bottom of the tower sends packets up the fiber but nothing comes back down.

Since this is a DC powered switch at the top of the tower, it is not easy to troubleshoot. Luckily, I never took down my old Ethernet lines that run up the tower, so I got 2 of my 3 APs back online on copper.

Can this be related to STP, or does my switch have something else going on? And I thought that, way back when I watched "WISP Switch the Movie", STP was advised to be on all ports, including customer facing APs...

cwachs

Lost access to a switch - STP related?

Thu Apr 06, 2017 4:26 pm

Moving topic https://forum.netonix.com/viewtopic.php?f=6&t=2656&p=18546#p18546 to this board now that I have a failure and am in need of some support advice.

This is a DC switch on the top of a tower powering 3 ePMP APs. It is connected to a switch at the bottom over 2 fiber paths (in a LAG). We were battling some apparent STP issues yesterday and today. During that time, port 3 (an ePMP AP) became designated as "ROOT" on the switch. We powered down the POE to that AP on port 3. As soon as we did that, we lost all access to the switch.

We have power cycled the switch a couple times. The switch at the bottom shows link for both fiber ports in the LAG and it is sending packets up the fiber but nothing is coming back down. The management IP of the switch is static. Nothing shows up in the MAC table for the fiber ports at the bottom. Switch on the tower is running 1.4.7rc14 and is a WS-8-150-DC.

Question 1: Is there any way to regain control of this switch short of a hard reset?

Question 2: Is this STP related? Did the fact that port 3 got designated as ROOT, and that we then powered down that port, cause this?

Question 3: Should STP be enabled on ports connected to APs (PtMP APs) serving customers?

From what I know about STP, when the switch reboots it should determine path cost and roles again, so rebooting it should shake it free from port 3 thinking it is root, or from the fiber LAG being NDP, which is the state it appears to be in?
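
On the path cost point, a minimal Python sketch of how a root path cost adds up, assuming the switches use the IEEE 802.1D-2004 recommended RSTP port costs (an assumption; the 20000 values in the earlier log are at least consistent with 1 Gb/s links). `root_path_cost` is a hypothetical helper.

```python
# IEEE 802.1D-2004 recommended RSTP port path costs, keyed by link speed in Mb/s.
RSTP_COST = {
    10: 2_000_000,
    100: 200_000,
    1000: 20_000,
    10000: 2_000,
}

def root_path_cost(link_speeds_mbps):
    """Sum the port costs of every hop between this switch and the root."""
    return sum(RSTP_COST[speed] for speed in link_speeds_mbps)

print(root_path_cost([1000]))        # one gigabit hop to the root
print(root_path_cost([1000, 1000]))  # two gigabit hops to the root
print(root_path_cost([1000, 100]))   # a gigabit hop plus a 100 Mb/s hop
```

Under that assumption, a reported cost of 40000 would correspond to a root two gigabit hops away, so a sudden change in the reported cost is a hint about where the new root sits in the network.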

sirhc
Employee
 
Posts: 7416
Joined: Tue Apr 08, 2014 3:48 pm
Location: Lancaster, PA
Has thanked: 1608 times
Been thanked: 1325 times

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 4:53 pm

So first off I am not sure why you did not play with LACP in the LAB before implementing it in service???

Anyway your screenshots are NOT correct

You have the LACP Key set at the TOP correctly but the LACP ports are NOT ENABLED?????
TOP LACP.png


You do NOT have the LACP Key set at the BOTTOM at all and the LACP ports are NOT ENABLED?????
Bottom LACP.png


Might I suggest you read up on LACP and then play with 2 units in a LAB environment before implementing LIVE?
Support is handled on the Forums not in Emails and PMs.
Before you ask a question use the Search function to see it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.

sirhc

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 5:01 pm

Regaining control of the switch should be simple: use ONE of the fibers by unplugging the other.

You may need to power cycle them?

Your main problem occurred because you did not have LACP set up at all.

You only specified the Key on one end and you failed to enable the LACP ports on both switches.

At that point you had a LOOP, which RSTP was trying to deal with. If all your switches have RSTP enabled, things will shift around when you unplug things, and depending on your RSTP settings a new Root may be established.
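
The two checks described above (a key set, and LACP enabled on the member ports at both ends) can be sketched in Python. The dictionary layout and the `lacp_problems` helper are hypothetical, not the switch's actual config format, and the key value 10 is only illustrative. The key is treated as local to each switch, so the sketch only requires the member ports on one switch to share a key.

```python
def lacp_problems(name, ports):
    """List LACP problems for one switch.

    ports maps port number -> {"lacp_enabled": bool, "key": int or None}.
    """
    problems = []
    for port, cfg in sorted(ports.items()):
        if not cfg["lacp_enabled"]:
            problems.append(f"{name} port {port}: LACP not enabled")
        if cfg["key"] is None:
            problems.append(f"{name} port {port}: no LACP key set")
    keys = {cfg["key"] for cfg in ports.values() if cfg["key"] is not None}
    if len(keys) > 1:
        problems.append(f"{name}: LAG ports use different keys {sorted(keys)}")
    return problems

# State as described for the screenshots (values are illustrative).
top = {
    7: {"lacp_enabled": False, "key": 10},
    8: {"lacp_enabled": False, "key": 10},
}
bottom = {
    13: {"lacp_enabled": False, "key": None},
    14: {"lacp_enabled": False, "key": None},
}

for line in lacp_problems("top", top) + lacp_problems("bottom", bottom):
    print(line)

# A corrected bottom switch produces no findings.
bottom_fixed = {
    13: {"lacp_enabled": True, "key": 10},
    14: {"lacp_enabled": True, "key": 10},
}
print(lacp_problems("bottom", bottom_fixed))
```

Until both ends pass checks like these, the two fibers are simply two parallel links between the switches, which is the loop RSTP was reacting to.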

cwachs

Re: I wish I was smarter (LAG vs STP)

Thu Apr 06, 2017 5:12 pm

We did have LACP set up and working, but only after those screen shots were sent. Key was 10 on both ends using an active LACP. Both ports on top and bottom were enabled. We tested it by dropping a fiber and the LACP functioned as it should. That was all after the screen shots, which were taken while we were not using LACP.

About 12 hours after putting the LACP into action, we lost the switch when we powered down a POE port at the top of the tower. That same POE port had been designated ROOT even though it was attached to an AP serving customers. None of the customers below it can have DHCP traffic coming upstream; our radios do not allow that.

We have power cycled the radio a couple times. I have turned off both of the fiber ports separately. We tried turning off the LACP at the bottom and shutting off one of the fiber ports. Nothing gets any packets to return from the top of the tower.

I have TFTP auto backup enabled so I have a copy of the latest working config just before and just after we turned off the POE power at the top switch.
