Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 8:18 am
by adairw
Chris and others,
I have a WS12 in my data center running mid-span for several Exalt G2 links, an AF24, and some Rocket M5s.
Last night during a storm I lost access to the switch: no ping, no SSH, no HTTP(S), but it's up and passing traffic...

I'm going to try and find a console cable this morning and hook up to it when I get to the office.
But I wanted to know if anyone else has had this problem and if there were any suggestions. Hoping maybe a reboot will bring it back. But obviously I'm only going to do that with a spare switch already set up and ready to take its place.

We had some heavy rain last night, laced with lots of lightning. An AF24 (backup link to an Exalt ExploreAir now) went down (lost RF due to rain) 2 minutes after the WS became unresponsive. The AF24 came back but the WS didn't.

This would literally be my first WS problem in 10 months of operation. Hoping it's just a fluke.. :) I have a camera on the roof with all the radios at the DC; it might be interesting to see if there was a big lightning strike nearby or something.
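
The symptom described above (no SSH, no HTTP(S), while the switch keeps passing traffic) can be watched for from a monitoring host. The following is only a minimal sketch using Python's standard library. The port numbers are the usual defaults for SSH, HTTP and HTTPS and are an assumption about how the switch is configured, the function names are hypothetical, and the sketch does not cover ICMP ping.

```python
import socket

# Assumed default ports for the management services named in the post.
MGMT_PORTS = {"ssh": 22, "http": 80, "https": 443}


def probe_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def management_status(host, ports=MGMT_PORTS, timeout=2.0):
    """Probe each management service and return {service_name: reachable}."""
    return {name: probe_tcp(host, port, timeout) for name, port in ports.items()}


def management_down(status):
    """True when none of the probed management services answered."""
    return not any(status.values())
```

For example, `management_down(management_status("192.0.2.10"))` (a placeholder address) returning True while the links through the switch are still up would match the situation in this thread, where only the management path had dropped.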

Re: Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 9:16 am
by sirhc
Hi Adair,

adairw wrote:I'm going to try and find a console cable this morning and hook up to it when I get to the office.

I would try a console cable as you suggested. This is why I keep telling people to have a console cable on hand and tested. It costs us over $5 MSRP to put that console port into the switch, which everyone complained they wanted, yet few people actually use it. Or they wait until an "oh shit" moment and then scramble to find one.

Better to have one in your bag ready to go.

A MUST HAVE:
USB to serial dongle
Null modem DB9 serial cable
PuTTY or a similar terminal emulator

adairw wrote:This would literally be my first WS problem in 10 months of operation. Hoping it's just a fluke.. :)

I would not call losing a switch in a storm a problem with the switch per se, maybe bad luck, as there is not much that is going to protect you from lightning. But most damage is not lightning related; it comes from the heavy rain.

The biggest issue is that the rain gets in around the ground rods and changes the ground potential difference between the tower grounding system and the service grounding system, leaving the Ethernet cable to try to carry the current caused by the difference between them. This is why it is really important to bond the tower ground system to the electrical ground system with heavy gauge wire like #2. I also run #2 up to a ground bus near the antennas and then a #6 to each antenna mount.

Now you may already have it bonded, so this is just academic, for everyone else's benefit.

adairw wrote:Hoping maybe a reboot will bring it back. But obviously I'm only going to do that with a spare switch already set up and ready to take its place.

This is a smart move!!! I am glad to hear you have a SPARE!!! - *COUGH* Mike

adairw wrote:But I wanted to know if anyone else has had this problem and if there were any suggestions.

Losing your switch in a storm is not an issue or problem with the switch, it is damage. But with that said, I have had a switch act funny in a storm in a way that a reboot fixed; in fact this happened to me twice this year. We tracked it back to an issue with a long buried Ethernet cable that feeds the business next door. The Ethernet cable was connecting their electric service ground to ours, and a storm rolling through with heavy rain would cause a ground potential difference. Since bonding the 2 electric services is out of the question, we have decided to replace the cable with fiber, which is scheduled to happen soon.

Good luck and let us know the outcome.

Re: Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 11:27 am
by adairw
Disaster averted.
Management of the switch was over an SFP copper Ethernet module. When I got to the office it wasn't linked up, but everything else on the switch seemed fine.
I removed the module, plugged it into another WS and it linked up on that one. I put the SFP back in the original switch and it linked back up no problem. I'll probably end up replacing the SFP module, but may let it run a while to see if something is wrong or if it was just a fluke.

Re the switch failing during a storm: it's in a 6-story commercial building, the run to the roof is about 150 but it's all in conduit, and the runs to the antennas are short, maybe 10-15 feet. Every antenna has a Transector ALPU up top and a WB-APC-Gige module in the data center. Not your typical tower install by any means. Was really hoping that it would be something like what I found.

Re: Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 11:37 am
by sirhc
You know, Adair, that is an issue I have experienced as well: having to re-seat the SFP module.

It is on my list to talk to Eric about so do not throw that SFP away just yet.

I think this started happening recently with a change to the SFP code to support MikroTik SFPs.

As I said I will talk with Eric regarding this issue.

Re: Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 11:39 am
by adairw
I'll have to go look at the module, but it's not a MikroTik SFP.
I will say, the MikroTik copper SFPs I've used work fine.
But I tried to get a MikroTik single-mode fiber SFP to work the other day and it would not work.

I'm going to take a beating for this, but the switch I tried it in isn't up to date on firmware. I can update and test if that will help anything.

AW

Re: Help! Unresponsive WS-12-250A

Posted: Mon Aug 03, 2015 11:47 am
by sirhc
Yeah, you should upgrade it. How old is the firmware?

What I am saying is that at some point in the recent past I have experienced situations where the SFP module needs to be re-seated. This normally occurs after a storm, so there would have been a power blip. Possibly when the electric flips to battery backup for a second and then back again, the SFP module needs to be pulled out and re-seated to function.

I believe this behavior was after Eric had made some changes to the SFP initialization routine to deal with MT SFP modules because prior to this I never saw it.

Coincidence or not, I do not know. I had also changed the SFP modules during that time to another brand because I was getting errors on some of the old brand. The new GENERIC brand eliminated these errors, but this is also when I started seeing this issue.

Last night you probably had an electric BLIP as well?