David Ramsden

Cisco VSS, domain ID and virtual MAC addresses

The other weekend I connected a L2 circuit between two sites. At both ends were Cisco 6500 Catalyst switches running VSS. The interfaces they connected to were configured as L3 and EIGRP was run between the two sites to share routes. But as soon as they were connected the neighbors started flapping.

Troubleshooting started and as always you start at the lowest OSI layer and work up. Bingo! The issue was at Layer 2 as I could see ARP was incomplete on both sides for the neighbor addresses. Checking the MAC address for the interface the L2 circuit was connected to at site A and the MAC address for the interface the L2 circuit was connected to at site B showed the same MAC. How could this happen?

As mentioned in the first sentence both ends had a Cisco 6500 Catalyst switches running VSS. One of the first things you do when configuring VSS so set the switch virtual domain ID. Cisco recommend that you enable virtual MAC addresses (mac-address use-virtual) under the switch virtual domain. I'll explain why Cisco recommend this option. When when the first switch comes up, VSS uses the MAC address pool from that member and uses that pool across all L3 interfaces. This MAC address pool is maintained by VSS when one (and only one) switch is reloaded. But if the entire VSS is reloaded and the other switch happens to come up first the MAC address pool will change. This shouldn't be a huge deal but if there are any other devices out there that are ignoring gratuitous ARP they will require manual intervention to get them working which will cause further service disruption.

Hence Cisco recommend using mac-address use-virtual under the switch virtual domain ID. This ensures the same MAC address pool is used at all times. No exceptions. But the switch virtual domain ID is significant in determining the virtual MAC address pool. It's used in the formula to calculate this pool. As per the Cisco documentation:

The MAC address range reserved for the VSS is derived from a reserved pool of addresses with the domain ID encoded in the leading 6 bits of the last octet and trailing 2 bits of the previous octet of the mac-address. The last two bits of the first octet is allocated for protocol mac-address which is derived by adding the protocol ID (0 to 3) to the router MAC address.

When I checked both switches I found they both had a switch virtual domain ID of 10. Therefore the virtual MAC address on the L3 interfaces were both 0008.e3ff.fc28. We can use the formula to check this:

6th octet (28) to binary: 00101000
Remove trailing 2 bits: 001010
001010 (bin) to decimal: 10

But what are the options for fixing the problem where the MAC addresses are the same on both sides?

  1. On one side, under the L3 interface use mac-address H.H.H.H
  2. Change the switch virtual domain ID on one VSS - Possible to do but requires a complete outage as a VSS reload is required.
  3. Remove mac-address use-virtual from the switch virtual domain ID - Not recommended as discussed previously.

Option 1 seems like the most viable option but how do you guarantee the MAC address you manually assign is unique? Will there be issues in the future? If we pick an arbitrary number between 1 and 255 (switch virtual domain ?) we can then use the formula to calculate a "safe" MAC address as long as no one in the future connects a VSS with this arbitrary number as the switch virtual domain ID. I decided to choose 99.

Virtual MAC address with domain ID 10: 0008.e3ff.fc28
Last two octets (fc28) hex to binary: 1111110000101000
99 (dec) to binary: 01100011
Insert 01100011 after leading 6 bits and before trailing 2 bits: 1111110110001100
1111110110001100 (bin) to hex: fd8c

Hence if a switch virtual domain ID of 99 was used the virtual MAC address assigned to a L3 interface would be 0008.e3ff.fd8c.

Problem solved. It was just unfortunate that the switch virtual domain ID happened to be the same. No one ever saw the two sites needing to be connected this way. If you're deploying VSS in your organisation, work smart and use unique switch virtual domain IDs everywhere. If you happen to connect to a 3rd party first check if they're using VSS and if they are, check what they have as their switch virtual domain ID. If there's a conflict someone will need to manually set a MAC address on an interface.


Automating Cisco switch swap outs

So you can't automate the entire process unfortunately. You're still going to need to pull a late night and get your hands dirty...

Recently I was tasked with swapping out 4 old Cisco 10/100Mb switches with new 10/100/1000Mb switches. The old switches were a combination of Cisco 3560, 2950 and 3548 series. The old switches also had some old configurations that needed to be updated and the interface configurations weren't consistent. The interfaces also had different VLAN configurations and this was my main concern. What if I made a mistake? It's not very realistic to chase 192 ports and make sure every single one was working as expected.

Going back to a previous blog, should network engineers be programmers, writing a script to speed up the configuration process and eliminate any mistakes was the answer.

The process would be:

  1. Get the existing IOS configuration.
  2. Find the interfaces.
  3. Convert the old 10/100 ("FastEthernet") interfaces to 10/100/100 ("GigabitEthernet") interfaces.
  4. Extract the important parts of the interface configuration (description, speed/duplex if set, VLAN configuration, trunk configuration etc).
  5. Pump out the converted interfaces.
  6. Copy/paste in to a template after manually making sure it looks good.
  7. Upload the configuration to the startup-config of the new switch.
  8. Swap the switch out.

First of all, the existing IOS configurations are stored in SVN. They get here via RANCID. So I had the old configurations easily to hand. Also if anything did happen to go wrong on the night, this served as a good reference point.

My weapon of choice for this scripting task was Perl. The script I wrote took in an existing IOS configuration and extracted the physical interfaces and SVIs, converted them from FastEthernet to GigabitEthernet and grabbed all of the important stuff such as description, speed/duplex, VLAN configuration, trunk configuration, IP configuration it's an SVI etc.

Here it is:

I ran this against each of the 4 existing switch IOS configurations, checked the output quickly and then copied/pasted the interfaces in to my switch template that has everything else set such as Spanning Tree configuration, errdisable recovery, NTP, AAA, SNMP, syslog etc etc. VTP would take care of the VLAN database once the switch was connected to the core.

I connected each of the new switches to my laptop and configured the Vlan1 SVI with a temporary IP, then TFTP'd the switch template to the startup-config along with the IOS version required. Turn the switch off and on again to make sure it looks good and job done on the configuration front. Each switch took a very small amount of time to configure and I could be safe in the knowledge that all the interfaces were correct.

The 4 switches were swapped out in the dead of night. It took 4 hours from start to finish, including testing and monitoring. Out of roughly 192 ports there was only one device which didn't work the next morning and that was due to an auto-negotiation issue.In my book it was a very successful, painless and efficient change. One which I wasn't particularly looking forward to but thanks to a bit of scripting ended up being easy.