David Ramsden – Network engineer, general geek, petrol + drum and bass head
7Feb/15

Cisco VSS, domain ID and virtual MAC addresses

The other weekend I connected an L2 circuit between two sites. At both ends were Cisco Catalyst 6500 switches running VSS. The interfaces the circuit connected to were configured as L3 and EIGRP was run between the two sites to share routes. But as soon as they were connected the neighbors started flapping.

Troubleshooting started and as always you start at the lowest OSI layer and work up. Bingo! The issue was at Layer 2 as I could see ARP was incomplete on both sides for the neighbor addresses. Checking the MAC address of the interface at each end of the L2 circuit showed that site A and site B were using the same MAC. How could this happen?

As mentioned at the start, both ends had Cisco Catalyst 6500 switches running VSS. One of the first things you do when configuring VSS is set the switch virtual domain ID. Cisco recommend that you enable virtual MAC addresses (mac-address use-virtual) under the switch virtual domain. I'll explain why Cisco recommend this option. When the first switch comes up, VSS takes the MAC address pool from that member and uses it across all L3 interfaces. VSS keeps this pool when one (and only one) switch is reloaded. But if the entire VSS is reloaded and the other switch happens to come up first, the MAC address pool will change. This shouldn't be a huge deal, but any devices out there that ignore gratuitous ARP will need manual intervention to get them working again, which causes further service disruption.

Hence Cisco recommend using mac-address use-virtual under the switch virtual domain ID. This ensures the same MAC address pool is used at all times. No exceptions. But the switch virtual domain ID is significant in determining the virtual MAC address pool. It's used in the formula to calculate this pool. As per the Cisco documentation:

The MAC address range reserved for the VSS is derived from a reserved pool of addresses with the domain ID encoded in the leading 6 bits of the last octet and trailing 2 bits of the previous octet of the mac-address. The last two bits of the first octet is allocated for protocol mac-address which is derived by adding the protocol ID (0 to 3) to the router MAC address.

When I checked both switches I found they both had a switch virtual domain ID of 10. Therefore the virtual MAC addresses on the L3 interfaces were both 0008.e3ff.fc28. We can use the formula to check this:

6th octet (28) to binary: 00101000
Remove trailing 2 bits: 001010
001010 (bin) to decimal: 10

But what are the options for fixing the problem where the MAC addresses are the same on both sides?

  1. On one side, under the L3 interface use mac-address H.H.H
  2. Change the switch virtual domain ID on one VSS - Possible to do but requires a complete outage as a VSS reload is required.
  3. Remove mac-address use-virtual from the switch virtual domain ID - Not recommended as discussed previously.

Option 1 seems like the most viable option but how do you guarantee the MAC address you manually assign is unique? Will there be issues in the future? If we pick an arbitrary number between 1 and 255 (the range offered by switch virtual domain ?) we can then use the formula to calculate a "safe" MAC address. It stays safe as long as no one in the future connects a VSS that uses this arbitrary number as its switch virtual domain ID. I decided to choose 99.

Virtual MAC address with domain ID 10: 0008.e3ff.fc28
Last two octets (fc28) hex to binary: 1111110000101000
99 (dec) to binary: 01100011
Insert 01100011 after leading 6 bits and before trailing 2 bits: 1111110110001100
1111110110001100 (bin) to hex: fd8c

Hence if a switch virtual domain ID of 99 was used the virtual MAC address assigned to a L3 interface would be 0008.e3ff.fd8c.
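The bit shuffling above is easy to get wrong by hand, so here is a minimal Python sketch of the same arithmetic. It is only an illustration built from the worked example in this post: the base value (the example MAC with the domain ID bits zeroed) and the function names are assumptions, not anything taken from Cisco.

```python
# Assumed base: the example MAC 0008.e3ff.fc28 with the domain ID bits cleared.
VSS_MAC_BASE = 0x0008E3FFFC00


def vss_virtual_mac(domain_id: int) -> str:
    """Return the virtual MAC (Cisco dotted notation) for a domain ID."""
    if not 1 <= domain_id <= 255:
        raise ValueError("domain ID must be between 1 and 255")
    # The 8-bit domain ID sits directly above the lowest 2 bits of the address.
    value = VSS_MAC_BASE | (domain_id << 2)
    digits = f"{value:012x}"
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def vss_domain_id(mac: str) -> int:
    """Recover the domain ID from a virtual MAC such as 0008.e3ff.fc28."""
    value = int(mac.replace(".", ""), 16)
    # Drop the trailing 2 bits, then keep the next 8 bits.
    return (value >> 2) & 0xFF


print(vss_virtual_mac(10))              # 0008.e3ff.fc28
print(vss_virtual_mac(99))              # 0008.e3ff.fd8c
print(vss_domain_id("0008.e3ff.fc28"))  # 10
```

Running it for a candidate domain ID gives the MAC address to configure by hand, and running the decode on an existing interface MAC shows which domain ID it corresponds to.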

Problem solved. It was just unfortunate that the switch virtual domain ID happened to be the same. No one ever saw the two sites needing to be connected this way. If you're deploying VSS in your organisation, work smart and use unique switch virtual domain IDs everywhere. If you happen to connect to a 3rd party first check if they're using VSS and if they are, check what they have as their switch virtual domain ID. If there's a conflict someone will need to manually set a MAC address on an interface.

25Oct/14

Show me the config!

Back in the days before stacks and VSS, doing a "show run" to look at something was easy. You pressed space a few times to page through the output and found the section you were interested in. Now when you've got a stack of several switches or a switch running VSS with hundreds of interfaces you'll be pressing space forever and a day paging through the config.

In IOS you can pipe output to a selection of commands to filter it, much like piping output in UNIX or even PowerShell.

Say for example you want to see the static routes in the running-config:

As with the above, regular expressions are acceptable too!

This is all well and good for commands such as "ip route" or "ntp" for example but what if you want to check an EIGRP config?

Oh. This isn't what we were expecting. This is because the EIGRP config is a section. Thankfully you can pipe to the "section" command:

Much easier than "sh run" and then paging through the output until you find what you're after.
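If the difference between "include" and "section" isn't obvious, a rough Python emulation of the two filters may help. This is only a sketch of the idea: the sample config is made up, the function names are mine, and the real IOS "section" filter has more rules than the simple "top-level line plus indented lines beneath it" logic assumed here.

```python
import re

# Made-up sample config, for illustration only.
SAMPLE_CONFIG = """\
ip route 0.0.0.0 0.0.0.0 192.0.2.1
router eigrp 1
 network 10.0.0.0
 no auto-summary
ntp server 192.0.2.10
"""


def include(config: str, pattern: str) -> list[str]:
    """Keep only the lines that match the pattern, like '| include'."""
    return [line for line in config.splitlines() if re.search(pattern, line)]


def section(config: str, pattern: str) -> list[str]:
    """Keep matching top-level lines plus the indented lines beneath them."""
    kept, in_section = [], False
    for line in config.splitlines():
        if not line.startswith(" "):
            # A new top-level line either opens a matching section or ends one.
            in_section = re.search(pattern, line) is not None
        if in_section:
            kept.append(line)
    return kept


print(include(SAMPLE_CONFIG, "eigrp"))  # only the 'router eigrp 1' line
print(section(SAMPLE_CONFIG, "eigrp"))  # that line plus its indented children
```

The "include" version only ever returns lines that contain the pattern, which is why the indented lines under "router eigrp" go missing.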

And finally, if you're in configuration mode and want to check something with show, ping etc and don't want to have to drop back to priv exec mode, prefix your command with "do":

 

13Oct/14

Cisco Crypto ACLs – Do they really need to match?

When starting out with IPsec tunnels it seems to be a common misconception that the crypto ACL, sometimes referred to as the encryption domain or the interesting traffic, must match 100% or be mirrored at both peers or the tunnel won't come up. This isn't strictly true. Whilst the ISAKMP phase 1 and IPsec phase 2 proposals must match, the crypto ACL can be different.

Assume that at the local peer traffic to be encrypted originates from 10.0.0.0/24 and is destined for 192.168.0.0/24. The crypto ACL would be:

But what about the following?

IPsec phase 2 can still be established even though the crypto ACL isn't mirrored at the local and remote peers. The local peer specifies 10.0.0.0/24 but the remote peer specifies 10.0.0.0/8. In this scenario IPsec phase 2 can only be initiated from the peer with the smaller, more specific subnet, because the responder only accepts a proposal that falls within its own crypto ACL. This is true for both Cisco ASA and IOS.

And in the example above, in the local peer's ACL there's a deny ACE but none on the remote peer's ACL. In this scenario any traffic originating on the local peer from 10.0.0.0/24 destined to 192.168.0.200/32 won't traverse the tunnel. The device (ASA or IOS router) will look at the next crypto map in the sequence and try to match traffic there. If no crypto maps are found it'll flow unencrypted out of the egress interface.

Obviously be careful with mismatching subnets and using deny ACEs in the crypto ACL because you may end up with traffic trying to enter the wrong tunnel and other strange things happening.
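Before relying on mismatched crypto ACLs it's worth checking how the two subnets actually relate. A small sketch using Python's ipaddress module is below. The function name and the labels it returns are made up for illustration, and it only compares a single pair of subnets rather than parsing a full ACL.

```python
from ipaddress import ip_network


def compare_selectors(local: str, remote: str) -> str:
    """Describe how the local peer's subnet relates to the remote peer's view of it."""
    a, b = ip_network(local), ip_network(remote)
    if a == b:
        return "mirrored"
    if a.subnet_of(b):
        return "local is more specific"
    if b.subnet_of(a):
        return "remote is more specific"
    # Two CIDR networks are either nested or completely separate.
    return "disjoint"


print(compare_selectors("10.0.0.0/24", "10.0.0.0/8"))
```

Anything other than "mirrored" is a prompt to think about which peer will be bringing the tunnel up.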

 

9Sep/14

Reviewing Cisco ASA firewall rules

Today I was reviewing a firewall rule set on a Cisco ASA firewall. The firewall has around 399 ACLs (Access Control Lists) comprising 7272 ACEs (Access Control Entries). Quite a task! Unfortunately I didn't have any tools to hand such as Cisco Security Manager or something like FirePac to audit the rules and give me some suggestions.

Stage 1 was to visually look at the ACLs and spot the obvious mistakes and remove them. Stage 2 was to then remove any unused names, objects and object-groups. I hacked up a Perl script to do this. The script reads the complete ASA config, gets all the names, objects and object-groups then works out which ones aren't referenced anywhere else:
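A minimal sketch of that idea follows, in Python rather than Perl, so it is not the original script. It assumes definitions appear as top-level "name", "object" and "object-group" lines, and it treats a name as used if it shows up as a whole word on any line other than its own definition. The function name is hypothetical, and a name that only appears in a description would wrongly count as used.

```python
import re

# Top-level definitions: "name <ip> <name>", "object <type> <name>",
# "object-group <type> <name>".
DEFINITION = re.compile(r"^(?:name \S+ (\S+)|object(?:-group)? \S+ (\S+))")


def unused_names(config: str) -> list[str]:
    """Return defined names that are never referenced elsewhere in the config."""
    lines = config.splitlines()
    defined = {}  # name -> index of the line that defines it
    for index, line in enumerate(lines):
        match = DEFINITION.match(line)
        if match:
            defined[match.group(1) or match.group(2)] = index

    unused = []
    for name, index in defined.items():
        # Match the name only as a whole whitespace-delimited word.
        token = re.compile(r"(?<!\S)" + re.escape(name) + r"(?!\S)")
        referenced = any(
            token.search(line) for i, line in enumerate(lines) if i != index
        )
        if not referenced:
            unused.append(name)
    return unused
```

Feed it the full running config as a string and it returns the candidates for removal, which still deserve a manual check before anything is deleted.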

Stage 3 was to work out which ACLs could be completely removed and which ACLs should be reviewed in more detail. If an ACL with or without ACEs, has a total of 0 hits it can (probably) be removed. If an ACL with ACEs has less than or equal to 100 hits it should be reviewed in more detail because the chances are some of the ACEs associated with it can be removed. A quick and dirty Perl script did the trick:
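Again as a sketch rather than the original script, the same classification can be done in Python against "show access-list" output. Several things here are assumptions: that "ACL" in this post means a top-level access-list line (with any expanded entries indented beneath it), that the top-level line carries the total hit count in a "(hitcnt=N)" field, and that the output format matches your ASA version. The function name is hypothetical.

```python
import re

# Top-level rule lines only; indented (expanded) entries are deliberately skipped
# so that hits are not counted twice.
RULE = re.compile(r"^access-list (\S+) line (\d+) .*\(hitcnt=(\d+)\)")


def classify_rules(show_access_list: str, review_threshold: int = 100):
    """Split rules into (remove, review) lists of (acl_name, line_number)."""
    remove, review = [], []
    for line in show_access_list.splitlines():
        match = RULE.match(line)
        if not match:
            continue  # headers, remarks and indented expanded entries
        name, number, hits = match.group(1), int(match.group(2)), int(match.group(3))
        if hits == 0:
            remove.append((name, number))
        elif hits <= review_threshold:
            review.append((name, number))
    return remove, review
```

Bear in mind that hit counters reset on reload or when cleared, so a zero only means "no hits since the counters were last reset".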

I found 181 ACLs that can be immediately removed and a further 16 to be reviewed. With an average of 18 ACEs per ACL, that equates to 3258 ACEs that can be removed and 288 that may be able to be removed after a review.

By the end of this journey I should have reduced the rule set by at least 44.80%. After that the rule set just needs re-ordering to optimise the processing.

2Aug/14

Automating mass Cisco IOS upgrades

This morning I needed to upgrade the IOS on 29 Cisco 3560G switches. Rather than log in to each one, clean up the flash storage, FTP the IOS image across and set the boot image, I wrote a simple shell script and used clogin from RANCID to automate this task. Of course, nearly every Network Configuration Management platform that's any good should be able to do this but I prefer the personal touch.

The commands required on the switch were as follows:

First I tell IOS to not prompt on file operations. This makes automation easier as there's no need to deal with questions. Then I clean up the flash storage on the switch by removing any old IOS images. The IOS image is copied from an FTP server to the flash storage. The file prompt is put back to defaults and the boot system variable is set to the new IOS image. Finally the configuration is committed to NVRAM because at some point the switch will need to be reloaded.

The shell script will read in a list of IP addresses to connect to and then using clogin it'll log in to each switch and execute the commands above.

The script I wrote is as follows:

A file called ips.txt has the list of IP addresses for the switches (one IP address per line). The commands listed above go in to a file called commands.txt. And lastly there's a file called clogin.txt that contains the login details that clogin needs. This would look like:

This tells clogin that there's no need to enter enable and to try SSH first, followed by telnet.

When the script is run it will grab the first IP address in ips.txt, execute clogin to login to the switch and then execute each command in commands.txt. When clogin exits, the IP address in ips.txt will be removed and placed in to a file called processed.txt. The script then prompts if it should continue to the next IP address, allowing you to review what happened to make sure the IOS image copied on OK.
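The loop described above can be sketched in Python as follows. This is an illustration of the workflow and not the original shell script. The file names come from this post, the function names are mine, and the clogin call assumes its -f (cloginrc file) and -x (command file) options.

```python
import subprocess
from pathlib import Path


def run_clogin(ip: str) -> int:
    """Run the commands in commands.txt on one switch via RANCID's clogin."""
    return subprocess.call(["clogin", "-f", "clogin.txt", "-x", "commands.txt", ip])


def process_next(ips_file="ips.txt", done_file="processed.txt", run=run_clogin):
    """Process the first IP in ips_file, then move it to done_file.

    Returns the IP that was processed, or None when the list is empty.
    """
    ips_path = Path(ips_file)
    pending = [l.strip() for l in ips_path.read_text().splitlines() if l.strip()]
    if not pending:
        return None
    ip, rest = pending[0], pending[1:]
    run(ip)
    # The IP is moved whatever clogin returned; reviewing the output is manual.
    ips_path.write_text("".join(f"{entry}\n" for entry in rest))
    with open(done_file, "a") as done:
        done.write(f"{ip}\n")
    return ip


def main():
    """Work through ips.txt, pausing after each switch for a manual check."""
    while (ip := process_next()) is not None:
        answer = input(f"{ip} processed. Continue to the next switch? [y/N] ")
        if answer.strip().lower() != "y":
            break
```

Calling main() works through the list one switch at a time. Passing a different run function to process_next makes it possible to dry-run the file handling without touching a switch.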

This allowed me to upgrade 29 switches, whilst watching some morning TV and sipping a coffee with my feet up. All that's required now is a reload of each switch.

2Aug/14

Automating Cisco switch swap outs

So you can't automate the entire process unfortunately. You're still going to need to pull a late night and get your hands dirty...

Recently I was tasked with swapping out 4 old Cisco 10/100Mb switches with new 10/100/1000Mb switches. The old switches were a combination of Cisco 3560, 2950 and 3548 series. The old switches also had some old configurations that needed to be updated and the interface configurations weren't consistent. The interfaces also had different VLAN configurations and this was my main concern. What if I made a mistake? It's not very realistic to chase 192 ports and make sure every single one was working as expected.

Going back to a previous post, "Should network engineers be programmers?", the answer was to write a script to speed up the configuration process and eliminate any mistakes.

The process would be:

  1. Get the existing IOS configuration.
  2. Find the interfaces.
  3. Convert the old 10/100 ("FastEthernet") interfaces to 10/100/1000 ("GigabitEthernet") interfaces.
  4. Extract the important parts of the interface configuration (description, speed/duplex if set, VLAN configuration, trunk configuration etc).
  5. Pump out the converted interfaces.
  6. Copy/paste in to a template after manually making sure it looks good.
  7. Upload the configuration to the startup-config of the new switch.
  8. Swap the switch out.
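
Steps 2 to 5 can be sketched in a few lines of Python. This is an illustration only and not the Perl script used on the night. The list of lines worth keeping and the function name are assumptions, and it keeps the port numbering as-is, which ignores any clash with Gigabit uplinks that already exist on the old switch.

```python
# Interface sub-commands worth carrying over (an assumed, incomplete list).
KEEP = ("description", "speed", "duplex", "switchport", "ip address")


def convert_interfaces(config: str) -> str:
    """Extract FastEthernet and Vlan interfaces, renaming Fa to Gi."""
    output, keep_block = [], False
    for line in config.splitlines():
        if line.startswith("interface "):
            name = line.split(None, 1)[1]
            keep_block = name.startswith(("FastEthernet", "Vlan"))
            if keep_block:
                renamed = name.replace("FastEthernet", "GigabitEthernet", 1)
                output.append(f"interface {renamed}")
        elif line.startswith(" ") and keep_block:
            # Inside a wanted interface: keep only the important lines.
            if line.strip().startswith(KEEP):
                output.append(line)
        else:
            # Any other top-level line ends the current interface block.
            if keep_block:
                output.append("!")
            keep_block = False
    return "\n".join(output)
```

The output is meant to be eyeballed and pasted into a template, as in step 6, not pushed straight to a switch.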

First of all, the existing IOS configurations are stored in SVN. They get here via RANCID. So I had the old configurations easily to hand. Also if anything did happen to go wrong on the night, this served as a good reference point.

My weapon of choice for this scripting task was Perl. The script I wrote took in an existing IOS configuration and extracted the physical interfaces and SVIs, converted them from FastEthernet to GigabitEthernet and grabbed all of the important stuff such as description, speed/duplex, VLAN configuration, trunk configuration, IP configuration if it's an SVI, etc.

Here it is:

I ran this against each of the 4 existing switch IOS configurations, checked the output quickly and then copied/pasted the interfaces in to my switch template that has everything else set such as Spanning Tree configuration, errdisable recovery, NTP, AAA, SNMP, syslog etc etc. VTP would take care of the VLAN database once the switch was connected to the core.

I connected each of the new switches to my laptop and configured the Vlan1 SVI with a temporary IP, then TFTP'd the switch template to the startup-config along with the IOS version required. I turned each switch off and on again to make sure it looked good, and that was the job done on the configuration front. Each switch took a very small amount of time to configure and I could be safe in the knowledge that all the interfaces were correct.

The 4 switches were swapped out in the dead of night. It took 4 hours from start to finish, including testing and monitoring. Out of roughly 192 ports there was only one device which didn't work the next morning and that was due to an auto-negotiation issue. In my book it was a very successful, painless and efficient change. One which I wasn't particularly looking forward to but thanks to a bit of scripting ended up being easy.

2Aug/14

Should network engineers be programmers?

Short answer: Yes.

Maybe not a programmer in the sense that you need to be proficient in C++, .NET, assembler, know UML etc but having some general programming knowledge is very useful. In my opinion and experience the most important programming skill to have is a fairly in-depth knowledge of a scripting language. Be that shell, Perl, Powershell or even batch scripts. A week doesn't go by where I don't write a script to help me with my day to day tasks. Either to automate a process or format some logs or debug output I've collected.

Personally my scripting language of choice is either shell or Perl. Shell for easy repetitive tasks and Perl for formatting data or even creating configurations. Here's a very simple example of a Perl script I wrote recently:

What does this do, apart from make my life simpler? It generates a Cisco IOS config with 29 LACP port channels and configures the physical interfaces. Then it's a case of running the script and copying/pasting the result in to the device. It also eliminates any human error. If you were having to create 29 port channels and configure 58 physical interfaces, the chances are you'll make a mistake. Such as forgetting to configure the interface as a trunk, setting the wrong channel group ID on the interface or generally getting in to a bit of a mess.
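For illustration, the same kind of generator can be sketched in Python. This is not the Perl script mentioned above. The interface naming (one member link on each of two stack members) and the function name are made up, and only the trunk and channel-group lines are generated.

```python
def port_channel_config(count: int = 29, members_per_channel: int = 2) -> str:
    """Generate LACP port channels and their member interfaces."""
    lines = []
    for channel in range(1, count + 1):
        lines += [
            f"interface Port-channel{channel}",
            " switchport mode trunk",
            "!",
        ]
        for member in range(members_per_channel):
            lines += [
                # Hypothetical layout: link N of every channel lives on stack member N.
                f"interface GigabitEthernet{member + 1}/0/{channel}",
                " switchport mode trunk",
                # "mode active" is what makes the bundle negotiate with LACP.
                f" channel-group {channel} mode active",
                "!",
            ]
    return "\n".join(lines)


print(port_channel_config(count=1))
```

With the defaults it produces 29 port channels and 58 physical interfaces, and every channel group ID is guaranteed to line up with its port channel.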

I'm going to post a few other blogs containing some scripts I've recently used to help automate tasks. Time to sharpen your scripting skills!

10Jun/12

Editing Cisco IOS ACLs

If you've administered Cisco PIX or ASA security appliances, you'll know how easy editing ACLs is. If you want to insert a new rule in to an existing ACL you can easily insert it where you want. For example:

 

This will insert the rule at position 12 of the outside_access_in ACL, pushing the existing rule at position 12 down to position 13 and re-ordering everything.

In Cisco IOS there's no obvious way to do this when working with ACLs. A lot of the time people will copy the ACL out, edit it in a text editor, "no access-list" the ACL and then paste in the modified ACL. Which does work but can be risky when working remotely as it's easy to lock yourself out of the IOS device. This can happen if you don't remove the ACL from interfaces before deleting the ACL.

But you can edit ACLs on the IOS device itself when using an extended ACL. Let's create an ACL:

 

If you view this ACL you'll notice sequence numbers at the start of each line:

 

Let's say you need to add another rule before the deny (sequence number 20). Enter the extended ACL:

 

Now you can insert a new rule by specifying a sequence number less than the deny rule (which is sequence 20):

 

If you view the ACL you'll see the new rule:

 

What you can now do is resequence the ACL so that the sequence numbers are evenly spaced again. For example if you wanted the sequence numbers to start at 10 and go up in increments of 10:

 

If you want to delete a specific rule:
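The numbering logic in this post (insert by sequence number, resequence, delete by sequence number) can be modelled in a few lines of Python. This is purely an illustration of how the sequence numbers behave. It is not IOS syntax, and the class, its method names and the example rules are made up.

```python
class SequencedAcl:
    """Toy model of an ACL whose rules are ordered by sequence number."""

    def __init__(self):
        self.entries = {}  # sequence number -> rule text

    def add(self, rule, sequence=None, step=10):
        """Add a rule at a given sequence number, or after the last rule."""
        if sequence is None:
            sequence = max(self.entries, default=0) + step
        if sequence in self.entries:
            raise ValueError(f"sequence {sequence} is already in use")
        self.entries[sequence] = rule

    def delete(self, sequence):
        """Remove the rule with this sequence number."""
        del self.entries[sequence]

    def resequence(self, start=10, step=10):
        """Renumber the rules in order, keeping their relative positions."""
        rules = [self.entries[s] for s in sorted(self.entries)]
        self.entries = {start + i * step: r for i, r in enumerate(rules)}

    def show(self):
        """Return the rules in evaluation order, prefixed with their numbers."""
        return [f"{s} {self.entries[s]}" for s in sorted(self.entries)]


acl = SequencedAcl()
acl.add("permit tcp any host 192.0.2.10 eq www")    # becomes 10
acl.add("deny ip any any")                          # becomes 20
acl.add("permit tcp any host 192.0.2.10 eq 443", sequence=15)
acl.resequence(start=10, step=10)                   # now 10, 20, 30
print(acl.show())
```

The gaps between sequence numbers are what make in-place inserts possible, which is why resequencing after a few edits is a good habit.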

 
