So we got an RMA’d Watchguard unit in the other day and I finally got time to install it last night. The original Watchguard unit started acting up last night, and I figured that was the perfect time to swap it for the new one. That was right when the real trouble started. Earlier in the week I had rebuilt the firewall ruleset, proxy configurations and static NAT translations to mimic exactly what the current Watchguard was doing. Like most companies, this client has an email server, a few websites, and other inbound and outbound ports that need to be open.
The original Watchguard was being replaced with a better one that can handle more connections and users. The old one was having issues last night around 4:15, and I made a decision on the fly to install the new Watchguard, seeing as how it was already close to the end of the day. After installing the new Watchguard, it took a little while for the Internet connection to come back up.
There is an MPLS connection that was down because of a conflict involving the MAC address on the new Watchguard. The Watchguard was showing a different MAC address than the one the ISP’s AdTrans were used to seeing. For those of you who don’t know, every network interface, wireless or Ethernet, has a hardware address called a MAC (Media Access Control) address. Manufacturers are allocated blocks of these addresses and are required to “code” each network card with a different, “unique” address for every unit that is shipped out the door. The ARP table is a list that maps the IP addresses a device has recently talked to onto those hardware addresses. So basically, all the computers, network devices, printers, etc. that the Watchguard or the AdTran talks to are listed in that device’s ARP table, and the same goes for many other computers and pieces of equipment.
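To make that a little more concrete, here is a minimal Python sketch of an ARP table as a simple lookup from IP address to MAC address. The addresses and the `learn` and `lookup` helpers are made up for illustration; a real device keeps this table inside its network stack.

```python
# Toy ARP table: maps an IP address to the MAC address last seen for it.
# All addresses below are invented examples, not values from the real network.
arp_table = {}

def learn(ip, mac):
    """Record (or overwrite) the MAC address associated with an IP."""
    arp_table[ip] = mac.lower()

def lookup(ip):
    """Return the MAC for an IP, or None if it has not been learned yet."""
    return arp_table.get(ip)

learn("192.168.1.1", "00:90:7F:AA:BB:01")   # hypothetical firewall
learn("192.168.1.50", "00:1B:44:11:3A:B7")  # hypothetical printer
```

Any traffic for `192.168.1.1` gets addressed to whatever MAC `lookup` returns, which is why a wrong entry sends traffic to the wrong place.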
Because the old Watchguard has a different hardware address than the new one, the AdTran still had the old address on record and didn’t know what to do with the MPLS (inter-office) network traffic (think of it as the two devices having different information and arguing about who was right). I suspected last night that this might be the issue and rebooted all the network equipment to clear the ARP tables, but the AdTrans all have a security mechanism built in so that an attacker (someone with malicious intentions) can’t “poison” the ARP tables.
That same security mechanism also protects the ARP table from being cleared by rebooting the device. That is why I needed to call our ISP and have them manually clear the ARP tables on the AdTrans. This security mechanism is part of the Managed Services that we have contracted with our ISP to provide, and this “test” proved that the AdTran is doing what it is supposed to do. The issue is that we were unaware that the AdTrans were set up this way.
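I don’t know how the AdTran actually implements this protection, but the behaviour we ran into can be sketched with a toy model in Python. The `StickyArpTable` class and the addresses below are hypothetical; the only assumption is that the first MAC address learned for an IP is locked in until someone deliberately clears the table. The part about surviving a reboot is not modelled here.

```python
class StickyArpTable:
    """Toy model: the first MAC learned for an IP stays until manually cleared."""

    def __init__(self):
        self._entries = {}

    def learn(self, ip, mac):
        """Accept a mapping only if the IP is new or the MAC matches.

        Returns True if the mapping is accepted, False if it is rejected.
        """
        known = self._entries.get(ip)
        if known is None:
            self._entries[ip] = mac
            return True
        return known == mac

    def lookup(self, ip):
        """Return the MAC on record for an IP, or None."""
        return self._entries.get(ip)

    def clear(self):
        """Manual clear, standing in for the call to the ISP."""
        self._entries.clear()


# Invented addresses for the old and new firewall on the same IP.
OLD_MAC = "00:90:7f:aa:bb:01"
NEW_MAC = "00:90:7f:cc:dd:02"

table = StickyArpTable()
table.learn("10.0.0.1", OLD_MAC)                    # old firewall is learned
rejected = not table.learn("10.0.0.1", NEW_MAC)     # new firewall is refused
stale = table.lookup("10.0.0.1")                    # still the old address
table.clear()                                       # operator clears the table
accepted = table.learn("10.0.0.1", NEW_MAC)         # new firewall now accepted
```

In this model the new firewall is ignored until `clear()` is called, which matches what we saw: nothing changed until the ISP cleared the tables by hand.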
After the ARP tables were cleared, the MPLS came right back up, the phones started working again, and everything went back to normal. Moral of the story: make sure you clear your ARP tables.