A brilliant solution for providing high availability in a small RADIUS server/ISE deployment

After months of issues, they have finally restored my access to my blog! After such a hiatus, it is my pleasure to bring you this particular post. I’m certain many will find it at the very least cool in an “I’m a network geek” kind of way, or even better: you will find it educational and leverage it in your own world.

This is a solution I have been wanting to write about for a long time now, and let’s be clear: it is not mine. This entire post is owed to a long-time personal friend of mine who is also one of the most talented and gifted technologists roaming the earth today. His name is Epaminondas Peter Karelis, CCIE #8068 (Pete). Pete designed this particular high-availability solution for a small ISE deployment that had two data centers, as crudely illustrated by me in the figure below.

Figure: The 2 DC architecture and IP SLA (credit: Aaron T. Woland)

I have often used Anycast in my Identity Services Engine (ISE) deployments. It’s a terrific tool in the security toolbox to help ensure traffic goes to one place (the correct place, the closest place) and has a backup if that closer place is not available. However, this particular use of Anycast was something I had never considered before.

For those of you who may not be network heads, Anycast is a networking technique where the exact same IP address exists in multiple places within the network. In this case, the same IP address (2.2.2.2) is assigned to the Gig1 interfaces on all of the RADIUS servers (ISE PSNs in our case). The router in each data center is configured with a static route to 2.2.2.2/32 with the Gig0 IP address of the PSN as the next hop. Those static routes are redistributed into the routing protocol; in this case, EIGRP is used. Anycast relies on the routing protocol to ensure that traffic destined to the Anycast address (2.2.2.2) is sent to the closest instance of that IP address.
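To make the “closest instance” idea concrete, here is a minimal Python sketch. It is not taken from Pete's deployment: it simply models each data center's advertisement of 2.2.2.2/32 as a route with a metric and picks the lowest-metric one, which is roughly what the routing protocol does for an Anycast address. The `Route` class, the site names, and the metric values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    """One advertisement of the Anycast prefix (illustrative model only)."""
    prefix: str   # e.g. "2.2.2.2/32"
    via: str      # hypothetical name of the advertising data center
    metric: int   # lower is preferred; values here are made up


def best_route(routes: list[Route], prefix: str) -> Optional[Route]:
    """Return the lowest-metric route for the prefix, or None if none exists."""
    candidates = [r for r in routes if r.prefix == prefix]
    return min(candidates, key=lambda r: r.metric, default=None)


ANYCAST = "2.2.2.2/32"
table = [
    Route(ANYCAST, "primary-dc", metric=100),
    Route(ANYCAST, "secondary-dc", metric=200),
]

chosen = best_route(table, ANYCAST)
print(chosen.via if chosen else "no route")
```

In this model, removing the `primary-dc` entry from `table` makes `best_route` return the `secondary-dc` route, which is the behavior the rest of the design relies on.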
Now that Anycast is set up to route 2.2.2.2 to the ISE PSN, Pete used EIGRP metrics to ensure that the preferred route pointed at the primary data center, while the route to the secondary data center is listed as the feasible successor (FS). With EIGRP, there is a sub-second delay when a route (known as the successor) is replaced with the backup route (known as the feasible successor).

How do we make the successor route drop from the routing table when the ISE node goes down? Pete configured an IP service-level agreement (IP SLA) probe on the router that checks the status of the HTTP service on the ISE PSN in the data center every five seconds. If the HTTP service stops responding on the active PSN, the route is removed and the feasible successor takes over, causing all the traffic for 2.2.2.2 to be sent to the PSN in the secondary data center. The figure below illustrates the IP SLA function: when the probe fails, the only route left in the routing table is the one to the router at the secondary data center.

Figure: The IP SLA causing the routing table to change (credit: Aaron T. Woland)

All network devices are configured to use the Anycast address (2.2.2.2) as the only RADIUS server in their configuration. The RADIUS requests will always be sent to whichever ISE node is active.

Example 1 below shows the interface configuration on the ISE PSN. The Gig0 interface is the actual routable IP address of the PSN, while Gig1 is in a VLAN to nowhere using the Anycast IP address.

Example 1: ISE Interface Configuration

    interface gig 0
     !Actual IP of Node
     ip address 1.1.1.1 255.255.255.0
    interface gig 1
     !Anycast VIP assigned to all PSN nodes on G1
     ip address 2.2.2.2 255.255.255.255
    ip default-gateway [Real Gateway for Gig0]
    !note no static routes needed.

Example 2 shows the IP SLA configuration on the router, which tests TCP port 80 on the PSN every five seconds and times out after 1000 msec. When that timeout occurs, the tracked static route is removed.
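Before the router configuration itself, the probe-and-track behavior described above can be modeled with a short Python sketch. This is only a rough illustration under stated assumptions, not how IOS implements IP SLA: `tcp_probe` and `routes_to_install` are hypothetical names, the sketch models only the timeout (not the threshold or the five-second schedule), and the demo probes a throwaway local listener rather than a real PSN.

```python
import socket


def tcp_probe(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection completes within timeout_s seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def routes_to_install(probe_ok: bool) -> list[str]:
    """Model of a tracked static route: present only while the probe succeeds."""
    tracked = "2.2.2.2/32 via 1.1.1.1"
    return [tracked] if probe_ok else []


if __name__ == "__main__":
    # Stand-in for the PSN's HTTP service: a local listener on an ephemeral port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    print(routes_to_install(tcp_probe("127.0.0.1", port)))  # service up

    listener.close()  # simulate the service going down
    print(routes_to_install(tcp_probe("127.0.0.1", port)))  # service down
```

Once the tracked route disappears from the list, nothing is left to redistribute from that router, so the other data center's advertisement becomes the only path to 2.2.2.2.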
Example 2: IP SLA Configuration

    ip sla 1
     !Test TCP to port 80 to the actual IP of the node.
     !"control disable" is necessary, since you are connecting
     !to a host instead of an SLA responder
     tcp-connect 1.1.1.1 80 control disable
     !Consider the SLA as down if the response takes longer than 1000 msec
     threshold 1000
     !Timeout after 1000 msec.
     timeout 1000
     !Test every 5 seconds:
     frequency 5
    ip sla schedule 1 life forever start-time now
    track 1 ip sla 1
    ip route 2.2.2.2 255.255.255.255 1.1.1.1 track 1

Example 3 shows the route redistribution configuration where the EIGRP metrics are applied. Pete was able to use the metrics that he chose specifically because he was very familiar with his network. His warning to others attempting the same thing is to be familiar with your network, or to test thoroughly when identifying the metrics that would work for you.

Example 3: Route Redistribution

    router eigrp [Autonomous System Number]
     redistribute static route-map STATIC-TO-EIGRP
    route-map STATIC-TO-EIGRP permit 20
     match ip address prefix-list ISE_VIP
     !Set metrics correctly (bandwidth, delay, reliability, load, MTU)
     set metric 1000000 1 255 1 1500
    ip prefix-list ISE_VIP seq 5 permit 2.2.2.2/32

Well, that’s it! I hope you enjoyed this as much as I did seeing it go into production. As always, I look forward to reading your comments below.