I once read about how to do load balancing when using route reflectors (RRs) in an MPLS L3VPN network. It is an interesting topic because an RR only reflects the best path for each prefix to its clients. So how do we make the RR send more than one?
So I built a GNS3 lab to work on this subject:
https://github.com/thomarite/mpls-rr
This is our scenario:
- We have one customer vrf “CUST-A” with three locations: TY, LD and NY.
- We are using BGP for PE-CE routing. Each site will use a different private ASN. Our SP is ASN 100.
- TY has two connections to our SP, so we want to make use of both of them.
- We have an RR, SP2, that is in line (it also forwards traffic). Every PE needs an iBGP session to SP2, which saves us a full mesh between the PEs.
- Our SP IGP is OSPF.
- The goal is for all other PEs connected to CUST-A sites to load-balance traffic to the TY site prefixes 192.168.11.0/24 and 192.168.12.0/24 across TY-SP1 and TY-SP3.
We start by building the whole network as standard. This is very similar to what we did in our first lab:
This is RR SP2 config:
!
ip vrf CUST-A
 rd 100:1
 route-target export 1:100
 route-target import 1:100
!
interface Loopback0
 ip address 10.0.2.1 255.255.255.255
!
interface GigabitEthernet1/0
 description to SP1-PE
 ip address 10.0.12.2 255.255.255.0
 negotiation auto
 mpls ip
!
interface GigabitEthernet2/0
 description to SP3-PE
 ip address 10.0.23.2 255.255.255.0
 negotiation auto
 mpls ip
!
interface FastEthernet3/0
 description TO-LD-SP4
 ip address 10.0.24.2 255.255.255.0
 duplex auto
 speed auto
 mpls ip
!
router ospf 1
 log-adjacency-changes
 network 10.0.2.0 0.0.0.255 area 0
 network 10.0.12.0 0.0.0.255 area 0
 network 10.0.23.0 0.0.0.255 area 0
 network 10.0.24.0 0.0.0.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.1.1 remote-as 100
 neighbor 10.0.1.1 update-source Loopback0
 neighbor 10.0.1.1 route-reflector-client
 neighbor 10.0.3.1 remote-as 100
 neighbor 10.0.3.1 update-source Loopback0
 neighbor 10.0.3.1 route-reflector-client
 neighbor 10.0.4.1 remote-as 100
 neighbor 10.0.4.1 update-source Loopback0
 neighbor 10.0.4.1 route-reflector-client
 neighbor 10.0.5.1 remote-as 100
 neighbor 10.0.5.1 update-source Loopback0
 neighbor 10.0.5.1 route-reflector-client
 no auto-summary
 !
 address-family vpnv4
  neighbor 10.0.1.1 activate
  neighbor 10.0.1.1 send-community both
  neighbor 10.0.1.1 route-reflector-client
  neighbor 10.0.3.1 activate
  neighbor 10.0.3.1 send-community both
  neighbor 10.0.3.1 route-reflector-client
  neighbor 10.0.4.1 activate
  neighbor 10.0.4.1 send-community both
  neighbor 10.0.4.1 route-reflector-client
  neighbor 10.0.5.1 activate
  neighbor 10.0.5.1 send-community both
  neighbor 10.0.5.1 route-reflector-client
 exit-address-family
 !
 address-family ipv4 vrf CUST-A
  no synchronization
 exit-address-family
!
!
mpls ldp router-id Loopback0 force
The configs for the SP PEs follow the same pattern. This is TY-SP1:
!
ip vrf CUST-A
 rd 100:1
 route-target export 1:100
 route-target import 1:100
!
interface Loopback0
 ip address 10.0.1.1 255.255.255.255
!
interface FastEthernet0/0
 description to HQ
 ip vrf forwarding CUST-A
 ip address 172.16.100.254 255.255.255.0
 duplex half
!
interface GigabitEthernet1/0
 description to SP2-P
 ip address 10.0.12.1 255.255.255.0
 negotiation auto
 mpls ip
!
router ospf 1
 log-adjacency-changes
 network 10.0.1.0 0.0.0.255 area 0
 network 10.0.12.0 0.0.0.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.2.1 remote-as 100
 neighbor 10.0.2.1 update-source Loopback0
 no auto-summary
 !
 address-family vpnv4
  neighbor 10.0.2.1 activate
  neighbor 10.0.2.1 send-community both
 exit-address-family
 !
 address-family ipv4 vrf CUST-A
  neighbor 172.16.100.1 remote-as 65001
  neighbor 172.16.100.1 activate
  neighbor 172.16.100.1 soft-reconfiguration inbound
  no synchronization
 exit-address-family
!
mpls ldp router-id Loopback0 force
!
Let’s see if LD-CE1 can reach our TY-C1:
LD-CE1#traceroute 192.168.12.1 source 172.16.30.1
Type escape sequence to abort.
Tracing the route to 192.168.12.1
  1 172.16.101.254 8 msec 20 msec 8 msec
  2 10.0.24.2 [MPLS: Labels 18/23 Exp 0] 40 msec 40 msec 36 msec
  3 172.16.200.254 [MPLS: Label 23 Exp 0] 12 msec 32 msec 28 msec
  4 172.16.200.1 60 msec 40 msec 40 msec
  5 192.168.12.1 [AS 65001] 40 msec 60 msec 60 msec
LD-CE1#
LD-CE1#
LD-CE1#ping 192.168.11.1 source 172.16.30.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.11.1, timeout is 2 seconds:
Packet sent with a source address of 172.16.30.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/54/72 ms
LD-CE1#
LD-CE1#
LD-CE1#
LD-CE1#sh
LD-CE1#show ip rou
LD-CE1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [20/0] via 172.16.101.254, 01:19:31
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
C       172.16.30.1/32 is directly connected, Loopback0
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [20/0] via 172.16.101.254, 01:19:31
LD-CE1#
So, what do we see when everything is configured?
On RR SP2, all BGP sessions to the PEs are up, and the VPNv4 table holds the TY prefixes 192.168.11.0/24 and 192.168.12.0/24 from both TY-SP1 (10.0.1.1) and TY-SP3 (10.0.3.1). But only the path via TY-SP1 is selected as best.
SP2#show ip ospf neighbor

Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.4.1          1   FULL/DR         00:00:39    10.0.24.1       FastEthernet3/0
10.0.3.1          1   FULL/DR         00:00:39    10.0.23.1       GigabitEthernet2/0
10.0.1.1          1   FULL/BDR        00:00:37    10.0.12.1       GigabitEthernet1/0
SP2#
SP2#
SP2#show ip bgp summary
BGP router identifier 10.0.2.1, local AS number 100
BGP table version is 1, main routing table version 1

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.1.1        4   100      98     111        1    0    0 01:25:16        0
10.0.3.1        4   100      93     108        1    0    0 01:25:05        0
10.0.4.1        4   100      96     114        1    0    0 00:55:06        0
10.0.5.1        4   100      29      32        1    0    0 00:28:02        0
SP2#
SP2#show ip bgp vpnv4 all
BGP table version is 9, local router ID is 10.0.2.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*>i172.16.30.1/32   10.0.4.1                 0    100      0 65002 i
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
* i                 10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
* i                 10.0.3.1                 0    100      0 65001 i
SP2#
Let’s confirm that the PEs only receive the best path from the RR. On LD-SP4, we see the paths to the TY prefixes 192.168.11.0/24 and 192.168.12.0/24 via TY-SP1 only:
LD-SP4#show ip bgp vpnv4 all
BGP table version is 18, local router ID is 10.0.4.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*> 172.16.30.1/32   172.16.101.1             0             0 65002 i
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
LD-SP4#
How do we make RR-SP2 learn and advertise both the TY-SP1 and the TY-SP3 paths? We need to use a different RD on TY-SP1 and on TY-SP3.
We have RD 100:1 assigned to CUST-A on all PEs. A VPNv4 prefix is the RD prepended to the IPv4 prefix, and the RR picks one best path per VPNv4 prefix. We are going to change the RD on TY-SP1 and TY-SP3 so the RR sees two different VPNv4 prefixes for the same destination.
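To make the RD idea concrete, here is a minimal Python sketch. It is a toy model, not how IOS implements anything: it only assumes that the RR keeps one best path per VPNv4 prefix, i.e. per (RD, IPv4 prefix) pair. The `Path` tuple, the `reflect` function and the lowest-next-hop tie-break are made up for illustration; real BGP best-path selection has many more steps.

```python
from collections import defaultdict
from typing import NamedTuple


class Path(NamedTuple):
    rd: str        # route distinguisher set by the originating PE
    prefix: str    # customer IPv4 prefix
    next_hop: str  # loopback of the originating PE


def reflect(paths):
    """Return what a toy RR would advertise: one path per (RD, prefix)."""
    by_nlri = defaultdict(list)
    for p in paths:
        by_nlri[(p.rd, p.prefix)].append(p)
    # Illustrative tie-break only: the lowest next hop wins.
    return [min(group, key=lambda p: p.next_hop) for group in by_nlri.values()]


# Same RD on both TY PEs: the two paths compete for one VPNv4 prefix.
same_rd = [
    Path("100:1", "192.168.11.0/24", "10.0.1.1"),
    Path("100:1", "192.168.11.0/24", "10.0.3.1"),
]
# Unique RD per PE: two distinct VPNv4 prefixes, each with its own best path.
unique_rd = [
    Path("100:101", "192.168.11.0/24", "10.0.1.1"),
    Path("100:102", "192.168.11.0/24", "10.0.3.1"),
]

print([p.next_hop for p in reflect(same_rd)])    # ['10.0.1.1']
print([p.next_hop for p in reflect(unique_rd)])  # ['10.0.1.1', '10.0.3.1']
```

With a shared RD only one next hop survives reflection; with one RD per PE both do, which is the behaviour we are after.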
Let’s change the RD on TY-SP1 from 100:1 to 100:101 and on TY-SP3 to 100:102. Watch out: all routing config related to VRF CUST-A will disappear and has to be re-entered.
And what about the RT config? Do we have to change anything? No, we keep it exactly the same (although we do need to retype it). Keep in mind that the RT is what is used to import/export VPNv4 prefixes into the VRF. The RD is not used to import/export, so (as we are going to see) we could actually use any RD for a VRF on a PE.
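The RT versus RD point can be sketched in a few lines of Python. Again this is a toy model under stated assumptions: `VpnRoute` and `import_into_vrf` are invented names, the RT value 100:1 is the one typed on TY-SP1 in this lab, and the third route (RD and RT 100:999) is a hypothetical route from another customer added only to show filtering.

```python
from typing import NamedTuple


class VpnRoute(NamedTuple):
    rd: str                # only makes the VPNv4 prefix unique
    prefix: str
    next_hop: str
    export_rts: frozenset  # route targets attached by the exporting PE


def import_into_vrf(routes, import_rts):
    """Keep routes carrying at least one RT the VRF imports; the RD is never checked."""
    return [r for r in routes if r.export_rts & import_rts]


routes = [
    VpnRoute("100:101", "192.168.11.0/24", "10.0.1.1", frozenset({"100:1"})),
    VpnRoute("100:102", "192.168.11.0/24", "10.0.3.1", frozenset({"100:1"})),
    # Hypothetical route of another customer, not part of the lab.
    VpnRoute("100:999", "10.99.0.0/24", "10.0.9.9", frozenset({"100:999"})),
]

imported = import_into_vrf(routes, frozenset({"100:1"}))
print([(r.rd, r.next_hop) for r in imported])
# [('100:101', '10.0.1.1'), ('100:102', '10.0.3.1')]
```

Both TY routes are imported even though neither RD matches the RD of the importing VRF, because only the RT is compared.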
Let’s see the changes for TY-SP1:
TY-SP1(config)#ip vrf CUST-A
TY-SP1(config-vrf)#no rd 100:1
% "rd 100:1" for VRF CUST-A scheduled for deletion
TY-SP1(config-vrf)#
*Apr 27 22:28:48.347: %BGP-5-ADJCHANGE: neighbor 172.16.100.1 vpn vrf CUST-A Down Neighbor deleted
TY-SP1(config-vrf)#rd 100:101
% Deletion of "rd" in progress; wait for it to complete
TY-SP1(config-vrf)#
TY-SP1(config-vrf)#rd 100:101
TY-SP1(config-vrf)#route-target export 100:1
TY-SP1(config-vrf)#route-target import 100:1
TY-SP1(config-vrf)#exit
TY-SP1(config)#router bgp 100
TY-SP1(config-router)#address-family ipv4 vrf CUST-A
TY-SP1(config-router-af)# neighbor 172.16.100.1 remote-as 65001
TY-SP1(config-router-af)# neighbor 172.16.100.1 activate
TY-SP1(config-router-af)# neighbor 172.16.100.1 soft-reconfiguration inbound
TY-SP1(config-router-af)#
*Apr 27 22:33:50.571: %BGP-5-ADJCHANGE: neighbor 172.16.100.1 vpn vrf CUST-A Up
TY-SP1(config-router-af)#
So after repeating the same steps on TY-SP3 (using RD 100:102), let’s see what happens on RR-SP2:
SP2#show ip bgp vpnv4 all
BGP table version is 51, local router ID is 10.0.2.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*>i172.16.30.1/32   10.0.4.1                 0    100      0 65002 i
* i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
* i192.168.12.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:101
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:102
*>i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.3.1                 0    100      0 65001 i
SP2#
Now we can see VPNv4 prefixes for 100:101 (TY-SP1) and 100:102 (TY-SP3), each with its own best path!
OK, let’s see what the other PEs are seeing. In our case, let’s check LD-SP4:
LD-SP4#show ip bgp vpnv4 all
BGP table version is 18, local router ID is 10.0.4.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*> 172.16.30.1/32   172.16.101.1             0             0 65002 i
* i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
* i192.168.12.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:101
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:102
*>i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.3.1                 0    100      0 65001 i
LD-SP4#
LD-SP4#
LD-SP4#show ip route vrf CUST-A

Routing Table: CUST-A
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [200/0] via 10.0.1.1, 00:01:05
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
B       172.16.30.1/32 [20/0] via 172.16.101.1, 00:33:47
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [200/0] via 10.0.1.1, 00:01:05
LD-SP4#
So, LD-SP4 is receiving the VPNv4 prefixes with RD 100:101 and 100:102 from RR-SP2, and both paths are imported into the local VRF. That’s good, but the VRF routing table still reaches the TY prefixes 192.168.11.0/24 and 192.168.12.0/24 via TY-SP1 (10.0.1.1) only.
So why is BGP ECMP not working? Because we have to enable it.
LD-SP4(config)#router bgp 100
LD-SP4(config-router)#address-family ipv4 vrf CUST-A
LD-SP4(config-router-af)#maximum-paths eibgp 2
LD-SP4(config-router-af)#
*Apr 27 22:58:25.447: BGP: VPNv4 Unicast multipath configuration changed
*Apr 27 22:58:25.447: BGP-VPN(4): MPLS label changed for prefix 100:1:192.168.11.0/24
*Apr 27 22:58:25.447: BGP-VPN(4): multipath from neighbor 10.0.2.1 nexthop 10.0.3.1 new outlabel 24
*Apr 27 22:58:25.447: vpn: free local label 1048577 for remote prefix CUST-A:192.168.11.0/24
*Apr 27 22:58:25.447: vpn: get path labels: 100:1:192.168.11.0/255.255.255.0
*Apr 27 22:58:25.451: vpn(4): inlabel=nolabel, outlabel=22, outlabel owner=BGP
*Apr 27 22:58:25.451: vpn(4): Announce labels to IPRM CUST-A:192.168.11.0/24 gw 10.0.1.1 inlabel=nolabel, outlabel=22
*Apr 27 22:58:25.451: BGP-VPN(4): MPLS label changed for prefix 100:1:192.168.12.0/24
*Apr 27 22:58:25.451: BGP-VPN(4): multipath from neighbor 10.0.2.1 nexthop 10.0.3.1 new outlabel 23
*Apr 27 22:58:25.451: vpn: free local label 1048577 for remote prefix CUST-A:192.168.12.0/24
*Apr 27 22:58:25.451: vpn: get path labels: 100:1:192.168.12.0/255.255.255.0
*Apr 27 22:58:25.451: vpn(4): inlabel=nolabel, outlabel=21, outlabel owner=BGP
*Apr 27 22:58:25.451: vpn(4): Announce labels to IPRM CUST-A:192.168.12.0/24 gw 10.0.1.1 inlabel=nolabel, outlabel=21
*Apr 27 22:58:25.455: vpn: get path labels: 100:1:192.168.11.0/255.255.255.0
*Apr 27 22:58:25.459: vpn(4): inlabel=nolabel, outlabel=24, outlabel owner=BGP
*Apr 27 22:58:25.459: vpn(4): Announce labels to IPRM CUST-A:192.168.11.0/24 gw 10.0.3.1 inlabel=nolabel, outlabel=24
*Apr 27 22:58:25.459: vpn(4): get path labels; 100:1:192.168.11.0/24 nexthop 10.0.3.1, not bestpath
*Apr 27 22:58:25.475: vpn: get path labels: 100:1:192.168.12.0/255.255.255.0
*Apr 27 22:58:25.475: vpn(4): inlabel=nolabel, outlabel=23, outlabel owner=BGP
*Apr 27 22:58:25.475: vpn(4): Announce labels to IPRM CUST-A:192.168.12.0/24 gw 10.0.3.1 inlabel=nolabel, outlabel=23
*Apr 27 22:58:25.479: vpn(4): get path labels; 100:1:192.168.12.0/24 nexthop 10.0.3.1, not bestpath
LD-SP4(config-router-af)#end
LD-SP4#
*Apr 27 22:58:27.411: %SYS-5-CONFIG_I: Configured from console by console
LD-SP4#
LD-SP4#
LD-SP4#show ip route vrf CUST-A

Routing Table: CUST-A
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [200/0] via 10.0.3.1, 00:00:07
                     [200/0] via 10.0.1.1, 00:02:18
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
B       172.16.30.1/32 [20/0] via 172.16.101.1, 00:35:00
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [200/0] via 10.0.3.1, 00:00:07
                     [200/0] via 10.0.1.1, 00:02:18
LD-SP4#
We finally got it! Our PE LD-SP4 is able to see two paths to TY prefixes!
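What the multipath setting changed can also be sketched as a toy model in Python. The assumptions are loud ones: `Path`, `install` and the `maximum_paths` parameter are illustrative names, the comparison only looks at local preference and AS-path length (IOS checks more attributes before it treats two paths as equal), and the path via 10.0.9.9 is a hypothetical worse path added to show that it is never installed.

```python
from typing import NamedTuple


class Path(NamedTuple):
    next_hop: str
    local_pref: int
    as_path: tuple


def install(paths, maximum_paths=1):
    """Return the next hops a toy PE would install in the VRF routing table."""
    def rank(p):
        # Higher local preference first, then shorter AS path, then lowest next hop.
        return (-p.local_pref, len(p.as_path), p.next_hop)

    ordered = sorted(paths, key=rank)
    best = ordered[0]
    # Only paths that tie with the best path on the compared attributes qualify.
    equal = [
        p for p in ordered
        if (p.local_pref, len(p.as_path)) == (best.local_pref, len(best.as_path))
    ]
    return [p.next_hop for p in equal[:maximum_paths]]


candidates = [
    Path("10.0.3.1", 100, (65001,)),  # via TY-SP3
    Path("10.0.1.1", 100, (65001,)),  # via TY-SP1
    Path("10.0.9.9", 90, (65001,)),   # hypothetical path with a lower local preference
]

print(install(candidates))                   # ['10.0.1.1']
print(install(candidates, maximum_paths=2))  # ['10.0.1.1', '10.0.3.1']
print(install(candidates, maximum_paths=3))  # ['10.0.1.1', '10.0.3.1']
```

With the default of one path only the best next hop is installed, which matches what we saw on LD-SP4 before enabling multipath.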
In summary:
- We need a unique VRF RD on each PE that should take part in load balancing, so the RR reflects one path per PE.
- We need to enable eiBGP ECMP (maximum-paths eibgp) on the PEs that should load-balance.