GNS3: Load-Balancing with Route Reflectors in an MPLS L3VPN network

I once read about how to do load-balancing when using Route Reflectors (RR) in an MPLS L3VPN network. It is an interesting topic because an RR reflects only the best path for each prefix to its clients. So how do we make the RR send more than one?

So I built a GNS3 lab to work on this subject:

https://github.com/thomarite/mpls-rr

This is our scenario:

  • We have one customer vrf “CUST-A” with three locations: TY, LD and NY.
  • We are using BGP for PE-CE routing. Each site will use a different private ASN. Our SP is ASN 100.
  • TY has two connections to our SP, so we want to make use of both of them.
  • We have an RR, SP2, that is in line (it is also in the forwarding path). Each PE needs an iBGP session to SP2; no full mesh between the PEs is required.
  • Our SP IGP is OSPF.
  • The goal is for all other PEs connected to CUST-A sites to load-balance traffic to the TY site prefixes 192.168.11.0/24 and 192.168.12.0/24 across TY-SP1 and TY-SP3.

We start by building the whole network as standard. This is very similar to what we did in our first lab.

This is RR SP2 config:

!
ip vrf CUST-A
 rd 100:1 
 route-target export 1:100
 route-target import 1:100
!
interface Loopback0
 ip address 10.0.2.1 255.255.255.255
!         
interface GigabitEthernet1/0
 description to SP1-PE
 ip address 10.0.12.2 255.255.255.0
 negotiation auto
 mpls ip
!
interface GigabitEthernet2/0
 description to SP3-PE
 ip address 10.0.23.2 255.255.255.0
 negotiation auto
 mpls ip
!
interface FastEthernet3/0
 description TO-LD-SP4
 ip address 10.0.24.2 255.255.255.0
 duplex auto
 speed auto
 mpls ip
!
router ospf 1
 log-adjacency-changes
 network 10.0.2.0 0.0.0.255 area 0
 network 10.0.12.0 0.0.0.255 area 0
 network 10.0.23.0 0.0.0.255 area 0
 network 10.0.24.0 0.0.0.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.1.1 remote-as 100
 neighbor 10.0.1.1 update-source Loopback0
 neighbor 10.0.1.1 route-reflector-client
 neighbor 10.0.3.1 remote-as 100
 neighbor 10.0.3.1 update-source Loopback0
 neighbor 10.0.3.1 route-reflector-client
 neighbor 10.0.4.1 remote-as 100
 neighbor 10.0.4.1 update-source Loopback0
 neighbor 10.0.4.1 route-reflector-client
 neighbor 10.0.5.1 remote-as 100
 neighbor 10.0.5.1 update-source Loopback0
 neighbor 10.0.5.1 route-reflector-client
 no auto-summary
 !
 address-family vpnv4
  neighbor 10.0.1.1 activate
  neighbor 10.0.1.1 send-community both
  neighbor 10.0.1.1 route-reflector-client
  neighbor 10.0.3.1 activate
  neighbor 10.0.3.1 send-community both
  neighbor 10.0.3.1 route-reflector-client
  neighbor 10.0.4.1 activate
  neighbor 10.0.4.1 send-community both
  neighbor 10.0.4.1 route-reflector-client
  neighbor 10.0.5.1 activate
  neighbor 10.0.5.1 send-community both
  neighbor 10.0.5.1 route-reflector-client
 exit-address-family
 !
 address-family ipv4 vrf CUST-A
  no synchronization
 exit-address-family
!
!
mpls ldp router-id Loopback0 force

The configs for the SP PEs follow the same pattern. This is TY-SP1:

!
ip vrf CUST-A
 rd 100:1 
 route-target export 1:100
 route-target import 1:100
!
interface Loopback0
 ip address 10.0.1.1 255.255.255.255
!
interface FastEthernet0/0
 description to HQ
 ip vrf forwarding CUST-A
 ip address 172.16.100.254 255.255.255.0
 duplex half
!
interface GigabitEthernet1/0
 description to SP2-P
 ip address 10.0.12.1 255.255.255.0
 negotiation auto
 mpls ip
!
router ospf 1
 log-adjacency-changes
 network 10.0.1.0 0.0.0.255 area 0
 network 10.0.12.0 0.0.0.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.2.1 remote-as 100
 neighbor 10.0.2.1 update-source Loopback0
 no auto-summary
 !
 address-family vpnv4
  neighbor 10.0.2.1 activate
  neighbor 10.0.2.1 send-community both
 exit-address-family
 !
 address-family ipv4 vrf CUST-A
  neighbor 172.16.100.1 remote-as 65001
  neighbor 172.16.100.1 activate
  neighbor 172.16.100.1 soft-reconfiguration inbound
  no synchronization
 exit-address-family
!
mpls ldp router-id Loopback0 force
!

Let’s see if LD-CE1 can reach our TY-C1:

LD-CE1#traceroute 192.168.12.1 source 172.16.30.1 

Type escape sequence to abort.
Tracing the route to 192.168.12.1

  1 172.16.101.254 8 msec 20 msec 8 msec
  2 10.0.24.2 [MPLS: Labels 18/23 Exp 0] 40 msec 40 msec 36 msec
  3 172.16.200.254 [MPLS: Label 23 Exp 0] 12 msec 32 msec 28 msec
  4 172.16.200.1 60 msec 40 msec 40 msec
  5 192.168.12.1 [AS 65001] 40 msec 60 msec 60 msec
LD-CE1#
LD-CE1#
LD-CE1#ping 192.168.11.1 source 172.16.30.1       

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.11.1, timeout is 2 seconds:
Packet sent with a source address of 172.16.30.1 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/54/72 ms
LD-CE1#
LD-CE1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [20/0] via 172.16.101.254, 01:19:31
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
C       172.16.30.1/32 is directly connected, Loopback0
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [20/0] via 172.16.101.254, 01:19:31
LD-CE1#

So, what do we see when everything is configured?

From SP2-RR, all BGP sessions to the PEs are up, and the vpnv4 table holds both paths to the TY prefixes 192.168.11.0/24 and 192.168.12.0/24. But only the path via TY-SP1 (10.0.1.1) is selected as best (>).

SP2#show ip ospf neighbor 

Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.4.1          1   FULL/DR         00:00:39    10.0.24.1       FastEthernet3/0
10.0.3.1          1   FULL/DR         00:00:39    10.0.23.1       GigabitEthernet2/0
10.0.1.1          1   FULL/BDR        00:00:37    10.0.12.1       GigabitEthernet1/0
SP2#
SP2#
SP2#show ip bgp summary 
BGP router identifier 10.0.2.1, local AS number 100
BGP table version is 1, main routing table version 1

Neighbor        V          AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.1.1        4        100      98     111        1    0    0 01:25:16        0
10.0.3.1        4        100      93     108        1    0    0 01:25:05        0
10.0.4.1        4        100      96     114        1    0    0 00:55:06        0
10.0.5.1        4        100      29      32        1    0    0 00:28:02        0
SP2#
SP2#show ip bgp vpnv4 all 
BGP table version is 9, local router ID is 10.0.2.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*>i172.16.30.1/32   10.0.4.1                 0    100      0 65002 i
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
* i                 10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
* i                 10.0.3.1                 0    100      0 65001 i
SP2#

Let’s confirm that the PEs only receive the best path from the RR. From LD-SP4, we can see the paths to the TY prefixes 192.168.11.0/24 and 192.168.12.0/24 via TY-SP1 only:

LD-SP4#show ip bgp vpnv4 all 
BGP table version is 18, local router ID is 10.0.4.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*> 172.16.30.1/32   172.16.101.1             0             0 65002 i
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
LD-SP4#

How do we make RR-SP2 advertise both the TY-SP1 and TY-SP3 paths? We need to use a different RD on TY-SP1 and on TY-SP3.

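We have RD 100:1 assigned to CUST-A on all PEs. A VPNv4 prefix is the RD plus the IPv4 prefix, and the RR picks one best path per VPNv4 prefix. We are going to change the RD on TY-SP1 and TY-SP3 so the RR will see two different VPNv4 prefixes for the same destination and reflect both.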
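
To see why distinct RDs change what the RR reflects, here is a minimal Python sketch. It is only a model, not IOS behaviour: the function name reflect_best is made up for illustration, and it assumes a simplified tie-breaker (lowest next-hop address) in place of the full BGP best-path algorithm.

```python
import ipaddress
from collections import defaultdict

def reflect_best(paths):
    """Keep one best path per VPNv4 prefix, i.e. per (RD, IPv4 prefix) pair.

    paths: iterable of (rd, prefix, next_hop) tuples as learned from the PEs.
    Assumption: ties are broken by the lowest next-hop address, a stand-in
    for the real BGP best-path steps.
    """
    by_nlri = defaultdict(list)
    for rd, prefix, next_hop in paths:
        by_nlri[(rd, prefix)].append(next_hop)
    return {nlri: min(hops, key=ipaddress.ip_address)
            for nlri, hops in by_nlri.items()}

# Same RD on both TY PEs: one VPNv4 prefix, so only one path is reflected.
same_rd = [("100:1", "192.168.11.0/24", "10.0.1.1"),
           ("100:1", "192.168.11.0/24", "10.0.3.1")]

# Unique RD per PE: two VPNv4 prefixes, each with its own best path.
unique_rd = [("100:101", "192.168.11.0/24", "10.0.1.1"),
             ("100:102", "192.168.11.0/24", "10.0.3.1")]

print(reflect_best(same_rd))
print(reflect_best(unique_rd))
```

With the shared RD the path via 10.0.3.1 never leaves the RR; with one RD per PE both next hops survive, because they are no longer competing for the same key.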

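Let’s change the RD on TY-SP1 from 100:1 to 100:101 and on TY-SP3 to 100:102. Watch out: in this lab, removing the RD also removes all the routing config related to VRF CUST-A (the route-targets and the BGP address-family for the VRF), so it has to be retyped.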

And what about the RT config? Do we have to change anything? No: the RT stays the same, we just need to retype it. Keep in mind that the RT is what controls the import/export of vpnv4 prefixes into the VRF. The RD plays no part in import/export, so (as we are going to see) we could use any RD for a VRF on a PE. Note that the baseline configs above list RT 1:100 while the transcript below types 100:1; whichever value you use, it must be the same on every PE that serves CUST-A.
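
The RT/RD split can be illustrated with a small Python sketch. This is a simplified model under assumptions: import_into_vrf is a hypothetical helper, RT 1:100 is taken from the baseline configs, and the 10.99.0.0/16 route with RT 1:999 is invented purely to show a non-matching case.

```python
def import_into_vrf(vpnv4_routes, import_rts):
    """Return the (prefix, next_hop) pairs a VRF would import.

    A route is imported when any of its export RTs matches an import RT
    of the VRF. The RD is deliberately never looked at.
    """
    wanted = set(import_rts)
    return [(r["prefix"], r["next_hop"])
            for r in vpnv4_routes
            if wanted & set(r["rts"])]

routes = [
    {"rd": "100:101", "prefix": "192.168.11.0/24",
     "next_hop": "10.0.1.1", "rts": ["1:100"]},
    {"rd": "100:102", "prefix": "192.168.11.0/24",
     "next_hop": "10.0.3.1", "rts": ["1:100"]},
    # Hypothetical route, not part of the lab: same RD as the first
    # entry but a different RT, so it must not be imported.
    {"rd": "100:101", "prefix": "10.99.0.0/16",
     "next_hop": "10.0.1.1", "rts": ["1:999"]},
]

imported = import_into_vrf(routes, import_rts=["1:100"])
print(imported)
```

Both 192.168.11.0/24 paths are imported even though their RDs differ from each other and from the local RD, while the route that shares an RD but carries another RT is left out.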

Let’s see the changes for TY-SP1:

TY-SP1(config)#ip vrf CUST-A
TY-SP1(config-vrf)#no rd 100:1
% "rd 100:1" for VRF CUST-A scheduled for deletion
TY-SP1(config-vrf)#
*Apr 27 22:28:48.347: %BGP-5-ADJCHANGE: neighbor 172.16.100.1 vpn vrf CUST-A Down Neighbor deleted
TY-SP1(config-vrf)#rd 100:101
% Deletion of "rd" in progress; wait for it to complete
TY-SP1(config-vrf)#
TY-SP1(config-vrf)#rd 100:101
TY-SP1(config-vrf)#route-target export 100:1
TY-SP1(config-vrf)#route-target import 100:1
TY-SP1(config-vrf)#exit
TY-SP1(config)#router bgp 100
TY-SP1(config-router)#address-family ipv4 vrf CUST-A 
TY-SP1(config-router-af)#  neighbor 172.16.100.1 remote-as 65001
TY-SP1(config-router-af)#  neighbor 172.16.100.1 activate
TY-SP1(config-router-af)#  neighbor 172.16.100.1 soft-reconfiguration inbound
TY-SP1(config-router-af)#
*Apr 27 22:33:50.571: %BGP-5-ADJCHANGE: neighbor 172.16.100.1 vpn vrf CUST-A Up 
TY-SP1(config-router-af)#

So after repeating the same step in TY-SP3 (using RD 100:102), let’s see what happens in RR-SP2:

SP2#show ip bgp vpnv4 all 
BGP table version is 51, local router ID is 10.0.2.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*>i172.16.30.1/32   10.0.4.1                 0    100      0 65002 i
* i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
* i192.168.12.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:101
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:102
*>i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.3.1                 0    100      0 65001 i
SP2#

Now we can see VPNv4 prefixes for 100:101 (TY-SP1) and 100:102 (TY-SP3)!!!

OK, let’s see what the other PEs are seeing. In our case, let’s check LD-SP4:

LD-SP4#show ip bgp vpnv4 all 
BGP table version is 18, local router ID is 10.0.4.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf CUST-A)
*> 172.16.30.1/32   172.16.101.1             0             0 65002 i
* i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
* i192.168.12.0     10.0.3.1                 0    100      0 65001 i
*>i                 10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:101
*>i192.168.11.0     10.0.1.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.1.1                 0    100      0 65001 i
Route Distinguisher: 100:102
*>i192.168.11.0     10.0.3.1                 0    100      0 65001 i
*>i192.168.12.0     10.0.3.1                 0    100      0 65001 i
LD-SP4#
LD-SP4#
LD-SP4#show ip route vrf CUST-A

Routing Table: CUST-A
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [200/0] via 10.0.1.1, 00:01:05
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
B       172.16.30.1/32 [20/0] via 172.16.101.1, 00:33:47
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [200/0] via 10.0.1.1, 00:01:05
LD-SP4#

So, LD-SP4 is receiving the VPNv4 prefixes with RD 100:101 and 100:102 from RR-SP2!!! That’s good, but the VRF routing table still shows the path to the TY prefixes 192.168.11.0/24 and 192.168.12.0/24 via TY-SP1 (10.0.1.1) only.

So why is BGP ECMP not working? Because by default BGP installs only the single best path; we have to enable multipath in the VRF address family.

LD-SP4(config)#router bgp 100
LD-SP4(config-router)#address-family ipv4 vrf CUST-A
LD-SP4(config-router-af)#maximum-paths eibgp 2
LD-SP4(config-router-af)#
*Apr 27 22:58:25.447: BGP: VPNv4 Unicast multipath configuration changed
*Apr 27 22:58:25.447: BGP-VPN(4):  MPLS label changed for prefix 100:1:192.168.11.0/24
*Apr 27 22:58:25.447: BGP-VPN(4): multipath from neighbor 10.0.2.1 nexthop 10.0.3.1 new outlabel 24
*Apr 27 22:58:25.447: vpn: free local label 1048577 for remote prefix CUST-A:192.168.11.0/24
*Apr 27 22:58:25.447: vpn: get path labels: 100:1:192.168.11.0/255.255.255.0
*Apr 27 22:58:25.451: vpn(4): inlabel=nolabel, outlabel=22, outlabel owner=BGP
*Apr 27 22:58:25.451: vpn(4): Announce labels to IPRM CUST-A:192.168.11.0/24 gw 10.0.1.1 inlabel=nolabel, outlabel=22
*Apr 27 22:58:25.451: BGP-VPN(4):  MPLS label changed for prefix 100:1:192.168.12.0/24
*Apr 27 22:58:25.451: BGP-VPN(4): multipath from neighbor 10.0.2.1 nexthop 10.0.3.1 new outlabel 23
*Apr 27 22:58:25.451: vpn: free local label 1048577 for remote prefix CUST-A:192.168.12.0/24
*Apr 27 22:58:25.451: vpn: get path labels: 100:1:192.168.12.0/255.255.255.0
*Apr 27 22:58:25.451: vpn(4): inlabel=nolabel, outlabel=21, outlabel owner=BGP
*Apr 27 22:58:25.451: vpn(4): Announce labels to IPRM CUST-A:192.168.12.0/24 gw 10.0.1.1 inlabel=nolabel, outlabel=21
*Apr 27 22:58:25.455: vpn: get path labels: 100:1:192.168.11.0/255.255.255.0
*Apr 27 22:58:25.459: vpn(4): inlabel=nolabel, outlabel=24, outlabel owner=BGP
*Apr 27 22:58:25.459: vpn(4): Announce labels to IPRM CUST-A:192.168.11.0/24 gw 10.0.3.1 inlabel=nolabel, outlabel=24
*Apr 27 22:58:25.459: vpn(4): get path labels; 100:1:192.168.11.0/24 nexthop 10.0.3.1, not bestpath
*Apr 27 22:58:25.475: vpn: get path labels: 100:1:192.168.12.0/255.255.255.0
*Apr 27 22:58:25.475: vpn(4): inlabel=nolabel, outlabel=23, outlabel owner=BGP
*Apr 27 22:58:25.475: vpn(4): Announce labels to IPRM CUST-A:192.168.12.0/24 gw 10.0.3.1 inlabel=nolabel, outlabel=23
*Apr 27 22:58:25.479: vpn(4): get path labels; 100:1:192.168.12.0/24 nexthop 10.0.3.1, not bestpath
LD-SP4(config-router-af)#end
LD-SP4#
*Apr 27 22:58:27.411: %SYS-5-CONFIG_I: Configured from console by console
LD-SP4#
LD-SP4#
LD-SP4#show ip route vrf CUST-A

Routing Table: CUST-A
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B    192.168.12.0/24 [200/0] via 10.0.3.1, 00:00:07
                     [200/0] via 10.0.1.1, 00:02:18
     172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
B       172.16.30.1/32 [20/0] via 172.16.101.1, 00:35:00
C       172.16.101.0/24 is directly connected, FastEthernet0/0
B    192.168.11.0/24 [200/0] via 10.0.3.1, 00:00:07
                     [200/0] via 10.0.1.1, 00:02:18
LD-SP4#

We finally got it! Our PE LD-SP4 now installs two paths to the TY prefixes, one via TY-SP1 (10.0.1.1) and one via TY-SP3 (10.0.3.1)!
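
The effect of the maximum-paths setting can be sketched in Python as follows. This is a rough model under assumptions, not the router's real algorithm: Path and install_paths are hypothetical names, and only local preference and AS-path length are compared, whereas a real router checks more attributes before treating paths as equal.

```python
import ipaddress
from typing import NamedTuple

class Path(NamedTuple):
    next_hop: str
    local_pref: int
    as_path_len: int

def install_paths(paths, maximum_paths=1):
    """Return the next hops installed in the VRF routing table."""
    if not paths:
        return []
    # Rank: higher local preference, then shorter AS path, then lowest
    # next-hop address as a simplified final tie-breaker.
    ranked = sorted(
        paths,
        key=lambda p: (-p.local_pref, p.as_path_len,
                       ipaddress.ip_address(p.next_hop)),
    )
    best = ranked[0]
    # Only paths that tie with the best on the compared attributes are
    # multipath candidates.
    equal = [p for p in ranked
             if (p.local_pref, p.as_path_len)
             == (best.local_pref, best.as_path_len)]
    return [p.next_hop for p in equal[:maximum_paths]]

# The two TY paths as imported on LD-SP4: LocPrf 100, AS path "65001".
ty_paths = [Path("10.0.3.1", 100, 1), Path("10.0.1.1", 100, 1)]

print(install_paths(ty_paths))                   # default: single best path
print(install_paths(ty_paths, maximum_paths=2))  # multipath enabled
```

With the default of one path only 10.0.1.1 is installed, which matches the first routing table; raising the limit to 2 installs both next hops, as in the final output.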

In summary:

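  • We need a different VRF RD on each PE that should take part in the load-balancing (here TY-SP1 and TY-SP3); the RT stays the same.
  • We need to enable eiBGP ECMP (maximum-paths eibgp 2) in the VRF address family of the PEs that should load-balance.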