Finally, I am trying to set up MPLS L3VPN.
Again, I am following the author's post but adapting it to my environment, using libvirt instead of VirtualBox and Debian 10 as the VM. All my data is here.
This is the diagram for the lab:
The difference from lab3 and lab2 is that we now have P1, a pure P router: it only handles labels and doesn't run any BGP.
This time the FRR config for all devices is generated automatically via gen_frr_config.py (in lab2 all config was manual).
Again, the environment is configured via the Vagrantfile plus the l3vpn_provisioning script. This is a mix of lab2 (install FRR), lab3 (define VRFs) and lab1 (configure MPLS at the Linux level).
So after some tuning, everything is installed and routing looks correct (although, for reasons I don't understand, I have to reload FRR to get the properly generated BGP config on PE1 and PE2; P1 is fine).
So let’s see PE1:
IGP (IS-IS) is up:
PE1# show isis neighbor
Area ISIS:
 System Id  Interface  L  State  Holdtime  SNPA
 P1         ens8       2  Up     30        2020.2020.2020
PE1#
PE1# exit
root@PE1:/home/vagrant#
BGP is up to PE2 and we can see routes received in AF IPv4VPN:
PE1#
PE1# show bgp summary
IPv4 Unicast Summary:
BGP router identifier 172.20.5.1, local AS number 65010 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 1, using 21 KiB of memory
Neighbor     V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt
172.20.5.2   4  65010  111      105      0       0    0     01:39:14  0             0
Total number of neighbors 1
IPv4 VPN Summary:
BGP router identifier 172.20.5.1, local AS number 65010 vrf-id 0
BGP table version 0
RIB entries 11, using 2112 bytes of memory
Peers 1, using 21 KiB of memory
Neighbor     V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt
172.20.5.2   4  65010  111      105      0       0    0     01:39:14  2             2
Total number of neighbors 1
PE1#
Checking the routing tables, we can see the remote prefixes in both VRFs, together with the labels they need, so that’s good.
PE1# show ip route vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
VRF default:
C>* 172.20.5.1/32 is directly connected, lo, 02:19:16
I>* 172.20.5.2/32 [115/30] via 192.168.66.102, ens8, label 17, weight 1, 02:16:10
I>* 172.20.5.5/32 [115/20] via 192.168.66.102, ens8, label implicit-null, weight 1, 02:18:34
I   192.168.66.0/24 [115/20] via 192.168.66.102, ens8 inactive, weight 1, 02:18:34
C>* 192.168.66.0/24 is directly connected, ens8, 02:19:16
I>* 192.168.77.0/24 [115/20] via 192.168.66.102, ens8, label implicit-null, weight 1, 02:18:34
C>* 192.168.121.0/24 is directly connected, ens5, 02:19:16
K>* 192.168.121.1/32 [0/1024] is directly connected, ens5, 02:19:16
VRF vrf_cust1:
C>* 192.168.11.0/24 is directly connected, ens6, 02:19:05
B>  192.168.23.0/24 [200/0] via 172.20.5.2 (vrf default) (recursive), label 80, weight 1, 02:13:32
                            via 192.168.66.102, ens8 (vrf default), label 17/80, weight 1, 02:13:32
VRF vrf_cust2:
C>* 192.168.12.0/24 is directly connected, ens7, 02:19:05
B>  192.168.24.0/24 [200/0] via 172.20.5.2 (vrf default) (recursive), label 81, weight 1, 02:13:32
                            via 192.168.66.102, ens8 (vrf default), label 17/81, weight 1, 02:13:32
PE1#
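The resolved next hop for each VPN prefix carries a two-label stack, for example "label 17/80": the first value is the LDP transport label towards PE2 and the second is the BGP VPN label for the VRF. As a small illustration (the helper name label_stack is my own, not part of FRR or of the lab scripts), this Python sketch pulls the stack out of one of the route lines above:

```python
import re
from typing import List


def label_stack(route_line: str) -> List[int]:
    """Return the numeric MPLS label stack found in a `show ip route` line."""
    match = re.search(r"label ([\d/]+)", route_line)
    if match is None:
        # No label at all, or a non-numeric one such as implicit-null.
        return []
    return [int(label) for label in match.group(1).split("/")]


# Line copied from the vrf_cust1 route to 192.168.23.0/24 on PE1.
line = "via 192.168.66.102, ens8 (vrf default), label 17/80, weight 1, 02:13:32"
transport, vpn = label_stack(line)
print(f"transport (LDP) label: {transport}, VPN (BGP) label: {vpn}")
```

The outer label (17) is what P1 acts on; the inner label (80) is only meaningful to PE2.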
Now check LDP and MPLS labels. Everything looks sane. We have local LDP labels for P1 (16), the 192.168.77.0/24 link (17) and PE2 (18), plus a BGP label for each VRF (80 and 81).
PE1# show mpls table
 Inbound Label  Type  Nexthop         Outbound Label
 16             LDP   192.168.66.102  implicit-null
 17             LDP   192.168.66.102  implicit-null
 18             LDP   192.168.66.102  17
 80             BGP   vrf_cust1       -
 81             BGP   vrf_cust2       -
PE1#
PE1# show mpls ldp neighbor
AF    ID          State        Remote Address  Uptime
ipv4  172.20.5.5  OPERATIONAL  172.20.5.5      02:20:20
PE1#
PE1#
PE1# show mpls ldp binding
AF    Destination       Nexthop     Local Label  Remote Label  In Use
ipv4  172.20.5.1/32     172.20.5.5  imp-null     16            no
ipv4  172.20.5.2/32     172.20.5.5  18           17            yes
ipv4  172.20.5.5/32     172.20.5.5  16           imp-null      yes
ipv4  192.168.11.0/24   0.0.0.0     imp-null     -             no
ipv4  192.168.12.0/24   0.0.0.0     imp-null     -             no
ipv4  192.168.66.0/24   172.20.5.5  imp-null     imp-null      no
ipv4  192.168.77.0/24   172.20.5.5  17           imp-null      yes
ipv4  192.168.121.0/24  172.20.5.5  imp-null     imp-null      no
PE1#
PE2 shows a similar view.
Now P1, our P router. Here we only care about LDP and IS-IS:
P1#
P1# show mpls table
 Inbound Label  Type  Nexthop         Outbound Label
 16             LDP   192.168.66.101  implicit-null
 17             LDP   192.168.77.101  implicit-null
P1# show mpls ldp neighbor
AF    ID          State        Remote Address  Uptime
ipv4  172.20.5.1  OPERATIONAL  172.20.5.1      02:23:55
ipv4  172.20.5.2  OPERATIONAL  172.20.5.2      02:21:01
P1#
P1# show isis neighbor
Area ISIS:
 System Id  Interface  L  State  Holdtime  SNPA
 PE1        ens6       2  Up     28        2020.2020.2020
 PE2        ens7       2  Up     29        2020.2020.2020
P1#
P1# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
K>* 0.0.0.0/0 [0/1024] via 192.168.121.1, ens5, src 192.168.121.253, 02:24:45
I>* 172.20.5.1/32 [115/20] via 192.168.66.101, ens6, label implicit-null, weight 1, 02:24:04
I>* 172.20.5.2/32 [115/20] via 192.168.77.101, ens7, label implicit-null, weight 1, 02:21:39
C>* 172.20.5.5/32 is directly connected, lo, 02:24:45
I   192.168.66.0/24 [115/20] via 192.168.66.101, ens6 inactive, weight 1, 02:24:04
C>* 192.168.66.0/24 is directly connected, ens6, 02:24:45
I   192.168.77.0/24 [115/20] via 192.168.77.101, ens7 inactive, weight 1, 02:21:39
C>* 192.168.77.0/24 is directly connected, ens7, 02:24:45
C>* 192.168.121.0/24 is directly connected, ens5, 02:24:45
K>* 192.168.121.1/32 [0/1024] is directly connected, ens5, 02:24:45
P1#
So as usual, let’s test connectivity. I will ping from CE1 (connected to PE1) to CE3 (connected to PE2); both belong to the same VRF, vrf_cust1.
First of all, I had to modify iptables on my host to avoid unnecessary NAT (iptables masquerade) between CE1 and CE3.
# iptables -t nat -vnL LIBVIRT_PRT --line-numbers
Chain LIBVIRT_PRT (1 references)
num  pkts bytes target      prot opt in  out  source            destination
1    15   1451  RETURN      all  --  *   *    192.168.77.0/24   224.0.0.0/24
2    0    0     RETURN      all  --  *   *    192.168.77.0/24   255.255.255.255
3    0    0     MASQUERADE  tcp  --  *   *    192.168.77.0/24   !192.168.77.0/24   masq ports: 1024-65535
4    18   3476  MASQUERADE  udp  --  *   *    192.168.77.0/24   !192.168.77.0/24   masq ports: 1024-65535
5    0    0     MASQUERADE  all  --  *   *    192.168.77.0/24   !192.168.77.0/24
6    13   1754  RETURN      all  --  *   *    192.168.122.0/24  224.0.0.0/24
7    0    0     RETURN      all  --  *   *    192.168.122.0/24  255.255.255.255
8    0    0     MASQUERADE  tcp  --  *   *    192.168.122.0/24  !192.168.122.0/24  masq ports: 1024-65535
9    0    0     MASQUERADE  udp  --  *   *    192.168.122.0/24  !192.168.122.0/24  masq ports: 1024-65535
10   0    0     MASQUERADE  all  --  *   *    192.168.122.0/24  !192.168.122.0/24
11   24   2301  RETURN      all  --  *   *    192.168.11.0/24   224.0.0.0/24
12   0    0     RETURN      all  --  *   *    192.168.11.0/24   255.255.255.255
13   0    0     MASQUERADE  tcp  --  *   *    192.168.11.0/24   !192.168.11.0/24   masq ports: 1024-65535
14   23   4476  MASQUERADE  udp  --  *   *    192.168.11.0/24   !192.168.11.0/24   masq ports: 1024-65535
15   1    84    MASQUERADE  all  --  *   *    192.168.11.0/24   !192.168.11.0/24
16   29   2541  RETURN      all  --  *   *    192.168.121.0/24  224.0.0.0/24
17   0    0     RETURN      all  --  *   *    192.168.121.0/24  255.255.255.255
18   36   2160  MASQUERADE  tcp  --  *   *    192.168.121.0/24  !192.168.121.0/24  masq ports: 1024-65535
19   65   7792  MASQUERADE  udp  --  *   *    192.168.121.0/24  !192.168.121.0/24  masq ports: 1024-65535
20   0    0     MASQUERADE  all  --  *   *    192.168.121.0/24  !192.168.121.0/24
21   20   2119  RETURN      all  --  *   *    192.168.24.0/24   224.0.0.0/24
22   0    0     RETURN      all  --  *   *    192.168.24.0/24   255.255.255.255
23   0    0     MASQUERADE  tcp  --  *   *    192.168.24.0/24   !192.168.24.0/24   masq ports: 1024-65535
24   21   4076  MASQUERADE  udp  --  *   *    192.168.24.0/24   !192.168.24.0/24   masq ports: 1024-65535
25   0    0     MASQUERADE  all  --  *   *    192.168.24.0/24   !192.168.24.0/24
26   20   2119  RETURN      all  --  *   *    192.168.23.0/24   224.0.0.0/24
27   0    0     RETURN      all  --  *   *    192.168.23.0/24   255.255.255.255
28   1    60    MASQUERADE  tcp  --  *   *    192.168.23.0/24   !192.168.23.0/24   masq ports: 1024-65535
29   20   3876  MASQUERADE  udp  --  *   *    192.168.23.0/24   !192.168.23.0/24   masq ports: 1024-65535
30   1    84    MASQUERADE  all  --  *   *    192.168.23.0/24   !192.168.23.0/24
31   25   2389  RETURN      all  --  *   *    192.168.66.0/24   224.0.0.0/24
32   0    0     RETURN      all  --  *   *    192.168.66.0/24   255.255.255.255
33   0    0     MASQUERADE  tcp  --  *   *    192.168.66.0/24   !192.168.66.0/24   masq ports: 1024-65535
34   23   4476  MASQUERADE  udp  --  *   *    192.168.66.0/24   !192.168.66.0/24   masq ports: 1024-65535
35   0    0     MASQUERADE  all  --  *   *    192.168.66.0/24   !192.168.66.0/24
36   24   2298  RETURN      all  --  *   *    192.168.12.0/24   224.0.0.0/24
37   0    0     RETURN      all  --  *   *    192.168.12.0/24   255.255.255.255
38   0    0     MASQUERADE  tcp  --  *   *    192.168.12.0/24   !192.168.12.0/24   masq ports: 1024-65535
39   23   4476  MASQUERADE  udp  --  *   *    192.168.12.0/24   !192.168.12.0/24   masq ports: 1024-65535
40   0    0     MASQUERADE  all  --  *   *    192.168.12.0/24   !192.168.12.0/24
#
# iptables -t nat -I LIBVIRT_PRT 13 -s 192.168.11.0/24 -d 192.168.23.0/24 -j RETURN
# iptables -t nat -I LIBVIRT_PRT 29 -s 192.168.23.0/24 -d 192.168.11.0/24 -j RETURN
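The two inserted RETURN rules exempt CE1-to-CE3 traffic (and the reverse) from the MASQUERADE rules that follow them in the chain. If more customer subnets need the same treatment, the commands can be generated rather than typed. The following is only a sketch under my own assumptions: the function name nat_exempt_rules is made up, and it inserts at position 1 (top of the chain) instead of the positions 13 and 29 used above, which has the same effect because a RETURN at the top is matched before any MASQUERADE rule.

```python
from itertools import permutations
from typing import List


def nat_exempt_rules(subnets: List[str], chain: str = "LIBVIRT_PRT") -> List[str]:
    """Build iptables commands that skip NAT between every ordered pair of subnets."""
    return [
        # Position 1 is an assumption of this sketch; the original used 13 and 29.
        f"iptables -t nat -I {chain} 1 -s {src} -d {dst} -j RETURN"
        for src, dst in permutations(subnets, 2)
    ]


# The two vrf_cust1 customer subnets from this lab.
rules = nat_exempt_rules(["192.168.11.0/24", "192.168.23.0/24"])
for rule in rules:
    print(rule)
```

The sketch only prints the commands; they still have to be run as root on the libvirt host.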
OK, starting to ping from CE1 to CE3:
vagrant@CE1:~$ ping 192.168.23.102
PING 192.168.23.102 (192.168.23.102) 56(84) bytes of data.
No good. Let’s check what the next hop, PE1, is doing. It seems to be sending the traffic to P1 with two labels (LDP 17 on top, BGP 80 below), as expected:
root@PE1:/home/vagrant# tcpdump -i ens8
...
20:29:16.648325 MPLS (label 17, exp 0, ttl 63) (label 80, exp 0, [S], ttl 63) IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2627, length 64
20:29:17.672287 MPLS (label 17, exp 0, ttl 63) (label 80, exp 0, [S], ttl 63) IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2628, length 64
...
Let’s check the next hop, P1. Capturing on PE2's link to P1 (ens8), I can see that P1 is forwarding the traffic to PE2 and doing PHP: the top (LDP) label has been removed and only the BGP label is left:
root@PE2:/home/vagrant# tcpdump -i ens8
...
20:29:16.648176 MPLS (label 80, exp 0, [S], ttl 63) IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2627, length 64
20:29:17.671968 MPLS (label 80, exp 0, [S], ttl 63) IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2628, length 64
...
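Comparing the two captures by eye works for a couple of packets; for a longer capture it may be easier to let a script confirm that the stack arriving at PE2 is the stack PE1 sent minus its top label. A minimal sketch, assuming the tcpdump text format shown above (the helper name mpls_labels is hypothetical):

```python
import re
from typing import List


def mpls_labels(tcpdump_line: str) -> List[int]:
    """Return the MPLS labels in a tcpdump line, outermost label first."""
    return [int(label) for label in re.findall(r"\(label (\d+),", tcpdump_line)]


# One line from each capture above (same ICMP seq 2627).
pe1_out = ("20:29:16.648325 MPLS (label 17, exp 0, ttl 63) (label 80, exp 0, [S], ttl 63) "
           "IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2627, length 64")
pe2_in = ("20:29:16.648176 MPLS (label 80, exp 0, [S], ttl 63) "
          "IP 192.168.11.102 > 192.168.23.102: ICMP echo request, id 2298, seq 2627, length 64")

sent = mpls_labels(pe1_out)
received = mpls_labels(pe2_in)
# With penultimate hop popping, P1 removes only the outer (transport) label.
php_applied = received == sent[1:]
print(f"sent={sent} received={received} php_applied={php_applied}")
```

For these two lines the check holds, which matches what the captures show: the packet reaches PE2 carrying only VPN label 80.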
But then PE2 is not sending anything to CE3. I can’t see any of the pings on the link:
root@CE3:/home/vagrant# tcpdump -i ens6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens6, link-type EN10MB (Ethernet), capture size 262144 bytes
20:32:03.174796 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:e2:cb:54.8001, length 35
20:32:05.158761 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:e2:cb:54.8001, length 35
20:32:07.174742 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:e2:cb:54.8001, length 35
I have double-checked the configs. All the routing and config look sane on PE2:
vagrant@PE2:~$ ip route
default via 192.168.121.1 dev ens5 proto dhcp src 192.168.121.31 metric 1024
172.20.5.1 encap mpls 16 via 192.168.77.102 dev ens8 proto isis metric 20
172.20.5.5 via 192.168.77.102 dev ens8 proto isis metric 20
192.168.66.0/24 via 192.168.77.102 dev ens8 proto isis metric 20
192.168.77.0/24 dev ens8 proto kernel scope link src 192.168.77.101
192.168.121.0/24 dev ens5 proto kernel scope link src 192.168.121.31
192.168.121.1 dev ens5 proto dhcp scope link src 192.168.121.31 metric 1024
vagrant@PE2:~$
vagrant@PE2:~$ ip -4 a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 172.20.5.2/32 scope global lo
       valid_lft forever preferred_lft forever
2: ens5: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.121.31/24 brd 192.168.121.255 scope global dynamic ens5
       valid_lft 2524sec preferred_lft 2524sec
3: ens6: mtu 1500 qdisc pfifo_fast master vrf_cust1 state UP group default qlen 1000
    inet 192.168.23.101/24 brd 192.168.23.255 scope global ens6
       valid_lft forever preferred_lft forever
4: ens7: mtu 1500 qdisc pfifo_fast master vrf_cust2 state UP group default qlen 1000
    inet 192.168.24.101/24 brd 192.168.24.255 scope global ens7
       valid_lft forever preferred_lft forever
5: ens8: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.77.101/24 brd 192.168.77.255 scope global ens8
       valid_lft forever preferred_lft forever
vagrant@PE2:~$
vagrant@PE2:~$ ip -M route
16 as to 16 via inet 192.168.77.102 dev ens8 proto ldp
17 via inet 192.168.77.102 dev ens8 proto ldp
18 via inet 192.168.77.102 dev ens8 proto ldp
vagrant@PE2:~$
vagrant@PE2:~$ ip route show table 10
blackhole default
192.168.11.0/24 encap mpls 16/80 via 192.168.77.102 dev ens8 proto bgp metric 20
broadcast 192.168.23.0 dev ens6 proto kernel scope link src 192.168.23.101
192.168.23.0/24 dev ens6 proto kernel scope link src 192.168.23.101
local 192.168.23.101 dev ens6 proto kernel scope host src 192.168.23.101
broadcast 192.168.23.255 dev ens6 proto kernel scope link src 192.168.23.101
vagrant@PE2:~$
vagrant@PE2:~$ ip vrf
Name       Table
vrf_cust1  10
vrf_cust2  20
vagrant@PE2:~$
root@PE2:/home/vagrant# sysctl -a | grep mpls
net.mpls.conf.ens5.input = 0
net.mpls.conf.ens6.input = 0
net.mpls.conf.ens7.input = 0
net.mpls.conf.ens8.input = 1
net.mpls.conf.lo.input = 0
net.mpls.conf.vrf_cust1.input = 0
net.mpls.conf.vrf_cust2.input = 0
net.mpls.default_ttl = 255
net.mpls.ip_ttl_propagate = 1
net.mpls.platform_labels = 100000
root@PE2:/home/vagrant#
root@PE2:/home/vagrant# lsmod | grep mpls
mpls_iptunnel  16384  3
mpls_router    36864  1 mpls_iptunnel
ip_tunnel      24576  1 mpls_router
root@PE2:/home/vagrant#
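One detail in the output above can be checked mechanically: the packets arrive at PE2 with label 80, while the "ip -M route" listing only shows entries for labels 16, 17 and 18. I have not established whether this is related to the drop, so treat the following only as a sketch for making that comparison repeatable. The helper name kernel_labels is my own, and the parsing assumes each "ip -M route" line starts with the incoming label, as it does above.

```python
from typing import Set


def kernel_labels(ip_m_route_output: str) -> Set[int]:
    """Collect the incoming labels listed by `ip -M route` (first field of each line)."""
    labels = set()
    for line in ip_m_route_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            labels.add(int(fields[0]))
    return labels


# Output of `ip -M route` on PE2, copied from above.
output = """16 as to 16 via inet 192.168.77.102 dev ens8 proto ldp
17 via inet 192.168.77.102 dev ens8 proto ldp
18 via inet 192.168.77.102 dev ens8 proto ldp"""

vpn_label = 80  # the label seen in the tcpdump capture on PE2's ens8
installed = kernel_labels(output)
print(f"kernel labels: {sorted(installed)}, label {vpn_label} present: {vpn_label in installed}")
```

Run against the listing above, the check reports that label 80 has no entry in the kernel MPLS table on PE2.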
So I have been a bit puzzled by this issue for the last couple of weeks. I was thinking that iptables was fooling me again and dropping the traffic somehow, but as far as I can see, PE2 is not sending anything, and I don't really know how to troubleshoot FRR in this case. I have asked for help on the FRR list. Let’s see how it goes. I think I am doing something wrong, because I am not doing anything new.