SR and TI-LFA

Segment Routing (SR) and Topology Independent Loop Free Alternates (TI-LFA)

Intro

As part of building an MPLS SR lab, I wanted to test FRR (Fast Reroute) solutions. Arista provides support for TI-LFA, documented in this link. Unfortunately, if you are not a customer you can’t see it 🙁

But there are other links where you can read about TI-LFA. The two from Juniper confuse me when it comes to calculating the P/Q groups at pre-convergence time…

https://blogs.juniper.net/en-us/industry-solutions-and-trends/segment-routing-sr-and-topology-independent-loop-free-alternates-ti-lfa

https://storage.googleapis.com/site-media-prod/meetings/NANOG79/2196/20200530_Bonica_The_Evolution_Of_v1.pdf

The documents above explain the evolution from Loop Free Alternates (LFA) to Remote LFA (RLFA) and finally to TI-LFA.

TI-LFA overcomes the limitations of RLFA using SR paths as repair tunnels.

I have also tried to read the IETF draft, and I didn’t understand things any better 🙁

And I doubt I am going to explain it any better here 🙂

Cisco also has good presentations (longer and denser) about SR and TI-LFA.

https://www.ciscolive.com/c/dam/r/ciscolive/us/docs/2016/pdf/BRKRST-3020.pdf

https://www.segment-routing.net/tutorials/2016-09-27-topology-independent-lfa-ti-lfa/

The Juniper docs always say “pre-convergence” while Cisco uses “post-convergence”. I think “post” is clearer.

EOS TI-LFA Limitations

  • Backup paths are not computed for prefix segments that do not have a host mask (/32 for v4 and /128 for v6).
  • When TI-LFA is configured, the number of anycast segments generated by a node cannot exceed 10.
  • Computing TI-LFA backup paths for proxy node segments is not supported.
  • Backup paths are not computed for node segments corresponding to multi-homed prefixes. The multi-homing could be the result of them being anycast node segments, loopback interfaces on different routers advertising SIDs for the same prefix, or node segments leaked between levels and thus being seen as originated from multiple L1-L2 routers.
  • Backup paths are only computed for segments that are non-ECMP.
  • Only IS-IS interfaces that are using the point-to-point network type are eligible for protection.
  • The backup paths are only computed with respect to link/node failure constraints. SRLG constraint is not yet supported.
  • Link/node protection is only supported in the default VRF owing to the lack of non-default VRF support for IS-IS segment-routing.
  • Backup paths are computed in the same IS-IS level topology as the primary path.
  • Even with IS-IS GR configured, ASU2, SSO, agent restart are not hitless events for IS-IS SR LFIB routes or tunnels being
    protected by backup paths.

LAB

Based on this, I built a lab using vEOS 4.24.1.1F 64-bit on EVE-NG. All links have the default IS-IS cost of 10 (loopbacks use 1) and TI-LFA node protection is enabled globally.

Fig1. SR TI-LFA Lab

The configs are quite simple. This is l1r9; the only difference between routers is the IP addressing. The links in the diagram show the third octet of each link's address range.

!
service routing protocols model multi-agent
!
hostname l1r9
!
spanning-tree mode mstp
!
aaa authorization exec default local
!
no aaa root
!
vrf instance MGMT
!
interface Ethernet1
no switchport
ip address 10.0.10.2/30
isis enable CORE
isis network point-to-point
!
interface Ethernet2
no switchport
ip address 10.0.11.2/30
isis enable CORE
isis network point-to-point
!
interface Ethernet3
no switchport
ip address 10.0.12.1/30
isis enable CORE
isis network point-to-point
!
interface Ethernet4
no switchport
ip address 10.0.13.1/30
isis enable CORE
isis network point-to-point
!
interface Loopback1
description CORE Loopback
ip address 10.0.0.9/32
node-segment ipv4 index 9
isis enable CORE
isis metric 1
!
interface Management1
vrf MGMT
ip address 192.168.249.18/24
!
ip routing
ip routing vrf MGMT
!
ip route vrf MGMT 0.0.0.0/0 192.168.249.1
!
mpls ip
!
mpls label range isis-sr 800000 65536
!
router isis CORE
net 49.0000.0001.0010.0000.0000.0009.00
is-type level-2
log-adjacency-changes
timers local-convergence-delay protected-prefixes
set-overload-bit on-startup wait-for-bgp
!
address-family ipv4 unicast
bfd all-interfaces
fast-reroute ti-lfa mode node-protection
!
segment-routing mpls
router-id 10.0.0.9
no shutdown
adjacency-segment allocation sr-peers backup-eligible
!
management api http-commands
protocol unix-socket
no shutdown
!
vrf MGMT
no shutdown
!

Using this script (based on nornir/napalm), I gather the output of the following commands from all routers (a minimal sketch of the script is shown after the command list):

"show isis segment-routing prefix-segments" -> shows if protection is enabled for these segments

"show isis segment-routing adjacency-segments" -> shows is protection is enabled for these segments

"show isis interface" -> shows state of protection configured

"show isis ti-lfa path" -> shows the repair path with the list of all the system IDs from the P-node to the Q-node for every destination/constraint tuple. You will see that even though node protection is configured a link protecting LFA is computed too. This is to fallback to link protecting LFAs whenever the node protecting LFA becomes unavailable.

"show isis ti-lfa tunnel" -> The TI-LFA repair tunnels are just internal constructs that are shared by multiple LFIB routes that compute similar repair paths. This command displays TI-LFA repair tunnels with the primary and backup via information.

"show isis segment-routing tunnel" -> command displays all the IS-IS SR tunnels. The field ‘ TI-LFA tunnel index ’ shows the index of the TI-LFA tunnel protecting the SR tunnel. The same TI-LFA tunnel that protects the LFIB route also protects the corresponding IS-IS SR tunnel.

"show tunnel fib" -> displays tunnels programmed in the tunnel FIB also includes the TI-LFA tunnels along with protected IS-IS SR tunnels.

"show mpls lfib route" -> displays the backup information along with the primary vias for all node/adjacency segments that have TI-LFA backup paths computed.

"show ip route" -> When services like LDP pseudowires, BGP LU, L2 EVPN or L3 MPLS VPN use IS-IS SR tunnels as an underlay, they are automatically protected by TI-LFA tunnels that protect the IS-IS SR tunnels. The ‘show ip route’ command displays the hierarchy of the overlay-underlay-TI-LFA tunnels like below.

This is the output of l1r3 in the initial state (no failures):

/////////////////////////////////////////////////////////////////////////
///                               Device: l1r3                         //
/////////////////////////////////////////////////////////////////////////

command = show isis segment-routing prefix-segments


System ID: 0000.0000.0003			Instance: 'CORE'
SR supported Data-plane: MPLS			SR Router ID: 10.0.0.3

Node: 11     Proxy-Node: 0      Prefix: 0       Total Segments: 11

Flag Descriptions: R: Re-advertised, N: Node Segment, P: no-PHP
                   E: Explicit-NULL, V: Value, L: Local
Segment status codes: * - Self originated Prefix, L1 - level 1, L2 - level 2
  Prefix                      SID Type       Flags                   System ID       Level Protection
  ------------------------- ----- ---------- ----------------------- --------------- ----- ----------
  10.0.0.1/32                   1 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0001  L2    node      
  10.0.0.2/32                   2 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0002  L2    node      
* 10.0.0.3/32                   3 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0003  L2    unprotected
  10.0.0.4/32                   4 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0004  L2    node      
  10.0.0.5/32                   5 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0005  L2    node      
  10.0.0.6/32                   6 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0006  L2    node      
  10.0.0.7/32                   7 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0007  L2    node      
  10.0.0.8/32                   8 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0008  L2    node      
  10.0.0.9/32                   9 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0009  L2    node      
  10.0.0.10/32                 10 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0010  L2    node      
  10.0.0.11/32                 11 Node       R:0 N:1 P:0 E:0 V:0 L:0 0000.0000.0011  L2    node      

================================================================================

command = show isis segment-routing adjacency-segments


System ID: l1r3			Instance: CORE
SR supported Data-plane: MPLS			SR Router ID: 10.0.0.3
Adj-SID allocation mode: SR-adjacencies
Adj-SID allocation pool: Base: 100000     Size: 16384
Adjacency Segment Count: 4
Flag Descriptions: F: Ipv6 address family, B: Backup, V: Value
                   L: Local, S: Set

Segment Status codes: L1 - Level-1 adjacency, L2 - Level-2 adjacency, P2P - Point-to-Point adjacency, LAN - Broadcast adjacency

Locally Originated Adjacency Segments
Adj IP Address  Local Intf     SID   SID Source                 Flags     Type  
--------------- ----------- ------- ------------ --------------------- -------- 
      10.0.1.1         Et1  100000      Dynamic   F:0 B:1 V:1 L:1 S:0   P2P L2  
      10.0.2.1         Et2  100001      Dynamic   F:0 B:1 V:1 L:1 S:0   P2P L2  
      10.0.5.2         Et4  100002      Dynamic   F:0 B:1 V:1 L:1 S:0   P2P L2  
      10.0.3.2         Et3  100003      Dynamic   F:0 B:1 V:1 L:1 S:0   P2P L2  

Protection 
---------- 
      node 
      node 
      node 
      node 


================================================================================

command = show isis interface


IS-IS Instance: CORE VRF: default

  Interface Loopback1:
    Index: 12 SNPA: 0:0:0:0:0:0
    MTU: 65532 Type: loopback
    Area Proxy Boundary is Disabled
    Node segment Index IPv4: 3
    BFD IPv4 is Enabled
    BFD IPv6 is Disabled
    Hello Padding is Enabled
    Level 2:
      Metric: 1 (Passive Interface)
      Authentication mode: None
      TI-LFA protection is disabled for IPv4
      TI-LFA protection is disabled for IPv6
  Interface Ethernet1:
    Index: 13 SNPA: P2P
    MTU: 1497 Type: point-to-point
    Area Proxy Boundary is Disabled
    BFD IPv4 is Enabled
    BFD IPv6 is Disabled
    Hello Padding is Enabled
    Level 2:
      Metric: 10, Number of adjacencies: 1
      Link-ID: 0D
      Authentication mode: None
      TI-LFA node protection is enabled for the following IPv4 segments: node segments, adjacency segments
      TI-LFA protection is disabled for IPv6
  Interface Ethernet2:
    Index: 14 SNPA: P2P
    MTU: 1497 Type: point-to-point
    Area Proxy Boundary is Disabled
    BFD IPv4 is Enabled
    BFD IPv6 is Disabled
    Hello Padding is Enabled
    Level 2:
      Metric: 10, Number of adjacencies: 1
      Link-ID: 0E
      Authentication mode: None
      TI-LFA node protection is enabled for the following IPv4 segments: node segments, adjacency segments
      TI-LFA protection is disabled for IPv6
  Interface Ethernet3:
    Index: 15 SNPA: P2P
    MTU: 1497 Type: point-to-point
    Area Proxy Boundary is Disabled
    BFD IPv4 is Enabled
    BFD IPv6 is Disabled
    Hello Padding is Enabled
    Level 2:
      Metric: 10, Number of adjacencies: 1
      Link-ID: 0F
      Authentication mode: None
      TI-LFA node protection is enabled for the following IPv4 segments: node segments, adjacency segments
      TI-LFA protection is disabled for IPv6
  Interface Ethernet4:
    Index: 16 SNPA: P2P
    MTU: 1497 Type: point-to-point
    Area Proxy Boundary is Disabled
    BFD IPv4 is Enabled
    BFD IPv6 is Disabled
    Hello Padding is Enabled
    Level 2:
      Metric: 10, Number of adjacencies: 1
      Link-ID: 10
      Authentication mode: None
      TI-LFA node protection is enabled for the following IPv4 segments: node segments, adjacency segments
      TI-LFA protection is disabled for IPv6

================================================================================

command = show isis ti-lfa path

TI-LFA paths for IPv4 address family
   Topo-id: Level-2
   Destination       Constraint                     Path           
----------------- --------------------------------- -------------- 
   l1r2              exclude node 0000.0000.0002    Path not found 
                     exclude Ethernet2              l1r6           
   l1r8              exclude Ethernet4              l1r4           
                     exclude node 0000.0000.0007    l1r4           
   l1r9              exclude Ethernet4              l1r4           
                     exclude node 0000.0000.0007    l1r4           
   l1r11             exclude Ethernet4              l1r4           
                     exclude node 0000.0000.0007    l1r4           
   l1r10             exclude Ethernet3              l1r7           
                     exclude node 0000.0000.0004    l1r7           
   l1r1              exclude node 0000.0000.0001    Path not found 
                     exclude Ethernet1              Path not found 
   l1r6              exclude Ethernet4              l1r2           
                     exclude node 0000.0000.0007    l1r2           
   l1r7              exclude node 0000.0000.0007    Path not found 
                     exclude Ethernet4              l1r10          
   l1r4              exclude Ethernet3              l1r9           
                     exclude node 0000.0000.0004    Path not found 
   l1r5              exclude Ethernet2              l1r7           
                     exclude node 0000.0000.0002    l1r7           


================================================================================

command = show isis ti-lfa tunnel

Tunnel Index 2
   via 10.0.5.2, 'Ethernet4'
      label stack 3
   backup via 10.0.3.2, 'Ethernet3'
      label stack 3
Tunnel Index 4
   via 10.0.3.2, 'Ethernet3'
      label stack 3
   backup via 10.0.5.2, 'Ethernet4'
      label stack 3
Tunnel Index 6
   via 10.0.3.2, 'Ethernet3'
      label stack 3
   backup via 10.0.5.2, 'Ethernet4'
      label stack 800009 800004
Tunnel Index 7
   via 10.0.5.2, 'Ethernet4'
      label stack 3
   backup via 10.0.3.2, 'Ethernet3'
      label stack 800010 800007
Tunnel Index 8
   via 10.0.2.1, 'Ethernet2'
      label stack 3
   backup via 10.0.5.2, 'Ethernet4'
      label stack 800006 800002
Tunnel Index 9
   via 10.0.5.2, 'Ethernet4'
      label stack 3
   backup via 10.0.2.1, 'Ethernet2'
      label stack 3
Tunnel Index 10
   via 10.0.2.1, 'Ethernet2'
      label stack 3
   backup via 10.0.5.2, 'Ethernet4'
      label stack 3

================================================================================

command = show isis segment-routing tunnel

 Index    Endpoint         Nexthop      Interface     Labels       TI-LFA       
                                                                   tunnel index 
-------- --------------- ------------ ------------- -------------- ------------ 
 1        10.0.0.1/32      10.0.1.1     Ethernet1     [ 3 ]        -            
 2        10.0.0.2/32      10.0.2.1     Ethernet2     [ 3 ]        8            
 3        10.0.0.7/32      10.0.5.2     Ethernet4     [ 3 ]        7            
 4        10.0.0.4/32      10.0.3.2     Ethernet3     [ 3 ]        6            
 5        10.0.0.9/32      10.0.5.2     Ethernet4     [ 800009 ]   2            
 6        10.0.0.10/32     10.0.3.2     Ethernet3     [ 800010 ]   4            
 7        10.0.0.11/32     10.0.5.2     Ethernet4     [ 800011 ]   2            
 8        10.0.0.8/32      10.0.5.2     Ethernet4     [ 800008 ]   2            
 9        10.0.0.6/32      10.0.5.2     Ethernet4     [ 800006 ]   9            
 10       10.0.0.5/32      10.0.2.1     Ethernet2     [ 800005 ]   10           


================================================================================

command = show tunnel fib


Type 'IS-IS SR', index 1, endpoint 10.0.0.1/32, forwarding None
   via 10.0.1.1, 'Ethernet1' label 3

Type 'IS-IS SR', index 2, endpoint 10.0.0.2/32, forwarding None
   via TI-LFA tunnel index 8 label 3
      via 10.0.2.1, 'Ethernet2' label 3
      backup via 10.0.5.2, 'Ethernet4' label 800006 800002

Type 'IS-IS SR', index 3, endpoint 10.0.0.7/32, forwarding None
   via TI-LFA tunnel index 7 label 3
      via 10.0.5.2, 'Ethernet4' label 3
      backup via 10.0.3.2, 'Ethernet3' label 800010 800007

Type 'IS-IS SR', index 4, endpoint 10.0.0.4/32, forwarding None
   via TI-LFA tunnel index 6 label 3
      via 10.0.3.2, 'Ethernet3' label 3
      backup via 10.0.5.2, 'Ethernet4' label 800009 800004

Type 'IS-IS SR', index 5, endpoint 10.0.0.9/32, forwarding None
   via TI-LFA tunnel index 2 label 800009
      via 10.0.5.2, 'Ethernet4' label 3
      backup via 10.0.3.2, 'Ethernet3' label 3

Type 'IS-IS SR', index 6, endpoint 10.0.0.10/32, forwarding None
   via TI-LFA tunnel index 4 label 800010
      via 10.0.3.2, 'Ethernet3' label 3
      backup via 10.0.5.2, 'Ethernet4' label 3

Type 'IS-IS SR', index 7, endpoint 10.0.0.11/32, forwarding None
   via TI-LFA tunnel index 2 label 800011
      via 10.0.5.2, 'Ethernet4' label 3
      backup via 10.0.3.2, 'Ethernet3' label 3

Type 'IS-IS SR', index 8, endpoint 10.0.0.8/32, forwarding None
   via TI-LFA tunnel index 2 label 800008
      via 10.0.5.2, 'Ethernet4' label 3
      backup via 10.0.3.2, 'Ethernet3' label 3

Type 'IS-IS SR', index 9, endpoint 10.0.0.6/32, forwarding None
   via TI-LFA tunnel index 9 label 800006
      via 10.0.5.2, 'Ethernet4' label 3
      backup via 10.0.2.1, 'Ethernet2' label 3

Type 'IS-IS SR', index 10, endpoint 10.0.0.5/32, forwarding None
   via TI-LFA tunnel index 10 label 800005
      via 10.0.2.1, 'Ethernet2' label 3
      backup via 10.0.5.2, 'Ethernet4' label 3

Type 'TI-LFA', index 2, forwarding None
   via 10.0.5.2, 'Ethernet4' label 3
   backup via 10.0.3.2, 'Ethernet3' label 3

Type 'TI-LFA', index 4, forwarding None
   via 10.0.3.2, 'Ethernet3' label 3
   backup via 10.0.5.2, 'Ethernet4' label 3

Type 'TI-LFA', index 6, forwarding None
   via 10.0.3.2, 'Ethernet3' label 3
   backup via 10.0.5.2, 'Ethernet4' label 800009 800004

Type 'TI-LFA', index 7, forwarding None
   via 10.0.5.2, 'Ethernet4' label 3
   backup via 10.0.3.2, 'Ethernet3' label 800010 800007

Type 'TI-LFA', index 8, forwarding None
   via 10.0.2.1, 'Ethernet2' label 3
   backup via 10.0.5.2, 'Ethernet4' label 800006 800002

Type 'TI-LFA', index 9, forwarding None
   via 10.0.5.2, 'Ethernet4' label 3
   backup via 10.0.2.1, 'Ethernet2' label 3

Type 'TI-LFA', index 10, forwarding None
   via 10.0.2.1, 'Ethernet2' label 3
   backup via 10.0.5.2, 'Ethernet4' label 3

================================================================================

command = show mpls lfib route

MPLS forwarding table (Label [metric] Vias) - 14 routes 
MPLS next-hop resolution allow default route: False
Via Type Codes:
          M - MPLS via, P - Pseudowire via,
          I - IP lookup via, V - VLAN via,
          VA - EVPN VLAN aware via, ES - EVPN ethernet segment via,
          VF - EVPN VLAN flood via, AF - EVPN VLAN aware flood via,
          NG - Nexthop group via
Source Codes:
          G - gRIBI, S - Static MPLS route,
          B2 - BGP L2 EVPN, B3 - BGP L3 VPN,
          R - RSVP, LP - LDP pseudowire,
          L - LDP, M - MLDP,
          IP - IS-IS SR prefix segment, IA - IS-IS SR adjacency segment,
          IL - IS-IS SR segment to LDP, LI - LDP to IS-IS SR segment,
          BL - BGP LU, ST - SR TE policy,
          DE - Debug LFIB

 IA  100000   [1]
                via M, 10.0.1.1, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                 interface Ethernet1
 IA  100001   [1]
                via TI-LFA tunnel index 8, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.2.1, Ethernet2, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label 800006 800002
 IA  100002   [1]
                via TI-LFA tunnel index 7, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.3.2, Ethernet3, label 800010 800007
 IA  100003   [1]
                via TI-LFA tunnel index 6, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.3.2, Ethernet3, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label 800009 800004
 IP  800001   [1], 10.0.0.1/32
                via M, 10.0.1.1, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                 interface Ethernet1
 IP  800002   [1], 10.0.0.2/32
                via TI-LFA tunnel index 8, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.2.1, Ethernet2, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label 800006 800002
 IP  800004   [1], 10.0.0.4/32
                via TI-LFA tunnel index 6, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.3.2, Ethernet3, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label 800009 800004
 IP  800005   [1], 10.0.0.5/32
                via TI-LFA tunnel index 10, swap 800005 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.2.1, Ethernet2, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label imp-null(3)
 IP  800006   [1], 10.0.0.6/32
                via TI-LFA tunnel index 9, swap 800006 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.2.1, Ethernet2, label imp-null(3)
 IP  800007   [1], 10.0.0.7/32
                via TI-LFA tunnel index 7, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.3.2, Ethernet3, label 800010 800007
 IP  800008   [1], 10.0.0.8/32
                via TI-LFA tunnel index 2, swap 800008 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.3.2, Ethernet3, label imp-null(3)
 IP  800009   [1], 10.0.0.9/32
                via TI-LFA tunnel index 2, swap 800009 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.3.2, Ethernet3, label imp-null(3)
 IP  800010   [1], 10.0.0.10/32
                via TI-LFA tunnel index 4, swap 800010 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.3.2, Ethernet3, label imp-null(3)
                    backup via 10.0.5.2, Ethernet4, label imp-null(3)
 IP  800011   [1], 10.0.0.11/32
                via TI-LFA tunnel index 2, swap 800011 
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.2, Ethernet4, label imp-null(3)
                    backup via 10.0.3.2, Ethernet3, label imp-null(3)

================================================================================

command = show ip route


VRF: default
Codes: C - connected, S - static, K - kernel, 
       O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
       E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
       R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
       O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
       NG - Nexthop Group Static Route, V - VXLAN Control Service,
       DH - DHCP client installed default route, M - Martian,
       DP - Dynamic Policy Route, L - VRF Leaked,
       RC - Route Cache Route

Gateway of last resort is not set

 I L2     10.0.0.1/32 [115/11] via 10.0.1.1, Ethernet1
 I L2     10.0.0.2/32 [115/11] via 10.0.2.1, Ethernet2
 C        10.0.0.3/32 is directly connected, Loopback1
 I L2     10.0.0.4/32 [115/11] via 10.0.3.2, Ethernet3
 I L2     10.0.0.5/32 [115/21] via 10.0.2.1, Ethernet2
 I L2     10.0.0.6/32 [115/21] via 10.0.5.2, Ethernet4
 I L2     10.0.0.7/32 [115/11] via 10.0.5.2, Ethernet4
 I L2     10.0.0.8/32 [115/31] via 10.0.5.2, Ethernet4
 I L2     10.0.0.9/32 [115/21] via 10.0.5.2, Ethernet4
 I L2     10.0.0.10/32 [115/21] via 10.0.3.2, Ethernet3
 I L2     10.0.0.11/32 [115/31] via 10.0.5.2, Ethernet4
 C        10.0.1.0/30 is directly connected, Ethernet1
 C        10.0.2.0/30 is directly connected, Ethernet2
 C        10.0.3.0/30 is directly connected, Ethernet3
 I L2     10.0.4.0/30 [115/20] via 10.0.2.1, Ethernet2
 C        10.0.5.0/30 is directly connected, Ethernet4
 I L2     10.0.6.0/30 [115/20] via 10.0.3.2, Ethernet3
 I L2     10.0.7.0/30 [115/30] via 10.0.2.1, Ethernet2
                               via 10.0.5.2, Ethernet4
 I L2     10.0.8.0/30 [115/20] via 10.0.5.2, Ethernet4
 I L2     10.0.9.0/30 [115/30] via 10.0.5.2, Ethernet4
 I L2     10.0.10.0/30 [115/20] via 10.0.5.2, Ethernet4
 I L2     10.0.11.0/30 [115/30] via 10.0.5.2, Ethernet4
 I L2     10.0.12.0/30 [115/30] via 10.0.3.2, Ethernet3
                                via 10.0.5.2, Ethernet4
 I L2     10.0.13.0/30 [115/30] via 10.0.5.2, Ethernet4


================================================================================

In l1r3 we can see:

  • show isis segment-routing prefix-segments: all prefix segments are under “node” protection (apart from itself – 10.0.0.3/32)
  • show isis segment-routing adjacency-segments: all adjacent segments are under “node” protection.
  • show isis interface: All isis enabled interfaces (apart from loopback1) have TI-LFA node protection enabled for ipv4.
  • show isis ti-lfa path: Here we can see link and node protection to all possible destinations in our IS-IS domain (all P routers in our BGP-free core). When node protection is not possible, link protection is calculated. The exception is l1r1: it has only one link into the network, so if that link is lost there is no backup at all.
  • show isis ti-lfa tunnel: This can be confusing. These are the TI-LFA tunnels; the first two lines refer to the path they are protecting, and the last two lines are the backup (repair) path itself. Another interesting thing here is the label stack on some backup tunnels (index 6, 7, 8). This is a way to avoid a loop. The index is used in the next command.
  • show isis segment-routing tunnel: Here we see the current SR tunnels and the corresponding backup (the index refers to the command above). Label [3] is the implicit-null label. Pay attention to the endpoint “10.0.0.2/32” (as per fig2 below): the primary path is via eth2 and the backup is via tunnel index 8 (via eth4 – l1r7). If you check the path to “10.0.0.2/32 – 800002” from l1r7 (output after fig2), you can see it points back to l1r3 and we would have a loop! For this reason the backup tunnel index 8 in l1r3 carries a label stack to avoid this loop (800006 800002). Once l1r7 receives this packet and checks the segment labels, it sends the packet towards 800006 via eth2 (l1r6), and then l1r6 uses 800002 to finally reach l1r2 (via l1r5).
Fig2. l1r3: backup tunnel for l1r2
l1r7# show isis segment-routing tunnel
 Index    Endpoint         Nexthop      Interface     Labels         TI-LFA       
                                                                     tunnel index 
-------- --------------- ------------ ------------- -------------- ------------ 
 1        10.0.0.9/32      10.0.10.2    Ethernet3     [ 3 ]          3            
 2        10.0.0.6/32      10.0.8.1     Ethernet2     [ 3 ]          1            
 3        10.0.0.3/32      10.0.5.1     Ethernet1     [ 3 ]          2            
 4        10.0.0.10/32     10.0.10.2    Ethernet3     [ 800010 ]     7            
 5        10.0.0.11/32     10.0.10.2    Ethernet3     [ 800011 ]     4            
 6        10.0.0.4/32      10.0.5.1     Ethernet1     [ 800004 ]     11           
 7        10.0.0.8/32      10.0.8.1     Ethernet2     [ 800008 ]     -            
 -                         10.0.10.2    Ethernet3     [ 800008 ]     -            
 8        10.0.0.2/32      10.0.5.1     Ethernet1     [ 800002 ]     9            
 9        10.0.0.5/32      10.0.8.1     Ethernet2     [ 800005 ]     8            
 10       10.0.0.1/32      10.0.5.1     Ethernet1     [ 800001 ]     10           
l1r7#
l1r7#show mpls lfib route 800006
...
 IP  800006   [1], 10.0.0.6/32
                via TI-LFA tunnel index 1, pop
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.8.1, Ethernet2, label imp-null(3)
                    backup via 10.0.10.2, Ethernet3, label 800008 800006
l1r7#
l1r7#show mpls lfib route 800002
...
 IP  800002   [1], 10.0.0.2/32
                via TI-LFA tunnel index 9, swap 800002
                 payload autoDecide, ttlMode uniform, apply egress-acl
                    via 10.0.5.1, Ethernet1, label imp-null(3)
                    backup via 10.0.8.1, Ethernet2, label imp-null(3)
  • show tunnel fib: you can see all “IS-IS SR” and “TI-LFA” tunnels defined. It is like a merge of “show isis segment-routing tunnel” and “show isis ti-lfa tunnel”.
  • show mpls lfib route: You can see the programmed labels and the TI-LFA backups. I got confused when I saw “imp-null” and then some pop/swap for the same entry…
  • show ip route: nothing really interesting without L3VPNs

Testing

OK, to really test TI-LFA you need to generate labelled traffic, with a packet rate high enough to see whether you are close to the promised 50ms recovery.

So I have had to make some changes:

  • create an L3VPN CUST-A (EVPN) in l1r3 and l1r9, so they are PEs
  • l1r1 and l1r11 are CPEs in VRF CUST-A

All other devices have no changes. A rough sketch of the kind of PE-side changes is shown below.
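
This is only an illustrative sketch of the PE-side change, not the exact lab config; the ASN, RD/RT values and the CPE-facing interface are made up, and the EVPN syntax can vary between EOS versions:

vrf instance CUST-A
!
ip routing vrf CUST-A
!
interface Ethernet5
   no switchport
   vrf CUST-A
   ip address 10.0.20.1/30
!
router bgp 65001
   neighbor 10.0.0.9 remote-as 65001
   neighbor 10.0.0.9 update-source Loopback1
   neighbor 10.0.0.9 send-community extended
   !
   address-family evpn
      neighbor 10.0.0.9 activate
   !
   vrf CUST-A
      rd 10.0.0.3:1
      route-target import evpn 1:1
      route-target export evpn 1:1
      redistribute connected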

We need to test with and without TI-LFA enabled. The test I have done is to ping from l1r1 to l1r11 and drop the link l1r3-l1r7, with TI-LFA enabled/disabled on l1r3.

Fig3 – Testing Scenario

Routing changes with TI-LFA enabled


BEFORE DROPPING LINK
======

l1r3#show ip route vrf CUST-A

 B I      10.0.13.0/30 [200/0] via 10.0.0.9/32, IS-IS SR tunnel index 5, label 116384
                                  via TI-LFA tunnel index 4, label 800009
                                     via 10.0.5.2, Ethernet4, label imp-null(3)
                                     backup via 10.0.3.2, Ethernet3, label imp-null(3)
 C        192.168.0.3/32 is directly connected, Loopback2
 B I      192.168.0.9/32 [200/0] via 10.0.0.9/32, IS-IS SR tunnel index 5, label 116384
                                    via TI-LFA tunnel index 4, label 800009
                                       via 10.0.5.2, Ethernet4, label imp-null(3)
                                       backup via 10.0.3.2, Ethernet3, label imp-null(3)

AFTER DROPPING LINK
======

l1r3#show ip route vrf CUST-A

 B I      10.0.13.0/30 [200/0] via 10.0.0.9/32, IS-IS SR tunnel index 5, label 116384
                                  via TI-LFA tunnel index 11, label 800009
                                     via 10.0.3.2, Ethernet3, label imp-null(3)
                                     backup via 10.0.2.1, Ethernet2, label 800005
 C        192.168.0.3/32 is directly connected, Loopback2
 B I      192.168.0.9/32 [200/0] via 10.0.0.9/32, IS-IS SR tunnel index 5, label 116384
                                    via TI-LFA tunnel index 11, label 800009
                                       via 10.0.3.2, Ethernet3, label imp-null(3)

Ping results

TI-LFA enabled in L1R3  TEST1
=========================

bash-4.2# ping -f 10.0.13.2
PING 10.0.13.2 (10.0.13.2) 56(84) bytes of data.
..................^C                                                                                                      
--- 10.0.13.2 ping statistics ---
1351 packets transmitted, 1333 received, 1% packet loss, time 21035ms
rtt min/avg/max/mdev = 21.081/348.764/1722.587/487.280 ms, pipe 109, ipg/ewma 15.582/67.643 ms
bash-4.2# 


NO TI-LFA enabled in L1R3  TEST1
=========================

bash-4.2# ping -f 10.0.13.2
PING 10.0.13.2 (10.0.13.2) 56(84) bytes of data.
.............................................E...................................................................................^C            
--- 10.0.13.2 ping statistics ---
2274 packets transmitted, 2172 received, +1 errors, 4% packet loss, time 36147ms
rtt min/avg/max/mdev = 20.965/88.300/542.279/86.227 ms, pipe 34, ipg/ewma 15.903/73.403 ms
bash-4.2# 

Summary Testing

With TI-LFA enabled in l1r3, we have lost 18 packets (around 280ms)

Without TI-LFA in l1r3, we have lost 102 packets (around 1621ms =~ 1.6s)
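
(The time estimates are simply the lost packets multiplied by the flood-ping inter-packet gap reported above: 18 × ~15.6 ms ≈ 280 ms with TI-LFA, and 102 × ~15.9 ms ≈ 1620 ms without it.)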

Keeping in mind that this lab is based on VMs (vEOS) running inside another VM (EVE-NG), that is not a bad result.

It is far from the 50ms, but it still shows the improvement of enabling TI-LFA.

Optical in Networking: 101

This is a very good presentation about optical stuff from NANOG 70 (2017), and I noticed there is an updated version from NANOG 77 (2019). I watched the 2017 one (2h) and there is something that always bites me: dB vs dBm.

A bit more info about dB vs dBm: here and here

Close to the end, there are some common questions about optics that he answers. I liked the ones about “looking at the lasers can make you blind” and the point that it is worth cleaning your fibers. A bit about cleaning here.

Multicast + 5G

I have been cleaning up my email box and found some interesting stuff. This is from APNIC regarding a new approach to deploying multicast. Slides are on the NANOG page (check Tuesday). At my former employer, we suffered traffic congestion several times after some famous games got new updates, so it is interesting that Akamai is trying to deploy inter-domain multicast on the internet. They have a massive network and I guess they suffered with those updates, so this is an attempt to “digest” those spikes better. At minute 16 you can see the network changes required. It doesn’t look like a quick/easy change but it would be a great thing to happen.

And reading a NANOG conversation about 5G, I realised that this technology promises high bandwidth (and that could remove the need for multicast). But shouldn’t we still have a smarter way to deliver the same content to eyeball networks?

From the NANOG thread, there are several links to videos about 5G, like this one from Verizon that gives the vision from a big provider and its suppliers (not technical). This one is more technical, with 5G terms (I lost touch with telco terminology around early 4G). I also see Kubernetes mentioned in 5G deployments quite often. I guess something new to learn.

OOB

I was reading this blog and realised that OOB is something that is not talked about very often. These are my notes based on what I have seen in my career:

Design

You need to sell the idea that this is a must, and then you need to secure some budget. You don’t need much:

1x switch

1x firewall

1x Internet access (if you have your own ASN and IP range, don’t use them here)

Keep it simple..

Most network kit (firewalls, routers, switches, PDUs, console servers, etc) has 1x mgmt port and 1x console port, so all the console ports need to go to the console server. I guess most server vendors offer some OOB access (I only know Dell and HP), so all of those go to the OOB switch.

If you have a massive network with hundreds of devices/servers, then you will need more OOB switches and console servers. You still need just one firewall and one internet connection. The blog comments on a spine-leaf OOB network; I guess that is the way for a massive network/DC.

Access to OOB

You need to be able to access it via your corporate network and from anywhere in the internet.

You need to be sure Linux/Windows/Mac clients can VPN in.

Use very strong passwords and keys.

You need to be sure the OOB firewall is quite tight on access. At the end of the day you only want to allow SSH to the console server and HTTPS to the iLO/iDRACs. Nothing initiated internally should be able to go out to the internet.

Dependencies

Think of the worst-case scenario: your DNS server is down, your authentication is down.

You need to be sure you have local auth enabled in all devices for emergencies.

You need to work out some DNS service. Write the key IPs in the documentation?

Your IP transit has to be reliable. You don’t need a massive pipe, but you need to be sure it is up.

Monitoring

You don’t want to be in the middle of an outage and realise that your OOB is not functional. You need to be sure the ISP for the OOB is up and the devices (OOB switch and OOB firewall) are functional all the time.

How to check the serial connections? conserver.com

Documentation

Another point frequently lost. You need to be sure people can find info about the OOB: how it is built and how to access it.

Training

At the end of the day, if you have a super OOB network but nobody knows how to connect to it and use it, then it is useless. Schedule routine checkups with the team to be sure everybody can use the OOB. This is useful when you get a call at 3am.

Diagram

Update

Funny enough, I was watching NLNOG live today and there was a presentation about OOB with two different approaches: in-band out-of-band and pure out-of-band.

From the NTT side, I liked the comment about conserver.com to manage your serial connections. I will try to use it once I have access to a new network.

Forward TCPDump to Wireshark

Reading this blog entry I realised that very likely I have never tried forwarding tcpdump output to Wireshark. How many times have I taken a pcap on a switch and then had to download it to see the details in Wireshark…

I guess you can hit some blocking points in firewalls (at least for the 2-step option).

So I tried the single command with a switch in my cEOS lab and it works!

Why does it work?

ssh <username>@<switch>  "bash tcpdump -s 0 -U -n -w - -i <interface>" | wireshark -k -i -

The ssh command is actually executing “bash tcpdump…” remotely. The key is the “-U” and “-w -” flags: “-U” in conjunction with “-w” writes each packet out without waiting for the buffer to fill, and “-w -” writes the output to stdout instead of a file. If you run the command without -U it still works, but it updates a bit more slowly as it needs to fill the buffers first.

From tcpdump manual:

       -U
       --packet-buffered
              If the -w option is not specified, make the printed packet output ``packet-buffered''; i.e., as the description of the contents of each packet is printed, it will be written to the standard  output,  rather  than, when not writing to a terminal, being written only when the output buffer fills.

              If  the  -w  option  is  specified, make the saved raw packet output ``packet-buffered''; i.e., as each packet is saved, it will be written to the output file, rather than being written only when the output buffer fills.

              The -U flag will not be supported if tcpdump was built with an older version of libpcap that lacks the pcap_dump_flush() function.

......
   -w file
          Write the raw packets to file rather than parsing and printing them out.  They can later be printed with the -r option.  Standard output is used if file is ``-''.

          This  output will be buffered if written to a file or pipe, so a program reading from the file or pipe may not see packets for an arbitrary amount of time after they are received.  Use the -U flag to cause packets to be written as soon as they are received.

And the stdout of that process is the stdout of the ssh command, so we redirect that output via a pipe “|” and it is sent as input to Wireshark thanks to “-i -”, which makes Wireshark read from stdin (which is the stdout from the tcpdump on the switch!)

The wireshark manual:

       -i|--interface  <capture interface>|-
           Set the name of the network interface or pipe to use for live packet capture.

           Network interface names should match one of the names listed in "wireshark -D" (described above); a number, as reported by "wireshark -D", can also be used.  If you're using UNIX, "netstat -i", "ifconfig -a" or "ip link" might also work to list interface names, although not all versions of UNIX support the -a flag to ifconfig.

           If no interface is specified, Wireshark searches the list of interfaces, choosing the first non-loopback interface if there are any non-loopback interfaces, and choosing the first loopback interface if there are no non-loopback interfaces.  If there are no interfaces at all, Wireshark reports an error and doesn't start the capture.

           Pipe names should be either the name of a FIFO (named pipe) or "-" to read data from the standard input.  On Windows systems, pipe names must be of the form "\\pipe\.\pipename".  Data read from pipes must be in standard pcapng or pcap format. Pcapng data must have the same endianness as the capturing host.

           This option can occur multiple times. When capturing from multiple interfaces, the capture file will be saved in pcapng format.

....

       -k  Start the capture session immediately.  If the -i flag was specified, the capture uses the specified interface.  Otherwise, Wireshark searches the list of interfaces, choosing the first non-loopback interface if
           there are any non-loopback interfaces, and choosing the first loopback interface if there are no non-loopback interfaces; if there are no interfaces, Wireshark reports an error and doesn't start the capture.

The two-step option relies on “nc” to send/receive the data, but it is the same idea regarding the tcpdump/wireshark “-” flags.

On switch: tcpdump -s 0 -U -n -w - -i <interface> | nc <computer-ip> <port>
On PC: netcat -l -p <port> | wireshark -k -S -i -

Linux Networking – Bonding/Bridging/VxLAN

Bonding

$ sudo modprobe bonding
$ ip link help bond
$ sudo ip link add bond0 type bond mode 802.3ad
$ sudo ip link set eth0 master bond0
$ sudo ip link set eth1 master bond0
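
To actually pass traffic the bond and its members still have to be brought up, and the 802.3ad state can be checked via procfs; something like this (a hedged addition, not tested here):

$ sudo ip link set up dev eth0
$ sudo ip link set up dev eth1
$ sudo ip link set up dev bond0
$ cat /proc/net/bonding/bond0    // check LACP/802.3ad state and slaves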

Bridging: vlans + trunks

ip neigh show // l2 table
ip route show // l3 table

ip route add default via 192.168.1.1 dev eth1

sudo modprobe 8021q

// create bridge and add links to bridge (switch)
sudo ip link add br0 type bridge vlan_filtering 1 // native vlan = 1
sudo ip link set eth1 master br0
sudo ip link set eth2 master br0
sudo ip link set eth3 master br0

// make eth1 access port for v11
sudo bridge vlan add dev eth1 vid 11 pvid untagged

// make eth3 access port for v12
sudo bridge vlan add dev eth3 vid 12 pvid untagged

// make eth2 trunk port for v11 and v12
sudo bridge vlan add dev eth2 vid 11
sudo bridge vlan add dev eth2 vid 12

// enable bridge and links
sudo ip link set up dev br0
sudo ip link set up dev eth1
sudo ip link set up dev eth2
sudo ip link set up dev eth3

bridge link show
bridge vlan show
bridge fdb show

VxLAN

I haven’t tried this yet:

Linux System 1
  sudo ip link add br0 type bridge vlan_filtering 1
  sudo ip link add vlan10 type vlan id 10 link bridge protocol none
  sudo ip addr add 10.0.0.1/24 dev vlan10
  sudo ip link add vtep10 type vxlan id 1010 local 10.1.0.1 remote 10.3.0.1 learning
  sudo ip link set eth1 master br0
  sudo bridge vlan add dev eth1 vid 10 pvid untagged

Linux System 2
  sudo ip link add br0 type bridge vlan_filtering 1
  sudo ip link add vlan10 type vlan id 10 link bridge protocol none
  sudo ip addr add 10.0.0.2/24 dev vlan10
  sudo ip link add vtep10 type vxlan id 1010 local 10.3.0.1 remote 10.1.0.1 learning
  sudo ip link set eth1 master br0
  sudo bridge vlan add dev eth1 vid 10 pvid untagged
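
I suspect that on each system the VTEP also needs to be attached to the bridge, mapped to the VLAN, and everything brought up before traffic flows; something along these lines (again, untested):

  sudo ip link set vtep10 master br0
  sudo bridge vlan add dev vtep10 vid 10 pvid untagged
  sudo ip link set up dev br0
  sudo ip link set up dev eth1
  sudo ip link set up dev vtep10
  bridge fdb show dev vtep10    // remote MACs learned over the VXLAN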

Traceroute

A good refresher about traceroute. It is a very common tool for network troubleshooting, so it is important to use it wisely.

Important points

  • ICMP vs UDP: most implementations send UDP probes by default (which can be blocked…); see the example commands after this list.
  • Every probe is an independent trial!
  • Try to identify the characteristics and location of each hop
  • If there is a congestion/delay issue at one hop, it has to carry over to the following hops; if it doesn’t, it is just that router/hop de-prioritizing the ICMP generation.
  • You don’t see the reverse path – ask the other end (if possible) to run the traceroute from its side.
  • Border routers between providers can be a hot spot for issues.
  • Asymmetric paths can bite you. Try to set the source address in your tests (from the provider IP, from your own space, etc)
  • Spot ECMP (in the same hop, you see several different IPs). Multiple unequal length paths can be painful.
  • MPLS: most of the time it is hidden (TTL propagation is disabled). It can be tricky to spot. But it can be funny when you do see the MPLS hops (with private IPs 🙂
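
For example, with Linux traceroute you can change the probe type and set the source address (useful for the ICMP-vs-UDP and asymmetric-path points above); the target/source IPs below are just placeholders:

traceroute 192.0.2.1                    // default: UDP probes
traceroute -I 192.0.2.1                 // ICMP echo probes
traceroute -T -p 443 192.0.2.1          // TCP SYN probes to port 443
traceroute -s 198.51.100.10 192.0.2.1   // force the source address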

And if you are more interested in the paths than in latency, this can be a good tool:

https://github.com/rucarrol/traceflow

Openconfig

I have struggled to get something working in order to learn a bit of OpenConfig.

In the end, all my info comes from Anton Karneliuk’s blog series about OpenConfig, so all credit to him. It is the best source about real testing of OpenConfig on different platforms.

I am not going to reinvent the wheel and explain what OpenConfig is. In my head, it is an attempt by several big vendors to standardise network management (config) and monitoring (telemetry) via vendor-neutral YANG models. So OC uses YANG, and we interact with OC using a transport protocol like NETCONF, RESTCONF or gNMI, so the network devices need to implement one of these protocols. Based on the blog, Cisco, Nokia and Arista have NETCONF implementations, and Ansible has a module for that!!! So the key words are OpenConfig, YANG and NETCONF.
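
Just to illustrate the NETCONF transport piece (this is not from Anton’s playbooks; a minimal ncclient sketch against a hypothetical device, assuming NETCONF over SSH is already enabled on it):

from ncclient import manager

# hypothetical device details; assumes NETCONF over SSH is enabled on the box
with manager.connect(
    host="192.168.16.2",
    port=830,
    username="user",
    password="pas123",
    hostkey_verify=False,
) as m:
    # print the YANG/OpenConfig models the device claims to support
    for capability in m.server_capabilities:
        if "openconfig" in capability:
            print(capability)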

So in my case, based on my cEOS lab, I have added a new playbook based on Anton’s to test OpenConfig/NETCONF with Arista cEOS:

https://github.com/thomarite/ceos-testing/blob/master/README.md#openconfig

This is quite basic as it only gets the interface config.

Following Anton’s Part3 Blog:

I tried to push config via openconfig to my ceos devices (all files are in github as per link above).

The blog is dense but it is good because there is a lot of info. In this case you have to use an Ansible role, so it is a new thing to learn. I also wanted to adapt that role to my environment and found some Ansible issues, but I managed to fix them after reading the Ansible documentation and paying attention to the -vvv output.

From “oc-push-config.yaml”, the first task, “collect”, is fine. It just takes some 10 minutes or more to get all the YANG modules from each device.

The issue is with the “configure” task. It fails when trying to push the interface config. I have tried Anton’s config and the actual config generated by oc-get-interface-info.yaml, but no joy.

Based on the blog, it seems Arista doesn’t have much interest in OpenConfig.

Anyway, it has been a couple of intense days looking at this whole OpenConfig/NETCONF/YANG thing. I have just touched the surface, but I have learned some Ansible along the way too. So it could be worse.

Will move on to something else.

Netbox – API Troubleshooting

Yesterday I managed to get Netbox and my lab connected. So today I followed up with the original article and found a new issue that took me several hours.

Initially I was seeing an error that I couldn’t understand:

netbox.exceptions.CreateException: This field is required

From

(venv) /netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
Creating Netbox Interface for device r1, interface Loopback1
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 42, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 34, in create_netbox_interface
device_id=device_id,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 124, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: This field is required.

So I started to follow the trace, adding “print” and using “ipdb” to see what was going on:

....
/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py(71)__request()
70 finally:
---> 71 self.close()
72
ipdb> dir(response)
['attrs', 'bool', 'class', 'delattr', 'dict', 'dir', 'doc', 'enter', 'eq', 'exit', 'format', 'ge', 'getattribute', 'getstate', 'gt', 'hash', 'init', 'init_subclass', 'iter', 'le', 'lt', 'module', 'ne', 'new', 'nonzero', 'reduce', 'reduce_ex', 'repr', 'setattr', 'setstate', 'sizeof', 'str', 'subclasshook', 'weakref', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
ipdb> response.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> response.text
'{"type":["This field is required."]}'
ipdb> response.status_code
400
ipdb> response.content
b'{"type":["This field is required."]}'
ipdb> response.reason
'Bad Request'
ipdb> response.request

ipdb> prepared_request

ipdb> prepared_request.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> dir(prepared_request)
['class', 'delattr', 'dict', 'dir', 'doc', 'eq', 'format', 'ge', 'getattribute', 'gt', 'hash', 'init', 'init_subclass', 'le', 'lt', 'module', 'ne', 'new', 'reduce', 'reduce_ex', 'repr', 'setattr', 'sizeof', 'str', 'subclasshook', 'weakref', '_body_position', '_cookies', '_encode_files', '_encode_params', '_get_idna_encoded_host', 'body', 'copy', 'deregister_hook', 'headers', 'hooks', 'method', 'path_url', 'prepare', 'prepare_auth', 'prepare_body', 'prepare_content_length', 'prepare_cookies', 'prepare_headers', 'prepare_hooks', 'prepare_method', 'prepare_url', 'register_hook', 'url']
ipdb> prepared_request.path_url
'/api/dcim/interfaces/'
ipdb> response.__content
*** AttributeError: 'Response' object has no attribute '__content'
ipdb> response._content
b'{"type":["This field is required."]}'
ipdb> response.content
b'{"type":["This field is required."]}'
ipdb> response.headers
{'Server': 'nginx', 'Date': 'Wed, 08 Jul 2020 12:36:35 GMT', 'Content-Type': 'application/json', 'Content-Length': '36', 'Connection': 'keep-alive', 'Vary': 'Accept, Cookie, Origin', 'Allow': 'GET, POST, HEAD, OPTIONS, TRACE', 'API-Version': '2.8', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'SAMEORIGIN'}
ipdb> response.reason
'Bad Request'
ipdb> response.request

ipdb> response.test
*** AttributeError: 'Response' object has no attribute 'test'
ipdb> response.text
'{"type":["This field is required."]}'
ipdb> response.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> quit
Create Netbox Interfaces
r1 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR
---- napalm_get ** changed : False --------------------------------------------- INFO
(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Creating Netbox Interface for device r1, interface Loopback1
url3=http://0.0.0.0:8080/api/dcim/devices/?name=r1&limit=0
device_id = 1
url3=http://0.0.0.0:8080/api/dcim/interfaces/
resp_ok=False resp_status=400
body_data= {'name': 'Loopback1', 'form_factor': 1200, 'device': 1}
params= /dcim/interfaces/
resp_data= {'type': ['This field is required.']}
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 43, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 35, in create_netbox_interface
device_id=device_id,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 130, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: This field is required.

So in the end I realised that I was missing the parameter “type”!!!

I was checking the Netbox documentation on GitHub but I couldn’t see clearly what kind of config I had to provide…

I checked the “type” value of the only interfaces I already had in Netbox: http://0.0.0.0:8080/api/dcim/interfaces/

So I tried to pass exactly that but it was still failing…

(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Creating Netbox Interface for device r1, interface Loopback1
url3=http://0.0.0.0:8080/api/dcim/devices/?name=r1&limit=0
device_id = 1
url3=http://0.0.0.0:8080/api/dcim/interfaces/
resp_ok=False resp_status=400
body_data= {'name': 'Loopback1', 'form_factor': 1200, 'device': 1, 'type': {'value': '1000base-t', 'label': '1000BASE-T (1GE)', 'id': 1000}}
params= /dcim/interfaces/
resp_data= {'type': ['Value must be passed directly (e.g. "foo": 123); do not use a dictionary or list.']}
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 50, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 42, in create_netbox_interface
**interface_type,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 130, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: Value must be passed directly (e.g. "foo": 123); do not use a dictionary or list.
(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$

Somehow the API had to be documented… and by chance, looking at the bottom of the Netbox page, there was an “API” link…

So, now I needed to look up the correct API call. Based on the script and logs, it was a “POST” for “/dcim/interfaces/”. Here we go!

So finally, I had the info. I confirmed which fields were mandatory and the values they needed!

interface_type = {}
interface_type["type"] = "1000base-t"
for interface_name in interfaces.keys():
    if not is_interface_present(nb_interfaces, f"{task.host}", interface_name):
        print(
            f"* Creating Netbox Interface for device {task.host}, interface {interface_name}"
        )
        device_id = get_device_id(f"{task.host}", netbox)
        print("device_id = %s" % device_id)
        netbox.dcim.create_interface(
           name=f"{interface_name}",
           form_factor=1200,  # default
           device_id=device_id,
           **interface_type,
        )
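
For reference, the equivalent raw API call is something like this (a hedged sketch; the token is a placeholder and the device id comes from the lookup above):

curl -s -X POST http://0.0.0.0:8080/api/dcim/interfaces/ \
  -H "Authorization: Token <NETBOX_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"device": 1, "name": "Loopback1", "type": "1000base-t"}'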

So the script ran fine for all my devices:

netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Create Netbox Interfaces
r1 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r2 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r3 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And it is updated in the GUI:

Netbox

Another thing I wanted to play with is Netbox, and I found a good article to follow. So credits to the authors.

Using my current ceos lab from https://github.com/thomarite/ceos-testing

I followed Rick’s article to install netbox-docker and his own repo with the nornir examples using netbox. In this case nornir is going to use netbox as the inventory; normally I use local files. I created a Python 3.7.3 venv:

mkdir netbox-example; cd netbox-example
pyenv local 3.7.3
python -m virtualenv venv
source venv/bin/activate
git clone https://github.com/netbox-community/netbox-docker.git
cd netbox-docker
vim docker-compose.yml  --> so it always exposes 8080
      nginx:
      ...
        ports:
          - 8080:8080
docker-compose pull
docker-compose up

When installing the requirements for “nornir-napalm-netbox-demo” I had to modify the versions of some packages. So I removed the pinned versions and let pip install the latest. I didn’t use the Makefile.

git clone https://github.com/rickdonato/nornir-napalm-netbox-demo
cd nornir-napalm-netbox-demo
python -m pip install -r requirements.txt

I struggled quite a bit with the management IP in netbox and the meaning of “platform”:

  • Create Manufacturers under Device Types: I created “Arista”.
  • Create Device Types under Device Types: I created “ceos”.
  • Create Platforms under Devices: this is VERY important as it has to be a supported NAPALM platform!!! So for Arista, I need “eos” (see the sketch after this list).
  • Create Device Roles under Devices: I created “pe”.
  • Create Devices under Devices.
  • Within each device: add a management interface. Here I got confused, as I was adding the interface in the inventory section, but the inventory section is info to/from the device via NAPALM. So you need to go to the bottom of the device page, add the interface there, and then add an IP to that interface and mark it as primary.
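The platform point is the one that bit me the most, so here is a minimal sketch of how those objects could be created programmatically with pynetbox (a different client from the one used in the demo repo; the names/slugs are just the ones I used above and the URL/token are placeholders):

import pynetbox

# Placeholder URL/token: adjust to your netbox-docker instance.
nb = pynetbox.api("http://0.0.0.0:8080", token="<NETBOX_API_TOKEN>")

manufacturer = nb.dcim.manufacturers.create(name="Arista", slug="arista")
device_type = nb.dcim.device_types.create(
    model="ceos", slug="ceos", manufacturer=manufacturer.id
)
# The platform is the key part: napalm_driver must be a valid NAPALM driver name ("eos").
platform = nb.dcim.platforms.create(name="eos", slug="eos", napalm_driver="eos")
role = nb.dcim.device_roles.create(name="pe", slug="pe", color="2196f3")
site = nb.dcim.sites.create(name="lab1", slug="lab1")

device = nb.dcim.devices.create(
    name="r1",
    device_type=device_type.id,
    device_role=role.id,
    platform=platform.id,
    site=site.id,
)
print(device.id)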

Keep in mind that, initially, I was using “0.0.0.0” for each device, as that’s the IP I have been using for all my scripts lately.

Keep in mind (II) that we are using docker twice (from different commands…): one to get netbox, and the other via docker(-topo) to get the Arista ceos containers…. and we have iptables rules under the hood created by both…

But, let’s go step by step. Now we need to confirm that our nornir script can connect to netbox, so follow the “Nornir-to-Netbox Configuration” section. This is my file; I updated nb_url and nb_token. Notice the usage of “transform_function“.

---
core:
  num_workers: 20
inventory:
  plugin: nornir.plugins.inventory.netbox.NBInventory
  options:
    nb_url: 'http://0.0.0.0:8080'
    nb_token: '<NETBOX_API_TOKEN>'
    ssl_verify: False
    transform_function: "helpers.adapt_user_password"

You need to update “scripts/secrets.py” with the devices you have in your inventory and the user/pass:

creds = {
    "r1": {"username": "user", "password": "pas123"},
    "r2": {"username": "user", "password": "pas123"},
    "r3": {"username": "user", "password": "pas123"},
}
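For context, the “transform_function” referenced in the config is what merges those credentials into each host pulled from netbox. The real helper lives in the demo repo; this is just a minimal sketch of the idea (assuming nornir 2.x, where host.username/host.password are settable attributes):

from secrets import creds  # scripts/secrets.py shown above


def adapt_user_password(host):
    # nornir calls this for every host loaded from the netbox inventory:
    # attach the username/password defined in secrets.py to the host object.
    if host.name in creds:
        host.username = creds[host.name]["username"]
        host.password = creds[host.name]["password"]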

So now we can test if nornir can connect to netbox:

/netbox-example/nornir-napalm-netbox-demo master$ python scripts/helpers.py --inventory
{'defaults': {'connection_options': {},
              'data': {},
              'hostname': None,
              'password': None,
              'platform': None,
              'port': None,
              'username': None},
 'groups': {},
 'hosts': {'r1': {'connection_options': {},
                  'data': {'asset_tag': 'r1',
                           'model': 'ceos',
                           'role': 'pe',
                           'serial': 'r1',
                           'site': 'lab1',
                           'vendor': 'Arista'},
                  'groups': [],
                  'hostname': '192.168.16.2',
                  'password': 'pas123',
                  'platform': 'eos',
                  'port': None,
                  'username': 'user'},
           'r2': {'connection_options': {},
                  'data': {'asset_tag': 'r2',
                           'model': 'ceos',
                           'role': 'pe',
                           'serial': 'r2',
                           'site': 'lab1',
                           'vendor': 'Arista'},
                  'groups': [],
                  'hostname': '192.168.16.3',
                  'password': 'pas123',
                  'platform': 'eos',
                  'port': None,
                  'username': 'user'},
           'r3': {'connection_options': {},
                  'data': {'asset_tag': 'r3',
                           'model': 'ceos',
                           'role': 'pe',
                           'serial': 'r3',
                           'site': 'lab1',
                           'vendor': 'Arista'},
                  'groups': [],
                  'hostname': '192.168.16.4',
                  'password': 'pas123',
                  'platform': 'eos',
                  'port': None,
                  'username': 'user'}}}

All good. Let’s see if we can get backups from the devices.
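The demo repo’s backup_configs.py boils down to a napalm_get of the running config plus a write_file per host. This is not the repo’s exact code, just a minimal sketch of the pattern with nornir 2.x plugins (the config.yaml name and the backups/ path are my assumptions):

from nornir import InitNornir
from nornir.plugins.tasks.networking import napalm_get
from nornir.plugins.tasks.files import write_file
from nornir.plugins.functions.text import print_result


def backup_config(task):
    # Pull the running config via NAPALM and dump it to a per-host file.
    r = task.run(task=napalm_get, getters=["config"])
    task.run(
        task=write_file,
        filename=f"backups/{task.host}.cfg",
        content=r.result["config"]["running"],
    )


nr = InitNornir(config_file="config.yaml")
print_result(nr.run(task=backup_config, name="Backup Device configurations"))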

netbox-example/nornir-napalm-netbox-demo master$ python scripts/backup_configs.py
Backup Device configurations**
r1 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r2 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r3 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(venv) /netbox-example/nornir-napalm-netbox-demo master$

Great, all good.

Now, let’s see netbox using NAPALM. If you click on “Status” for any device, netbox will use NAPALM to get the facts from the device. If netbox is not configured properly with NAPALM, it will fail. This is a working scenario:

The “LLDP neighbors” and “Configuration” tabs rely on NAPALM too.

So, for configuring netbox with NAPALM, you need to tell netbox the user/pass that NAPALM needs:

netbox-example$ vim netbox-docker/env/netbox.env
...
NAPALM_USERNAME=user
NAPALM_PASSWORD=pas123
NAPALM_TIMEOUT=10
...

Very likely you will have to restart netbox:

/netbox-docker release$ docker-compose down
/netbox-docker release$ docker-compose up
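By the way, the credentials themselves can be sanity-checked outside netbox by doing exactly what netbox does, i.e. opening a NAPALM session. A quick sketch, pointing at the docker-topo exposed address/port that my own scripts use (0.0.0.0:900x):

from napalm import get_network_driver

driver = get_network_driver("eos")  # must match the platform's NAPALM driver
device = driver(
    hostname="0.0.0.0",
    username="user",
    password="pas123",
    optional_args={"port": 9000},  # docker-topo exposes the cEOS eAPI (443) as 900x
)
device.open()
print(device.get_facts())
device.close()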

As mentioned before, I had an issue when I was using “0.0.0.0” as the IP. By default (it seems I can’t think) I was using the exposed IP/port from docker-topo to reach the ceos switches. I hadn’t had an issue until using netbox.

I am using docker for netbox and docker(-topo) for my Arista cEOS switches. So the connectivity between netbox and ceos is via the IPs/interfaces/bridges created by docker. And remember… you have iptables under the hood. My first mistake was telling netbox to use 0.0.0.0, as that is what I use from my scripts when connecting to ceos; netbox needs to point to the IP assigned by docker :facepalm: 192.168.16.x in my case. Second mistake, the port: docker exposes port 443 as 900x for external connections and I use 900x in my scripts, but from netbox’s point of view it is still 443 :facepalm: And finally, I am calling docker twice to build my lab, once for netbox-docker and once for ceos. You need to keep an eye on iptables changes when restarting netbox via docker-compose, because you can end up in a situation where netbox traffic is dropped in DOCKER-ISOLATION-STAGE-1 :facepalm: (need to try to write a docker-compose to build everything in one go)

So when I was getting errors from netbox saying it was rejected when connecting to the ceos devices via NAPALM, I couldn’t understand it. My scripts were fine using those details (0.0.0.0:900x).

I ran tcpdump on one ceos switch on the “ethernet0” interface and NOTHING from netbox was hitting the interface on port 900x, but my scripts could reach it…..

Somehow netbox wasn’t able to reach ceos r1??? In my head, netbox and the ceos devices were all on 0.0.0.0….. so no routing, no firewalls, they are connected in the same network 0.0.0.0…..

In the end I woke up and realised that the docker containers are using the IPs provided by docker, so traffic follows normal routing… and firewalling by iptables. The same for the ceos devices: they have real IPs (different from 0.0.0.0).

So I updated netbox with the correct management IPs for r1, r2 and r3 ceos.

When I filtered by the real netbox IP in the r1 tcpdump on ethernet0, I was seeing traffic on 900x!!! Good. Then I realised that it has to be 443, so I removed my hack of updating the port to 900x.

For a different reason I had to restart docker-topo (for ceos) and then docker netbox. And now, I couldn’t see any traffic from netbox in r1….. I “didn’t” change anything. So the routing didn’t change; there was something else “cutting” the connection: iptables.

docker uses iptables very heavily. I realised that after restarting docker netbox, iptables changed…

before restart:

# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-384488acbc99 ! -o br-384488acbc99 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN

after restart:

# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -i br-381cdff63d2f ! -o br-381cdff63d2f -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN

With the restart, docker created a new bridge interface for netbox (old: br-384488acbc99, new: br-381cdff63d2f) and, as you can see, its rule was added above the “-j ACCEPT”, so netbox traffic wasn’t hitting the “DOCKER-ISOLATION-STAGE-1 -j ACCEPT” rule anymore.

So I had to make an iptables change:

# iptables -t filter -D DOCKER-ISOLATION-STAGE-1 -j ACCEPT
# iptables -t filter -I DOCKER-ISOLATION-STAGE-1 -j ACCEPT
# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-381cdff63d2f ! -o br-381cdff63d2f -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
#

And finally, netbox could use NAPALM to contact the ceos devices…. Calling docker twice is not a great idea….

BTW, this is my docker ps with netbox and ceos devices:

(venv) /netbox-example$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c23be76ffd54 nginx:1.17-alpine "nginx -c /etc/netbo…" 4 hours ago Up 4 hours 80/tcp, 0.0.0.0:8080->8080/tcp netbox-docker_nginx_1
5a0b89f18578 netboxcommunity/netbox:latest "/opt/netbox/docker-…" 4 hours ago Up 4 hours netbox-docker_netbox_1
528948de329b netboxcommunity/netbox:latest "python3 /opt/netbox…" 4 hours ago Up 4 hours netbox-docker_netbox-worker_1
29529302ba1c redis:5-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 6379/tcp netbox-docker_redis_1
5e975ec2aa70 redis:5-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 6379/tcp netbox-docker_redis-cache_1
6158672a4ae6 postgres:11-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 5432/tcp netbox-docker_postgres_1
34841aa098d4 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2002->22/tcp, 0.0.0.0:9002->443/tcp 3node_r03
4ca92c6a3b09 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2001->22/tcp, 0.0.0.0:9001->443/tcp 3node_r02
67e8b7ab84e0 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2000->22/tcp, 0.0.0.0:9000->443/tcp 3node_r01

It was painful but I learned a couple of things about netbox, nornir and docker/iptables!!!