Routing Down on Tier-0 Service Router

This article explains what happens when a Tier-0 Service Router (SR) loses its routing adjacencies.

As we know, in the NSX world we have two gateway types:

  • T0 or Tier-0
  • T1 or Tier-1

Both gateway types are hosted on Edge Transport Nodes!
Each one is responsible for specific functions in the NSX topology and has different characteristics.
We wrote an article that shows some examples of NSX topologies and the meaning of each gateway type and its interfaces. You can access this article by clicking on the following link:
Some Examples of NSX Topologies – DPC Virtual Tips

T0 Service Router (SR)

The Tier-0 Service Router (SR) is responsible for handling the north-south traffic. To do that, uplink interfaces are created and configured on this service router. Commonly, we use dynamic routing protocols to establish adjacencies with the physical routers and provide north-south communication.

In the following picture, we have a lab topology just to explain the concept to you:

  • We have two physical routers (router01 and router02). Both are placed outside the NSX environment (outside the virtual “world”);
  • We have an edge cluster with four (4) edge transport nodes (edge01, edge02, edge03, and edge04);
  • We have a vSAN cluster with three (3) ESXi hosts, already prepared to work as host transport nodes.

As we can see in the above picture, the “transit segment” is used by the transport nodes for their internal communication, and each one is assigned an address in the 169.254.X.X network range. All host transport nodes share the same address (169.254.0.1), as we can see in the following output. This interface is the LIF (Logical Interface):

HOST-01:

vhost-comp-01.lab.local# nsxcli
vhost-comp-01.lab.local> get gateways
Wed Jan 31 2024 UTC 13:42:21.023
                                      Gateways Summary
 ------------------------------------------------------------------------------------------
               VDR UUID                LIF num   IPv4 Route num   IPv6 Route num  Max Neighbors  Current Neighbors
 c08ecb08-e21e-4d4b-9163-f66d2553bec7     4             7                5            50000              16


vhost-comp-01.lab.local> get gateway c08ecb08-e21e-4d4b-9163-f66d2553bec7 interface
Wed Jan 31 2024 UTC 13:42:23.007
                            Gateway Interfaces
---------------------------------------------------------------------------
IPv6 DAD Status Legend:  [A: DAD_Sucess], [F: DAD_Duplicate], [T: DAD_Tentative], [U: DAD_Unavailable]

LIF UUID                 : ba201756-d373-469c-a309-ee313a4eb9bc
Mode                     : [b'Routing-Backplane']
Overlay VNI              : 73728
IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)
Mac                      : 02:50:56:56:44:52
Connected DVS            : VDS-Computing
Control plane enable     : True
Replication Mode         : 0.0.0.1
Multicast Routing        : [b'Enabled', b'Oper Down']
State                    : [b'Enabled']
Flags                    : 0x90308
DHCP relay               : Not enable
DAD-mode                 : ['LOOSE']
RA-mode                  : ['SLAAC_DNS_THROUGH_RA(M=0, O=0)']
...
...

HOST-02:

vhost-comp-02.lab.local# nsxcli
vhost-comp-02.lab.local> get gateways
Wed Jan 31 2024 UTC 13:44:56.056
                                      Gateways Summary
 ------------------------------------------------------------------------------------------
               VDR UUID                LIF num   IPv4 Route num   IPv6 Route num  Max Neighbors  Current Neighbors
 c08ecb08-e21e-4d4b-9163-f66d2553bec7     4             7                5            50000              11

vhost-comp-02.lab.local> get gateway c08ecb08-e21e-4d4b-9163-f66d2553bec7 interface
Wed Jan 31 2024 UTC 13:44:58.424
                            Gateway Interfaces
---------------------------------------------------------------------------
IPv6 DAD Status Legend:  [A: DAD_Sucess], [F: DAD_Duplicate], [T: DAD_Tentative], [U: DAD_Unavailable]

LIF UUID                 : ba201756-d373-469c-a309-ee313a4eb9bc
Mode                     : [b'Routing-Backplane']
Overlay VNI              : 73728
IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)
Mac                      : 02:50:56:56:44:52
Connected DVS            : VDS-Computing
Control plane enable     : True
Replication Mode         : 0.0.0.1
Multicast Routing        : [b'Enabled', b'Oper Down']
State                    : [b'Enabled']
Flags                    : 0x90308
DHCP relay               : Not enable
DAD-mode                 : ['LOOSE']
RA-mode                  : ['SLAAC_DNS_THROUGH_RA(M=0, O=0)']
...
...

HOST-03:

vhost-comp-03.lab.local# nsxcli
vhost-comp-03.lab.local> get gateways
Wed Jan 31 2024 UTC 13:46:14.946
                                      Gateways Summary
 ------------------------------------------------------------------------------------------
               VDR UUID                LIF num   IPv4 Route num   IPv6 Route num  Max Neighbors  Current Neighbors
 c08ecb08-e21e-4d4b-9163-f66d2553bec7     4             7                5            50000              11


vhost-comp-03.lab.local> get gateway c08ecb08-e21e-4d4b-9163-f66d2553bec7 interface
Wed Jan 31 2024 UTC 13:46:20.098
                            Gateway Interfaces
---------------------------------------------------------------------------
IPv6 DAD Status Legend:  [A: DAD_Sucess], [F: DAD_Duplicate], [T: DAD_Tentative], [U: DAD_Unavailable]

LIF UUID                 : ba201756-d373-469c-a309-ee313a4eb9bc
Mode                     : [b'Routing-Backplane']
Overlay VNI              : 73728
IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)
Mac                      : 02:50:56:56:44:52
Connected DVS            : VDS-Computing
Control plane enable     : True
Replication Mode         : 0.0.0.1
Multicast Routing        : [b'Enabled', b'Oper Down']
State                    : [b'Enabled']
Flags                    : 0x90308
DHCP relay               : Not enable
DAD-mode                 : ['LOOSE']
RA-mode                  : ['SLAAC_DNS_THROUGH_RA(M=0, O=0)']
...
...
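
To make the "same address on every host" point easy to verify, here is a minimal Python sketch. It assumes we have copied the IP/Mask line from each host's output into a dictionary; the names HOST_OUTPUTS, backplane_ipv4, and shared_address are ours, not part of NSX:

```python
import re

# "IP/Mask" lines copied from the three host outputs above.
HOST_OUTPUTS = {
    "vhost-comp-01": "IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)",
    "vhost-comp-02": "IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)",
    "vhost-comp-03": "IP/Mask                  : 169.254.0.1/28;  fe80::50:56ff:fe56:4452/128(U)",
}


def backplane_ipv4(cli_text):
    """Return the first IPv4 address/prefix on an 'IP/Mask' line, or None."""
    match = re.search(r"IP/Mask\s*:\s*(\d+\.\d+\.\d+\.\d+/\d+)", cli_text)
    return match.group(1) if match else None


def shared_address(outputs):
    """Return the address if every host reports the same one, else None."""
    addresses = {backplane_ipv4(text) for text in outputs.values()}
    return addresses.pop() if len(addresses) == 1 else None


print(shared_address(HOST_OUTPUTS))  # 169.254.0.1/28
```

If one host reported a different address, shared_address would return None, which would be a hint to look at that host more closely.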

Each edge node has an interface (the backplane port) with an address on this network, as we can see in the following outputs:

EDGE-01:

    Interface     : 6289a5dc-65c2-40a2-b6bc-53f3bd7b4770
    Ifuid         : 275
    Name          : bp-sr0-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-275
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.2/25;fe80::50:56ff:fe56:5300/64(NA)
    MAC           : 02:50:56:56:53:00
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : 1fb264c0-ecd6-4dcc-9856-0f759ec65e62
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :

EDGE-02:

    Interface     : 1207a97d-691a-4db7-b1e0-c9097efb30c8
    Ifuid         : 275
    Name          : bp-sr1-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-275
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.3/25;fe80::50:56ff:fe56:5301/64(NA)
    MAC           : 02:50:56:56:53:01
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : df2e0cc7-606c-4100-91e3-ed32ba081d78
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :

EDGE-03:

    Interface     : 3aad399a-bb37-4d82-81d5-6111b4802863
    Ifuid         : 285
    Name          : bp-sr2-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-285
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.4/25;fe80::50:56ff:fe56:5302/64(NA)
    MAC           : 02:50:56:56:53:02
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : 61a1d1b8-3d43-4e01-80e2-1918213b055f
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :

EDGE-04:

    Interface     : 08b7faef-3aab-421d-9018-a011c7ed1bad
    Ifuid         : 299
    Name          : bp-sr3-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-299
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.5/25;fe80::50:56ff:fe56:5303/64(NA)
    MAC           : 02:50:56:56:53:03
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : fda7ed2e-dcd1-4e04-8950-dfab9b7f3500
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :
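
As a quick sanity check, the sketch below uses Python's standard ipaddress module to confirm that the four backplane addresses above sit in one link-local network that also contains the host LIF address 169.254.0.1. The dictionary and function names are only illustrative:

```python
import ipaddress

# IPv4 part of the "IP/Mask" field of each backplane port shown above.
EDGE_BACKPLANE = {
    "edge-01": "169.254.0.2/25",
    "edge-02": "169.254.0.3/25",
    "edge-03": "169.254.0.4/25",
    "edge-04": "169.254.0.5/25",
}
HOST_LIF = "169.254.0.1"  # address shared by the host transport nodes


def same_transit_network(edge_ports, host_ip):
    """True if all edge ports share one link-local network containing host_ip."""
    networks = {ipaddress.ip_interface(v).network for v in edge_ports.values()}
    if len(networks) != 1:
        return False
    network = networks.pop()
    return network.is_link_local and ipaddress.ip_address(host_ip) in network


print(same_transit_network(EDGE_BACKPLANE, HOST_LIF))  # True
```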

Additionally, we can access the service router and confirm its BGP neighbors.
For edge-01, for instance, we have both physical routers as neighbors (IPs 10.0.22.1 and 10.0.23.1):

edge-01> get logical-routers
Wed Jan 31 2024 UTC 14:03:32.358
Logical Router
UUID                                   VRF    LR-ID  Name                              Type                        Ports   Neighbors
736a80e3-23f6-5a2d-81d6-bbefb2786666   0      0                                        TUNNEL                      3       3/5000
20169f10-8099-4d27-8915-8e1e0d47e68e   1      2050   DR-T0-GW                          DISTRIBUTED_ROUTER_TIER0    5       1/50000
c9e7c979-0d69-4aab-bd6c-9ee863c96070   2      3074   SR-T0-GW                          SERVICE_ROUTER_TIER0        7       5/50000

edge-01> vrf 2

edge-01(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            AD - Admin down, DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 10.0.22.11  Local AS: 65000

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

10.0.22.1                           65100       Estab 02:16:15     UP  1594    2004    6      6
10.0.23.1                           65100       Estab 02:14:50     UP  1594    2002    6      12

The same applies to the other edge nodes.
Note that we are using BFD (Bidirectional Forwarding Detection) on both BGP peers. BFD helps us quickly detect whether a peer is alive or not.
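
If we want to check this state from a script instead of reading the table by eye, a minimal sketch could look like the one below. It assumes the text of "get bgp neighbor summary" has already been captured into a string (how it is collected is outside the scope of this example), and the function names are hypothetical:

```python
# Neighbor table copied from the edge-01 output above.
SUMMARY = """\
Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

10.0.22.1                           65100       Estab 02:16:15     UP  1594    2004    6      6
10.0.23.1                           65100       Estab 02:14:50     UP  1594    2002    6      12
"""


def parse_neighbors(text):
    """Extract neighbor address, BGP state, and BFD state from each data row."""
    neighbors = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have nine columns and start with an IPv4 address.
        if len(fields) == 9 and fields[0].count(".") == 3:
            neighbors.append(
                {"neighbor": fields[0], "state": fields[2], "bfd": fields[4]}
            )
    return neighbors


def all_healthy(neighbors):
    """True only if there is at least one peer and all are Estab with BFD UP."""
    return bool(neighbors) and all(
        n["state"] == "Estab" and n["bfd"] == "UP" for n in neighbors
    )


print(all_healthy(parse_neighbors(SUMMARY)))  # True
```

A result of False for a service router would mean that at least one uplink adjacency or BFD session is not healthy.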

Simulating a Routing Failure on Edge-01

We will simulate a routing failure on the service router hosted on the edge transport node 01 (edge-01).
To simulate this failure, we will:

  • Administratively shut down the BGP peering with edge-01 on each physical router;
  • Remove the BFD configuration for these peers on both physical routers as well.
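
The two steps above can be sketched as a small Python helper that builds the VyOS commands for each physical router. The "set" and "delete" lines are the ones used later in this article; wrapping them in configure, commit, and exit is our assumption about a typical VyOS configuration session, and the helper name is hypothetical:

```python
# BGP peer address of edge-01 as seen from each physical router.
PEERS_TO_EDGE01 = {"router01": "10.0.22.11", "router02": "10.0.23.11"}


def failure_commands(peer_ip):
    """Build the VyOS commands that shut down one BGP peer and remove its BFD."""
    return [
        "configure",  # assumed: enter configuration mode
        f"set protocols bgp neighbor {peer_ip} shutdown",
        f"delete protocols bgp neighbor {peer_ip} bfd",
        "commit",  # assumed: apply the change
        "exit",
    ]


for router, peer in PEERS_TO_EDGE01.items():
    print(f"# {router}")
    print("\n".join(failure_commands(peer)))
```

The helper only prints the commands; pasting them into each router is still a manual step in this lab.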

What is the expected behavior?
The LIF (Logical Interface) IP address of edge-01 will be moved temporarily to another edge transport node!

The picture below shows the failure scenario. In this example, the LIF from edge-01 moves to edge-02, but any other edge node can be selected to receive this interface, since the selection is an automatic process:

First, let's check the BGP peers on each physical router. The peers 10.0.22.11 and 10.0.23.11 belong to edge-01 (on VLAN 22 and VLAN 23, respectively):

PHYSICAL ROUTER-01:

vyos@vyos-a:~$ show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.168.255.8, local AS number 65100 vrf-id 0
BGP table version 35
RIB entries 21, using 4032 bytes of memory
Peers 7, using 143 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.0.20.2       4      65100      1706      1706       35    0    0 1d04h13m           10       10 N/A
10.0.21.2       4      65100      1706      1706       35    0    0 1d04h13m           10       10 N/A
10.0.22.11      4      65000      1591      1605       35    0    0 02:27:53            6       12 N/A
10.0.22.12      4      65000      1654      1665       35    0    0 1d03h22m            6       12 N/A
10.0.22.13      4      65000      1655      1662       35    0    0 1d03h21m            6       12 N/A
10.0.22.14      4      65000      1651      1662       35    0    0 1d03h21m            6       12 N/A
10.0.200.1      4      65100      1698      1707       35    0    0 1d04h14m            1       10 N/A

PHYSICAL ROUTER-02:

vyos@vyos-b:~$ show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.168.255.9, local AS number 65100 vrf-id 0
BGP table version 35
RIB entries 21, using 4032 bytes of memory
Peers 7, using 143 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.0.20.1       4      65100      1707      1707       35    0    0 1d04h14m           10       10 N/A
10.0.21.1       4      65100      1707      1707       35    0    0 1d04h14m           10       10 N/A
10.0.23.11      4      65000      1606      1608       35    0    0 02:27:47            6       12 N/A
10.0.23.12      4      65000      1659      1666       35    0    0 1d03h23m            6       12 N/A
10.0.23.13      4      65000      1653      1665       35    0    0 1d03h23m            6       12 N/A
10.0.23.14      4      65000      1657      1665       35    0    0 1d03h23m            6       12 N/A
10.0.201.1      4      65100      1698      1707       35    0    0 1d04h14m            1       10 N/A

Next, we disable the BGP peers for edge-01 and remove the BFD configuration as well.
Note that the BGP peer state becomes “Idle (Admin)”, which means the peer was administratively disabled:

PHYSICAL ROUTER-01:

set protocols bgp neighbor 10.0.22.11 shutdown
delete protocols bgp neighbor 10.0.22.11 bfd

vyos@vyos-a:~$ show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.168.255.8, local AS number 65100 vrf-id 0
BGP table version 35
RIB entries 21, using 4032 bytes of memory
Peers 7, using 143 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.0.20.2       4      65100      1711      1711       35    0    0 1d04h18m           10       10 N/A
10.0.21.2       4      65100      1711      1711       35    0    0 1d04h18m           10       10 N/A
10.0.22.11      4      65000      1596      1610        0    0    0 00:00:10 Idle (Admin)        0 N/A
10.0.22.12      4      65000      1659      1670       35    0    0 1d03h27m            6       12 N/A
10.0.22.13      4      65000      1661      1668       35    0    0 1d03h27m            6       12 N/A
10.0.22.14      4      65000      1657      1668       35    0    0 1d03h27m            6       12 N/A
10.0.200.1      4      65100      1703      1712       35    0    0 1d04h19m            1       10 N/A

PHYSICAL ROUTER-02:

set protocols bgp neighbor 10.0.23.11 shutdown
delete protocols bgp neighbor 10.0.23.11 bfd

vyos@vyos-b:~$ show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.168.255.9, local AS number 65100 vrf-id 0
BGP table version 35
RIB entries 21, using 4032 bytes of memory
Peers 7, using 143 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.0.20.1       4      65100      1713      1713       35    0    0 1d04h20m           10       10 N/A
10.0.21.1       4      65100      1713      1713       35    0    0 1d04h20m           10       10 N/A
10.0.23.11      4      65000      1613      1614        0    0    0 00:00:08 Idle (Admin)        0 N/A
10.0.23.12      4      65000      1665      1672       35    0    0 1d03h29m            6       12 N/A
10.0.23.13      4      65000      1659      1671       35    0    0 1d03h29m            6       12 N/A
10.0.23.14      4      65000      1663      1671       35    0    0 1d03h29m            6       12 N/A
10.0.201.1      4      65100      1704      1713       35    0    0 1d04h20m            1       10 N/A
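
To spot the administratively disabled peers without scanning the whole table, we can filter the summary rows. This is only a sketch, assuming the "show ip bgp summary" text was saved into a string; the function name is ours:

```python
# Two rows copied from the router-01 summary above.
ROWS = """\
10.0.22.11      4      65000      1596      1610        0    0    0 00:00:10 Idle (Admin)        0 N/A
10.0.22.12      4      65000      1659      1670       35    0    0 1d03h27m            6       12 N/A
"""


def admin_down_peers(text):
    """Return the neighbor addresses whose state column shows 'Idle (Admin)'."""
    down = []
    for line in text.splitlines():
        if "Idle (Admin)" in line:
            down.append(line.split()[0])  # first column is the neighbor address
    return down


print(admin_down_peers(ROWS))  # ['10.0.22.11']
```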

Checking the BGP peer status on edge-01. Both peers are “Active” and not “Established”:

edge-01(tier0_sr[2])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            AD - Admin down, DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 10.0.22.11  Local AS: 65000

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

10.0.22.1                           65100       Activ 00:03:46     DC  1610    2043    0      0
10.0.23.1                           65100       Activ 00:01:39     DC  1613    2032    0      0

Now, checking the backplane interfaces on each edge transport node, we can see (in this case) that the LIF IP address of edge-01 (169.254.0.2) has been assigned to edge-03, as the following output confirms:

    Interface     : 3aad399a-bb37-4d82-81d5-6111b4802863
    Ifuid         : 285
    Name          : bp-sr2-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-285
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.2/25;169.254.0.4/25;fe80::50:56ff:fe56:5300/64(NA);fe80::50:56ff:fe56:5302/64(NA)
    MAC           : 02:50:56:56:53:02
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : 61a1d1b8-3d43-4e01-80e2-1918213b055f
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :

With the above output, we confirmed that the LIF was moved to another edge transport node and the forwarding and routing processes can continue through this IP 169.254.0.2!
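
The same check can be done programmatically. The sketch below compares the IPv4 addresses on a backplane port with the "home" address each edge had before the failure and reports which edge's address was adopted. The mapping and function names are illustrative and not part of the NSX CLI:

```python
import re

# Backplane address of each edge before the failure (from the earlier outputs).
HOME_ADDRESS = {
    "edge-01": "169.254.0.2",
    "edge-02": "169.254.0.3",
    "edge-03": "169.254.0.4",
    "edge-04": "169.254.0.5",
}

# "IP/Mask" field of bp-sr2-port on edge-03 after the failure.
EDGE03_IP_MASK = (
    "169.254.0.2/25;169.254.0.4/25;"
    "fe80::50:56ff:fe56:5300/64(NA);fe80::50:56ff:fe56:5302/64(NA)"
)


def ipv4_addresses(ip_mask_field):
    """Return every IPv4 address (without prefix length) in an IP/Mask field."""
    return re.findall(r"(\d+\.\d+\.\d+\.\d+)/\d+", ip_mask_field)


def adopted_from(edge, ip_mask_field, home):
    """List the edges whose home address now lives on this edge's port."""
    owners = {addr: name for name, addr in home.items()}
    return [
        owners[addr]
        for addr in ipv4_addresses(ip_mask_field)
        if addr in owners and owners[addr] != edge
    ]


print(adopted_from("edge-03", EDGE03_IP_MASK, HOME_ADDRESS))  # ['edge-01']
```

An empty list means the port only carries its own address, which is what we expect when every service router has healthy routing adjacencies.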

Removing the Routing Failure on Edge-01

Once the routing peers on edge-01 are recovered, the LIF IP address moves back to edge-01 immediately:

edge-01> get gateways
Wed Jan 31 2024 UTC 14:36:51.000
Gateway
UUID                                   VRF    Gateway-ID   Name                              Type                        Ports   Neighbors
736a80e3-23f6-5a2d-81d6-bbefb2786666   0      0                                              TUNNEL                      3       3/5000
20169f10-8099-4d27-8915-8e1e0d47e68e   1      2050         DR-T0-GW                          DISTRIBUTED_ROUTER_TIER0    5       1/50000
c9e7c979-0d69-4aab-bd6c-9ee863c96070   2      3074         SR-T0-GW                          SERVICE_ROUTER_TIER0        7       5/50000


edge-01> get gateway 20169f10-8099-4d27-8915-8e1e0d47e68e interfaces

...
...
Interface     : 6289a5dc-65c2-40a2-b6bc-53f3bd7b4770
    Ifuid         : 275
    Name          : bp-sr0-port
    Fwd-mode      : IPV4_ONLY
    Internal name : downlink-275
    Mode          : lif
    Port-type     : backplane
    IP/Mask       : 169.254.0.2/25;fe80::50:56ff:fe56:5300/64(NA)
    MAC           : 02:50:56:56:53:00
    VNI           : 66561
    Access-VLAN   : untagged
    Segment port  : 1fb264c0-ecd6-4dcc-9856-0f759ec65e62
    Urpf-mode     : NONE
    DAD-mode      : LOOSE
    RA-mode       : RA_INVALID
    Admin         : up
    Op_state      : up
    MTU           : 1500
    arp_proxy     :

...
...
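
The article does not list the commands used to undo the failure. As a sketch, we assume the recovery simply inverts the two commands applied earlier on each physical router (delete the shutdown statement and set BFD again), wrapped in an assumed configure/commit session. Treat these lines as an assumption to validate in your own lab:

```python
# BGP peer address of edge-01 as seen from each physical router.
PEERS_TO_EDGE01 = {"router01": "10.0.22.11", "router02": "10.0.23.11"}


def recovery_commands(peer_ip):
    """Build VyOS commands that re-enable one BGP peer and its BFD (assumed)."""
    return [
        "configure",  # assumed: enter configuration mode
        f"delete protocols bgp neighbor {peer_ip} shutdown",  # inverse of 'set ... shutdown'
        f"set protocols bgp neighbor {peer_ip} bfd",  # inverse of 'delete ... bfd'
        "commit",  # assumed: apply the change
        "exit",
    ]


for router, peer in PEERS_TO_EDGE01.items():
    print(f"# {router}")
    print("\n".join(recovery_commands(peer)))
```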

Wrapping This Up

In this simple example, we can see what happens to the T0 LIF when we have a routing issue. The LIF IP address is moved to another edge transport node to avoid a massive update of the routing tables on all devices. It is an intelligent way to keep the traffic flowing through this interface without interruption (or with minimal interruption).