EVE-NG: Arista Lab

As my last attempt to build a MPLS-SR Arista lab failed usin cEOS. I decided to try a different approach as I need more resources that my laptop has. For sometime, I wanted to use tesuto but I am not sure if it is still on business. From the main page, you can’t find any link to register (and pay) for the service. Although if you search for “pricing” you can find a link to that. That’s it.

The other option was to use EVE-NG. You can use it in your own bare-metal server or in the cloud.

So finally, I decided to spend some money. I signed up for GCP with a $300 free computing offer. So at least I dont pay for GCP yet and then I bought one year of EVE-NG professional. Let’s see how it goes.

Before buying the license, you need to install eve-ng. So I followed the official documentation to use it in GCP as it is quite up to date.

I consulted other links too just to compare other users experiences like these:

https://github.com/NetDevNotes/Eve-NG-in-Google-Cloud

I had an issue during the process. When I had to configure DHCP, the IP wizard was showing garbage in the script. Hopefully I didnt have to add anything just accept all default values.

So once it is done, you need to https to the VM…. it didnt work. Somehow “apache” was started. So after startup, got access. I can login and change the default password.

root@eveng01:/var/www/html# service apache2 start
root@eveng01:/var/www/html# service apache2 status

So far, I am not planning to give it a static IP to the VM and a FQDN from my domain. Maybe in the future if I use it often.

Now, I need to create the Arista lab. I followed one of the links earlier, it was quite handy.

I created my small 3 nodes lab, apply the config. All this with a couple of reboots in each device and you have the lab up and running!

It is nice to work in a system with plenty of RAM. The VM has 60GB of RAM and 16vCPU. So I should be able to create a lab with 14 vEOS (each one needs 4GB and 1CPU).

$ top
top - 13:00:27 up 1:33, 1 user, load average: 2.12, 1.37, 1.04
Tasks: 266 total, 1 running, 168 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.3 us, 5.9 sy, 0.0 ni, 83.4 id, 0.0 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem : 10.2/61838576 [ ]
KiB Swap: 0.0/0 [ ]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27623 root 20 0 3034100 1.992g 25696 S 100.4 3.4 11:21.40 qemu-system-x86
26120 root 20 0 3034100 1.951g 26068 S 100.0 3.3 8:54.66 qemu-system-x86
24536 root 20 0 3034100 1.915g 26072 S 43.3 3.2 9:16.11 qemu-system-x86
245 root 25 5 0 0 0 S 8.2 0.0 2:05.36 uksmd
7500 www-data 20 0 377908 30744 12732 S 4.5 0.0 0:17.27 apache2
4262 root 20 0 1138416 15732 13508 S 0.8 0.0 0:25.40 janus
5526 tomcat8 20 0 5925452 348168 17676 S 0.8 0.6 0:43.17 java
159 root 20 0 0 0 0 I 0.4 0.0 0:01.13 kworker/6:1-eve
4363 mysql 20 0 2493932 85712 20408 S 0.4 0.1 0:10.80 mysqld
7210 www-data 20 0 377900 31024 12724 S 0.4 0.1 0:07.08 apache2

Unfortunately, I am hitting the same problem, and this time, the MAC addresses are the ones you expect to see based on the interface outputs:

I have asked again Arista if this is expected…

In the main time, I need to learn how to map the devices in the VM to external ports so I can access directly from my laptop.

UPDATE

My Arista SE confirmed that cEOS doesnt support MPLS Data Plane. And this should work with vEOS. So I asked in Arista forum about this problem with vEOS and turns out that this works but you need to be sure that a “physical” interface is attached to the VRF, a Loopback or SVI is not enough.

This seems to be the original post about the problem:

https://eos.arista.com/forum/see-bgp-routes-unable-to-ping/

So I just added a VPC to et3 in each device in CUST-A VRF and I can ping across VRFs!!!

r4#ping vrf CUST-A 192.168.0.2 source 192.168.0.1
PING 192.168.0.2 (192.168.0.2) from 192.168.0.1 : 72(100) bytes of data.
80 bytes from 192.168.0.2: icmp_seq=1 ttl=65 time=70.9 ms
80 bytes from 192.168.0.2: icmp_seq=2 ttl=65 time=64.3 ms
80 bytes from 192.168.0.2: icmp_seq=3 ttl=65 time=58.2 ms
80 bytes from 192.168.0.2: icmp_seq=4 ttl=65 time=50.6 ms
80 bytes from 192.168.0.2: icmp_seq=5 ttl=65 time=58.6 ms
--- 192.168.0.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 47ms
rtt min/avg/max/mdev = 50.613/60.554/70.943/6.786 ms, pipe 5, ipg/ewma 11.817/65.414 ms
r4#

And the funny thing. I can’t see anymore the MPLS packets in the tcpdump 🙂

Anyway, good news, I can carry on creating more complex labs and test some scripting/automation stuff.

This is the latest diagram:

FTP Passive

I have a supplier at my employer that requires to use a FTP server to send big files when you open a support ticket. For a long time (a couple of years) whenever I had to upload big files, I had to use my personal VM because my ftp connections failed from the office. I always blamed the super-smart firewall.

One day, I decided to fix the issue and allow the connection in our corporate firewall. I failed. Still couldnt upload files from the office. So keep using my personal VM.

This week I had to upload again a big file. This time I am working from home, so pretty much it is going to work the upload. Wrong! It fails. Ok, I checked a bit and got to the conclusion that it is my ISP or modem at home that is blocking FTP. Most ISP use CGN to stretch as much as possible the limited IPv4. I have IPv6 at home and my VM has IPv6 too… but the ftp server doesnt.

I checked the internet if there was any know issue with my ISP and FTP connections. No luck. I connected to my modem, nothing obvious messing around with FTP.

I decided to give it a proper go to this issue. I knew that it worked from my VM and it didnt from home. I noticed that I was running the same ftp client version in the VM and at home. So let’s debug the ftp client and take a packet capture in both locations.

CLI from the VM:

$ ftp -vd b.b.b.b
ftp: setsockopt: Bad file descriptor
Name: ftp
---> USER ftp
331 Please specify the password.
Password:
---> PASS XXXX
230 Login successful.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd support
---> CWD support
250 Directory successfully changed.
ftp> cd 211211
---> CWD 211211
250 Directory successfully changed.
ftp> put TEST.txt
local: TEST.txt remote: TEST.txt
---> TYPE I
200 Switching to Binary mode.
ftp: setsockopt (ignored): Permission denied
---> PORT a,a,a,a,162,57
200 PORT command successful. Consider using PASV.
---> STOR TEST.txt
150 Ok to send data.
226 Transfer complete.
28 bytes sent in 0.00 secs (854.4922 kB/s)
ftp> quit
---> QUIT

And this is the packet capture:

After typing “put” in packet 33, I see a “PASV” message from the server and a new connection (initiated by the server!) is established for the data transfer. All good.

So now, make the same from home and compare.

CLI from home without debug:

$ ftp b.b.b.b
Connected to b.b.b.b.
Name: ftp
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd support
250 Directory successfully changed.
ftp> cd 211211
250 Directory successfully changed.
ftp> put TEST.txt
local: TEST.txt remote: TEST.txt
500 Illegal PORT command.
ftp: bind: Address already in use
ftp> quit
221 Goodbye.

CLI from home with debug:

$ ftp -vd b.b.b.b
ftp: setsockopt: Bad file descriptor
Name: ftp
---> USER ftp
331 Please specify the password.
Password:
---> PASS XXXX
230 Login successful.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd support
---> CWD support
250 Directory successfully changed.
ftp> cd 211211
---> CWD 211211
250 Directory successfully changed.
ftp> put TEST.txt
local: TEST.txt remote: TEST.txt
---> TYPE I
200 Switching to Binary mode.
ftp: setsockopt (ignored): Permission denied
---> PORT 192,168,1,158,202,145
500 Illegal PORT command.
ftp: bind: Address already in use
ftp> quit
---> QUIT
221 Goodbye.

So with and without debug I keep seeing “ftp: bind: Address already in use”…..

And this is the packet capture from home:

So after I type “put” in packet 32, the answer from the server is a “500”.

I wasnt clearly paying attention to the clues. I was still banging my head why the server was sending a “500 Ilegal PORT command”.

I was comparing both captures and both debug outputs… but still didnt it.

I thought I understood FTP. I knew that you use port TCP 21 to establish the control session and the data session / transfer is via new TCP session using a random port. That’s one of the reasons that using NAT or CGN can screw up your FTP sessions.

So I assumed that the issues wasnt my ISP. So it had to be my side (or me).

So finally, I decided to search for “ftp: bind: Address already in use” as it was the message that came up with and without debugging.

Oh boy, first entry in the face!

https://www.linuxquestions.org/questions/linux-distributions-5/problems-with-ftp-server-bind-address-allready-in-use-213509/

An entry from 2004…. it can’t fix my problem for sure…. keep reading and update from 2020… it says it works…. oh boy II

try using a passive connection with "ftp -p" instead, see if it helps...

There we go:

$ ftp -vdp b.b.b.b
ftp: setsockopt: Bad file descriptor
Name: ftp
---> USER ftp
331 Please specify the password.
Password:
---> PASS XXXX
230 Login successful.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd support
---> CWD support
250 Directory successfully changed.
ftp> cd 211211
---> CWD 211211
250 Directory successfully changed.
ftp> put TEST.txt
local: TEST.txt remote: TEST.txt
---> TYPE I
200 Switching to Binary mode.
ftp: setsockopt (ignored): Permission denied
---> PASV
227 Entering Passive Mode (b,b,b,b,46,248).
---> STOR TEST.txt
150 Ok to send data.
226 Transfer complete.
26 bytes sent in 0.00 secs (12.5386 kB/s)
ftp> quit
---> QUIT
221 Goodbye.

it worked !!!

I felt embarrassed. Time to search for FTP passive vs active…

Really good explanation. I hope I will never forget it.

  • FTP Active: The client issues a PORT command to the server signalling that it will “actively” provide an IP and port number so the server opens the Data Connection back to the client.
  • FTP Passive: The client issues a PASV command to indicate that it will wait “passively” for the server to supply an IP and port number, after which the client opens a Data Connection to the server.

So it worked in my VM because somehow the ftp server sent a PASV command (maybe because it detects there is no NAT as I have a public IP???).

From home, it failed because, by default, the connection is ftp active, so when the server tried to open the new data connection to me(something I couldnt see in the packet capture…) it failed as my ADSL modem wouldnt allow inbound connections.

Once I enabled “-p” in my connection to the server, all worked because it was me who started the new data connection and my firewall allows everything outbound.

Happy to solve the problem after a couple of years, and after a couple of hours of “serious” troubleshooting. It was shocking how blind I was. I had the ftp error message and the PASV from the trace.

Anyway, I learned something new.

BGP-Free Core

This week I have been following a discussion in NANOG about LDPv6 (there are lot of emails but it is VERY interesting) and I realized that I didnt recognize the term “BGP-Free Core”. So I searched about it. It seems it wasnt an obscure subject and funny enough I have used that design in my MPLS labs in GNS3… So what is BGP-Free core? These are the links I read:

https://blog.ipspace.net/2012/01/bgp-free-service-provider-core-in.html

And this is my favourite.

As in my basic MPLS lab, we only use BGP between PEs, and the P router only does IGP and LDP, it doesnt have to know anything about VRFs.

So for that reason, you need to increase the MTU in your links (4bytes per MPLS label) and link usage increases for the extra overhead.

So it is important to know stuff but as well how to name that stuff 😛

Indistractable

Just finished reading this book. I wanted to follow up with more info about how to improve my concentration and attention after “Deep Work”. This book is more dense. I liked the first part as there was a strong focus in the person’s psychology for distraction. We have our internal and external triggers that push us to traction (what we have to do) or distraction, and we need to identify those triggers. We need to master our internal triggers and hack back the external ones (email, app notifications, etc). We need to make time for traction and prevent distraction. And a simple timetable can give you visibility to where you are “spending” your time. Even more, you can adjust the timetable to be aligned with your values. I see connections with meditation and that works for me.

And the attention needs to start with ourselves. Then the important people around us and finally work.

I enjoyed the examples of companies like Slack to help employees to disconnect and be productive. And how important is “psychological safety” in a team..

Nowadays it is the social media the evil for our lack of attention. But in the past we have had others like television, video games, radios, books, etc. It seems Socrates complained about the written word. So there is nothing new. I liked the example of Tantalus’ curse. And now I understand the curse. He was trying to reach for things that he didnt actually need.

This is a nice screensaver:

“What we fear doing most is usually what we most need do”

And at the end there is a section for kids that I think it is very useful and original.

In summary, I have enjoyed the book and gives me more reasons to carry on my goal of better focus ( deep work / indistractable).

I hope I re-read this book at some time in the future.

Docker MTU + Docker tcpdump

I am troubleshooting an issue in a docker setup with some Arista cEOS where I can’t ping inside a VRF. First I though it was a MTU issue as when you use MPLS, there is an extra tag in the L2 frame.

…But my pings weren’t that big.

Still wanted to increase the MTU because that’s the expected thing to do in your WAN links if you run MPLS and want your users in different VRFs to be able to use the full 1500 bytes.

After some searching, It seems you can change the default value using the config file as per this link:

$ ip link show docker0
9: docker0: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:be:73:8c:d3 brd ff:ff:ff:ff:ff:ff
$ cat /etc/docker/daemon.json
{
"data-root": "/home/somebody/storage/docker",
"mtu": 1600
}
$ sudo service docker restart
..
$ ip link show docker0
9: docker0: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:fb:c0:cf:a2 brd ff:ff:ff:ff:ff:ff

And restart docker. But still had mtu 1500. Checking another link it seems I actually need to create a container so the bridge come up with the new value

$ docker run -d busybox top
...
9: docker0: mtu 1600 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:fb:c0:cf:a2 brd ff:ff:ff:ff:ff:ff

Funny thing, once I started my lab again (using docker-topo) still got MTU 1500!!!

Will have to dig a bit why docker-topo doesnt take the docker mtu 1600 from the config file.

Solution: docker-topo is creating user-defined bridges, so it needs to be told that the mtu is different. The “mtu:1600” in the docker config it is only for the default bridge so when you start the busybox, it is attached to the default bridge and you see 1600.

The other thing I was curious was if I could tcpdump the networks created by docker.

Yes, you can!

# docker network ls

# ifconfig 

# tcpdump -i br-xxxx 

Arancini

Sometime ago I tried these typical Italian rice bowls and like them. So I wanted to give it a go one day. The ones I tried had spinach and cheese inside. I quick search showed me videos for arancini but with meat. It looks good but wanted to try the spinach version so I went a bit free style.

These are the videos used as reference: link1 and link2

Ingredients for the rice balls

  • 1 and 1/2 cup of arborio/paella rice
  • 3 cups of boiling water
  • half onion chopped + splash of olive oil
  • pinch of sea salt
  • knob of butter
  • 1 tsp of tumeric

Process

  • Fry the onion with the oil in a deep pan until soft
  • Add the rice and mix all together for a minute
  • Add the boiling water, salt, butter and tumeric
  • Cook at middle temperature and stir often
  • Once all liquid is absorbed, spread the rice in a tray to cool down.

Ingredients for the filling

  • 500g of washed spinachs
  • half onion chopped + splash of olive oil
  • 1 garlic clove
  • splash of milk
  • 1 big tsp flour

Process

  • Fry the onion with the olive oil until soft.
  • Add the garlic and fry until golden
  • Add the spinach. They will reduce quite a lot.
  • Stir often and once the spinach are like a paste, add the milk and flour
  • Remove from heat and let it cool down

Frying the rice balls

Ingredients

  • 2 eggs
  • Breadcrumbs
  • sunflower oil (never through olive in the sink please!)
  • Cheese

Process

  • Heat up a deep pan with the sunflower oil.
  • have a plate with the breadcrumbs and another with the mixed eggs
  • For making the balls, as per videos, wet your hands, make a decent ball, and make a hole with a finger.
  • Fill the hole with the cheese and spinach.
  • Cover the ball with a bit more rice and follow the technique to shape it like an egg
  • Pass the ball by the egg, then breadcrumbs and finally into the hot oil.
  • Fry until golden

This is my result:

Veredict:

To be honest, they look as I remembered but my spinach filling wasnt as great as the ones I tried.

I think I need to use mozarella cheese and add something else to the spinach mix (salt? nutmeg?)

Next time I will try to find the Arancini recipe with spinach.

As usual, with practice, comes mastery.

Will try again.

Stoicism: How To Be Free

I finished this short book about Stoicism. As I have been meditating for over a year, I am interested in ways to keep learning and improving my quality/health of mind. I like feeling fit in my body, and my mind.

After watching some short videos about Stoicism, I liked the ideas and felt they can fit in my way of thinking.

Most of the times, Epictetus and Marcus Aurelius are the most common figures mentioned about Stoicism so I tried something written from them.

I went to the “How To Be Free” as the main source from Epictetus. I learned that he was a Greek born slave from the Roman times who earned his freedom and became a philosopher. Who more entitled to write about freedom that a slave? And I didn’t know that the stoicism had started some centuries earlier, around 300BC in Greece. As well, the “Encheiridion” was actually written by Arrian (I read a book about Alexander The Great and didnt know about his philosophy side) that was one student of Epictetus.

The book centres in what it is under our control and what is not. Things that we control are just inside us, and they are the ones that makes free (and content). As soon as you start to give away that control to outside things, you are doomed to suffering. It can be brutal in some cases. If a love one dies, it is not in your control that event, so you shouldn’t bee affected, just accept that is part of nature. Nature is nature and is not bound to our will. That reminds me too Buddhism too.

I like this philosophical approach, it matches well with me. In the world where we live with so much attention to the outside, it is good to get back to basics. We should be happy/content how we are, if we dont hurt anybody/anything and we life in harmony with nature. Suffering is part of life and we shouldn’t sell our freedom to external factors. Somebody insult you? Somebody has done something bad to you? Things are not going according your wishes? These are timeless sources of suffering and we had people already talking about this and providing guidance for a couple of millenniums. And I think we haven’t learned much apart from taking the wrong approach: take this pill, buy this, be like that person, etc etc

There is so much we can do for ourselves by ourselves. Why schools don’t teach more philosophical thinking? When I was in high school we had a subject about Philosophy that was mandatory to get access to University. But at the end of the day, they prepare you to pass an exam. Not to learn. How important is a good teacher…

Keep hungry, keep learning, keep applying, repeat.

Depression Economics

I finished reading this book from Paul Krugman. I have really enjoyed it. It is short book and got me hooked. And it is much more easier to read the Keynes book… that was proper hardcore. He explains the crisis we have seen in XIX and XX in a way that you dont need to be economist.

It is really interesting the connections of the economic crisis globally and how complex it is getting everything. It seems the only power that the governments have is print money and play with the interest rates. And it is clear that there is no a perfect system and we will carry on seeing crisis like this. There were some big figures in the economic world that said there will not be more macro economical crisis anymore. And it is funny how the IFM hasn’t followed the practices to improve economies from countries in crisis, they have made things worse.

The baby setting Co-Op is a great example that is used in several parts of the book so explain the type of crisis in that scenario. Really useful.

And seems he is honest, he doesnt have the explanations for all crisis. For example for the Asia crisis of the late 90s, he uses the psychological concept that investors put all countries is Asia in the same basket and treated some countries with stronger economies like weakest one.

And Keynes is mentioned several times. It is clear he was great (although I didnt understand much from his book).

It is clear that things that behave like a bank and they are not bank, they should play by the same rules to protect consumers and avoid crisis like the 2008.

And how important is the confidence. Even well run banks can go down extremely easy when there is a “run on the bank” (people want to take the money out of the bank). It is like a domino effect.

As in Mandelbrot book, it is impossible to foresee the economy long run… And Keynes says that in the long run we are dead.

Enjoy the moment.

MPLS Segment Routing – Arista Lab

We have been able to create some nice MPLS labs using GNS3 and Cisco IOS. In my current employer, we use Arista so I wanted to create a lab environment with Arista kit to simulate a MPLS Segment Routing network. Keeping in mind that I try to run everything on my laptop, using GNS3 + Arista is not an option. You need to use the Arista vEOS image in GNS3 and it demands 2GB RAM per device and 1 CPU. In the past, I think I just managed to start two vEOS VMs before my laptop gave up. But Arista offers a version of EOS for containers.

So, what’s the difference between a virtual machine (VM) and a container? Well, searching the internet is going to give you many all answers. In my very simplify way:

  • VM: needs an hypervisor to simulate hardware. It uses kernel and user space. It has a full OS. So it is like simulation a whole server/pc (imagine a standalone house)
  • Container: runs in user space. Set of processes that are isolated from the rest of the system. Containers provide a way to virtualize an OS so that multiple workloads can run on a single OS instance (imagine an apartment in a building)

You just need to register in Arista web page to download a cEOS image.

Regarding MPLS Segment Routing (or SPRING for Juniper) it is an evolution of the standard MPLS, that was originally developed to improve the routing performance in core networks: avoid to make a routing look-ups per packet in core devices was very expensive in 80/90s (my very simplify way). MPLS started to being deployed around end 90s and became a defacto technology in all service providers. More info here.

Segment Routing is still based in labels, but adds improvements as it doesnt need a protocol for label exchange (one less thing to worry about). As well, it is based in “source routing” as the sources chooses the path and encodes it in the packet.

There are many sources in the internet that can explain MPLS SR better than me like all these:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/seg_routing/configuration/xe-3s/segrt-xe-3s-book/intro-seg-routing.html

https://www.segment-routing.net/tutorials/

As we are going to use Arista, I based my learning in these presentations:

https://ripe77.ripe.net/presentations/16-20181015-SegmentRouting.pdf

https://www.netnod.se/sites/default/files/2018-03/Peter%20Lundqvist_Arista_8.pdf

And reading more Arista docs.

All the code and how to build the lab is here:

https://github.com/thomarite/ceos-testing

So what we need and what we are going to use in this lab:

  • IPv4 (yeah, I should start working in IPv6…)
  • IGP: we use ISIS
  • Label Distribution: ISIS-SR
  • BGP: using loobacks as best practices and using IGP for building a full-mesh
  • L3/2VPN: EVPN
  • All devices are PE

So let’s build the basic IP connectivity for r01:

!
hostname r01
!
interface Ethernet1
no switchport
ip address 10.0.10.1/30
!
interface Ethernet2
no switchport
ip address 10.0.12.1/30
!
interface Loopback1
description CORE Loopback
ip address 10.0.0.1/32
!
ip routing
!

Now let’s build our IGP with ISIS. We are going to use our Lo1 IP as network ID for each router. As well, we will keep it simple and define all routers as ISIS L2. We dont need anything fancy. We just want ISIS to build our iBGP peering. We will enable ISIS in the core interfaces (in this simple lab, all links and loopbacks)

!
router isis CORE
net 49.0000.0001.0010.0000.0000.0001.00  <-- BASED IN Lo1 !!!
is-type level-2
log-adjacency-changes
set-overload-bit on-startup wait-for-bgp timeout 180
!
interface Ethernet1
no switchport
ip address 10.0.10.1/30
isis enable CORE
isis metric 40
isis network point-to-point
!
interface Ethernet2
no switchport
ip address 10.0.12.1/30
isis enable CORE
isis metric 50
isis network point-to-point
!
interface Loopback1
description CORE Loopback
ip address 10.0.0.1/32
isis enable CORE
isis metric 1
!

It is seems there is a bug in the cEOS I am using as “show isis neighbors” fails but the routing is actually correct. Let’s see from r22:

r22#show ip route
VRF: default
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked
Gateway of last resort is not set
I L2 10.0.0.1/32 [115/131] via 10.0.10.9, Ethernet1
I L2 10.0.0.2/32 [115/91] via 10.0.10.9, Ethernet1
I L2 10.0.0.3/32 [115/91] via 10.0.23.1, Ethernet2
I L2 10.0.0.4/32 [115/51] via 10.0.23.1, Ethernet2
I L2 10.0.0.5/32 [115/41] via 10.0.10.9, Ethernet1
C 10.0.0.6/32 is directly connected, Loopback1
I L2 10.0.10.0/30 [115/130] via 10.0.10.9, Ethernet1
I L2 10.0.10.4/30 [115/90] via 10.0.23.1, Ethernet2
C 10.0.10.8/30 is directly connected, Ethernet1
I L2 10.0.12.0/30 [115/140] via 10.0.23.1, Ethernet2
I L2 10.0.13.0/30 [115/90] via 10.0.10.9, Ethernet1
C 10.0.23.0/30 is directly connected, Ethernet2
r22#
r22# show logging
...
Log Buffer:
May 24 16:18:22 r22 SuperServer: %SYS-5-SYSTEM_RESTARTED: System restarted
May 24 16:24:29 r22 ConfigAgent: %SYS-5-CONFIG_E: Enter configuration mode from console by root on vty4 (UnknownIpAddr)
May 24 16:24:29 r22 ConfigAgent: %SYS-5-CONFIG_I: Configured from console by root on vty4 (UnknownIpAddr)
May 24 16:24:29 r22 ConfigAgent: %SYS-5-CONFIG_STARTUP: Startup config saved from system:/running-config by root on vty4 (UnknownIpAddr).
May 24 16:24:39 r22 Isis: %ISIS-4-ISIS_ADJCHG: L2 Neighbor State Change for SystemID 0000.0000.0004 on eth2 to UP
May 24 16:24:42 r22 Isis: %ISIS-4-ISIS_ADJCHG: L2 Neighbor State Change for SystemID 0000.0000.0005 on eth1 to UP
May 24 16:26:34 r22 ConfigAgent: %SYS-5-CONFIG_STARTUP: Startup config saved from system:/running-config by root on vty4 (UnknownIpAddr).
r22#
r22#show isis neighbors
% Internal error
% To see the details of this error, run the command 'show error 2'

Let’s build BGP, from r01 is like this:

!
router bgp 100
router-id 10.0.0.1
graceful-restart restart-time 300
graceful-restart
maximum-paths 2
neighbor AS100-CORE peer group
neighbor AS100-CORE remote-as 100
neighbor AS100-CORE next-hop-self
neighbor AS100-CORE update-source Loopback1
neighbor AS100-CORE timers 2 6
neighbor AS100-CORE additional-paths receive
neighbor AS100-CORE additional-paths send any
neighbor AS100-CORE password 7 Nmg+xbfVkywN7BBIllK5yw==
neighbor AS100-CORE send-community standard extended
neighbor AS100-CORE maximum-routes 0
neighbor 10.0.0.2 peer group AS100-CORE
neighbor 10.0.0.2 description R02
neighbor 10.0.0.3 peer group AS100-CORE
neighbor 10.0.0.3 description R11
neighbor 10.0.0.4 peer group AS100-CORE
neighbor 10.0.0.4 description R12
neighbor 10.0.0.5 peer group AS100-CORE
neighbor 10.0.0.5 description R21
neighbor 10.0.0.6 peer group AS100-CORE
neighbor 10.0.0.6 description R22
!

So once we have configured BGP in all routers, we should see a full mesh between all routers. This is from r22:

r22#show ip bgp summary
BGP summary information for VRF default
Router identifier 10.0.0.6, local AS number 100
Neighbor Status Codes: m - Under maintenance
Description Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
R01 10.0.0.1 4 100 7 7 0 0 00:00:05 Estab 0 0
R02 10.0.0.2 4 100 7 7 0 0 00:00:05 Estab 0 0
R11 10.0.0.3 4 100 7 7 0 0 00:00:05 Estab 0 0
R12 10.0.0.4 4 100 6 7 0 0 00:00:04 Estab 0 0
R21 10.0.0.5 4 100 6 7 0 0 00:00:04 Estab 0 0
r22#

Now, enable MPLS and SR extension in ISIS:

!
mpls ip
!
mpls label range isis-sr 800000 65536
!
router isis CORE
  segment-routing mpls
    router-id 10.0.0.1  <-- based on Lo1 in each router
    no shutdown
!
interface Loopback1
  description CORE Loopback
  node-segment ipv4 index 1  <-- this has to be different in each node!!!
!

And you should see 5 ISIS-SR tunnels from each router. From r22:

r22#show isis segment-routing tunnel
Index Endpoint Nexthop Interface Labels TI-LFA
tunnel index

1 10.0.0.2/32 10.0.10.9 Ethernet1 [ 800002 ] -
2 10.0.0.3/32 10.0.23.1 Ethernet2 [ 800003 ] -
3 10.0.0.4/32 10.0.23.1 Ethernet2 [ 3 ] -
4 10.0.0.5/32 10.0.10.9 Ethernet1 [ 3 ] -
5 10.0.0.1/32 10.0.10.9 Ethernet1 [ 800001 ] -
r22#

As you can see above, the labels are based on the base index (800000) defined in the “mpls label range” command and the “node-segment index” defined in the loopback interface. So the label that identifies uniquely r01 is 800000 + 1 = 800001. The label “3” means you are a Penultime-Hop-P router and you remove the label to save a label look-up in the egress router.

Now, let’s configure EVPN for L2/L3VPN deployment in our MPLS network. From r01 should be:

!
service routing protocols model multi-agent --> you will have to reboot
!
router bgp 100
!
address-family evpn
neighbor default encapsulation mpls next-hop-self source-interface Loopback1
neighbor 10.0.0.2 activate
neighbor 10.0.0.3 activate
neighbor 10.0.0.4 activate
neighbor 10.0.0.5 activate
neighbor 10.0.0.6 activate
!

So once this is configured in all routers, we should see again a full mesh of EVPN BGP peers. From r12 this time:

r12#show bgp evpn summary
BGP summary information for VRF default
Router identifier 10.0.0.4, local AS number 100
Neighbor Status Codes: m - Under maintenance
Description Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
R01 10.0.0.1 4 100 1254 1251 0 0 00:03:27 Estab 1 1
R02 10.0.0.2 4 100 1111 1107 0 0 00:03:27 Estab 1 1
R11 10.0.0.3 4 100 961 962 0 0 00:03:27 Estab 1 1
R21 10.0.0.5 4 100 884 888 0 0 00:03:27 Estab 1 1
R22 10.0.0.6 4 100 814 811 0 0 00:03:27 Estab 1 1
r12#

Now, let’s create a L3VPN with CUST-A vrf. We define it in all routers. For r01 should be:

!
vrf instance CUST-A
rd 100:1
!
interface Loopback2
vrf CUST-A
ip address 192.168.0.1/32   <-- each device has a unique one
!
ip routing vrf CUST-A
!
router bgp 100
!
vrf CUST-A
rd 100:1
route-target import evpn 100:1
route-target export evpn 100:1
network 192.168.0.1/32

Let’s see if the routing works from r12

r12#
r12#show bgp evpn
BGP routing table information for VRF default
Router identifier 10.0.0.4, local AS number 100
Route status codes: s - suppressed, * - valid, > - active, # - not installed, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric LocPref Weight Path 
RD: 100:1 ip-prefix 192.168.0.1/32 10.0.0.1 - 100 0 i 
RD: 100:1 ip-prefix 192.168.0.2/32 10.0.0.2 - 100 0 i 
RD: 100:1 ip-prefix 192.168.0.3/32 10.0.0.3 - 100 0 i 
RD: 100:1 ip-prefix 192.168.0.5/32 10.0.0.5 - 100 0 i 
RD: 100:1 ip-prefix 192.168.0.6/32 10.0.0.6 - 100 0 i
r12#
r12#show ip route vrf CUST-A
VRF: CUST-A
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked
Gateway of last resort is not set
B I 192.168.0.1/32 [200/0] via 10.0.0.1/32, IS-IS SR tunnel index 5, label 116384
via 10.0.10.5, Ethernet1, label 800001
B I 192.168.0.2/32 [200/0] via 10.0.0.2/32, IS-IS SR tunnel index 2, label 116384
via 10.0.10.5, Ethernet1, label 800002
B I 192.168.0.3/32 [200/0] via 10.0.0.3/32, IS-IS SR tunnel index 3, label 100000
via 10.0.10.5, Ethernet1, label imp-null(3)
C 192.168.0.4/32 is directly connected, Loopback2
B I 192.168.0.5/32 [200/0] via 10.0.0.5/32, IS-IS SR tunnel index 4, label 116384
via 10.0.23.2, Ethernet2, label 800005
B I 192.168.0.6/32 [200/0] via 10.0.0.6/32, IS-IS SR tunnel index 1, label 116384
via 10.0.23.2, Ethernet2, label imp-null(3)
r12#

So, all looks good. EVPN table shows all the prefixes for rd 100:1 and the routing table for CUST-A shows all Lo2 defined in each router.

BTW, I am not able to ping inside the VRF, I think it is something related to the broadcast of ARP:

UPDATE: Arista confirms that cEOS-lab doesn’t support MPLS dataplane. I need to use vEOS (vagrant). So that means I dont think my laptop has enough resources to build this lab in vEOS 🙁

r01#ping vrf CUST-A ip 192.168.0.6 interface loopback 2
PING 192.168.0.6 (192.168.0.6) from 192.168.0.1 lo2: 72(100) bytes of data.
--- 192.168.0.6 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 40ms
r01#

-- from other session in r01 --

r01#bash
bash-4.2# ip netns exec ns-CUST-A tcpdump -i lo2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo2, link-type EN10MB (Ethernet), capture size 262144 bytes
^C12:46:03.324918 02:00:00:00:00:00 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.6 tell 192.168.0.1, length 28
12:46:04.348750 02:00:00:00:00:00 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.6 tell 192.168.0.1, length 28
12:46:05.376723 02:00:00:00:00:00 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.6 tell 192.168.0.1, length 28
3 packets captured
3 packets received by filter
0 packets dropped by kernel
bash-4.2#

-- from other session in r22, we dont see anything --

r22#bash
bash-4.2# ip netns exec ns-CUST-A tcpdump -i lo2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo2, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
bash-4.2#

New Approach for Datacenter Networks and Stacks for Low Latency

In a irc channel this week, one guy posted a link about visualization latency in a data center switching network .

And it was really good video for understanding how congestion happens inside the switch infrastructure and a very original idea to overcome this problem!

I tried to get a bit more info about the video and ended in the page of that paper:

http://www.cs.ucl.ac.uk/news/article/sigcomm_best_paper_award_for_mark_handley/

And see if there was any implementation:

https://github.com/nets-cs-pub-ro/NDP/wiki

I am not a researcher but the idea is quite original and it seems you dont need to re-invent the wheel. In the github repo even there is an example in P4. P4 is going to be big, and Barefoot has already commercial solutions about it with their tofino chip. Let’s see what Intel does with it…

Based on a continuation paper, it seems there is no much traction from the big cloud providers, and it surprises me, these guys have the muscle to make this kind of things. I always heard that hardware is very expensive to built and software is not. So there are few player willing to invest in new ideas. Everytime you hear about unicorn companies, nearly all of them are software companies.

And another paper says it needs more tuning/debugging.

I don’t know if it will successful in the future but I think it was interesting watching the video and reading about the concept.