Lemon Pudding Souffle

Lemon zest garnish ingredients:

  • Skin of 1/2 lemon, cut into julienne (long, narrow strips)
  • 50ml water
  • 50g caster sugar
  • Juice of 1/2 lemon

Process:

  • Mix the water and sugar in a pan and bring to the boil.
  • Add the lemon zest and stir for 10 seconds while the water is boiling.
  • Add the lemon juice and keep stirring for another 10 seconds.
  • Check that the mix is starting to thicken, then remove from the heat.
  • Let it cool down for later. It should be like a syrup and solidify while cooling. You can heat it in the microwave for 10-15 seconds to make it liquid again.

Lemon Pudding souffle ingredients:

  • 125ml milk
  • 35g butter
  • 70g caster sugar
  • 70g plain flour
  • 2 egg yolks
  • 4 egg whites
  • 1/2 tsp vanilla paste
  • Zest of 1/2 lemon
  • butter and sugar for greasing

Process:

  • Pre-heat the oven to 180C.
  • Grease 4 dariole moulds with solid butter and sugar. Be sure there are no shiny parts! Cut a small square of baking paper and put it in the bottom of each mould.
  • Make a “beurre manié”, that is, mix the butter and flour by hand in a bowl.
  • Scald the milk with the vanilla (just bring it to boiling point). Then, over medium heat, add pieces of the “beurre manié” and whisk non-stop into the milk. It is like making a béchamel sauce. Once the whisk is no longer useful, switch to a wooden spoon and keep mixing.
  • At the end you will have a kind of ball. Keep cooking for 1-2 minutes to be sure the flour is properly cooked.
  • Remove from the heat and let it cool down.
  • Add the 2 egg yolks to the “ball”, one at a time. It should become like a thick cream.
  • Whisk the egg whites to stiff peaks. Then add the sugar and lemon zest and whisk again to stiff peaks. This is the meringue.
  • Fold 1/3 of the meringue into the pudding batter. Repeat, 1/3 at a time.
  • Once everything is mixed, pour the batter into the moulds until they are 3/4 full.
  • Put the moulds in a deep tray and fill it with water (hot if possible). Be careful the moulds don't float!
  • Bake for 20-25 minutes until golden on top and risen.
  • Prepare the egg custard while waiting.

Fresh egg custard sauce ingredients:

  • 75ml double cream
  • 75ml milk
  • 35g egg yolk
  • 25g caster sugar
  • 1/2 tbs vanilla paste

Process:

  • In a glass, mix the egg yolks and sugar.
  • Mix the double cream, milk and vanilla in a saucepan. Add a bit of the sugar and bring to the boil.
  • Add the egg mix to the liquid and stir with a wooden spoon.
  • Keep stirring until the liquid thickens a bit. Don't overdo it! If you run a finger across the back of the spoon when it is coated with custard, the line should stay clear.
  • Remove from the heat and pour into a container. Cover with cling film.

Presentation:

  • Once the puddings are ready, remove them from the moulds. Be sure to take off the paper too!
  • Put each pudding in the middle of a dish.
  • Pour the custard around the pudding.
  • Put a spoonful of the lemon syrup on top of the pudding (warm it in the microwave if it has solidified).

This is it!

It was actually quite nice. Spongy and lemony.

Dundee Cake

This is a typical Scottish cake that I learned to make today via an e-course.

Ingredients

  • 180g softened butter
  • 90g caster sugar
  • 90g dark brown sugar
  • 115g plain flour
  • 110g self-raising flour
  • 1/2 tsp baking powder
  • 300g dry mixed fruit
  • 1 tbsp of fresh grated ginger
  • 1 orange: juice + zest
  • coconut flakes for decoration
  • 50ml Scotch whiskey (optional)
  • 1 tsp grated nutmeg
  • 1 tsp ground ginger
  • 2 tsp ground cinnamon
  • optional: whiskey butter: 75g softened butter + 75g icing sugar + 1 tbsp whiskey

Process:

  • Pre-heat the oven to 150C. Grease and line a baking tin of approx. 20x30cm.
  • Soak the dried fruit in the whiskey, orange juice, orange zest and fresh grated ginger.
  • In a bowl, cream the butter and sugar until light and fluffy.
  • Add the eggs one by one to the butter mix. Keep mixing.
  • Sift the flour, spices and baking powder over the butter mix. Fold everything together with a wooden spoon or spatula.
  • Drain the soaked fruit and reserve the liquid for later! Fold the fruit into the mixture.
  • Pour the mixture into the tin and smooth the surface.
  • Bake at 150C for approx. 1h 20-30m, until golden and springy to the touch.
  • While baking, prepare the whiskey butter. Whip the butter and icing sugar. Add the tbsp of whiskey. Whip and done.
  • Once the cake is out of the oven, immediately brush the top with the reserved liquid (add one tsp of sugar) and then add the coconut flakes.
  • Leave it to cool down.

It tastes a bit like a Christmas pudding but lighter. I liked it!

Calling Bullshit

This is an interesting book about the flood of data we have to wade through and how difficult it is to figure out what is true. I have felt this many times: you read something “scientific” with lots of numbers, stats, etc. and you kind of believe it has to be true. The same goes for those amazing new pharmaceutical drugs or the latest paper with a dramatic breakthrough.

Interesting points:

With the hype about machine learning, the algorithm itself may be beyond our understanding, but the critical thing is the training data fed into it. GIGO = Garbage In, Garbage Out. If the training data is biased or not relevant, imagine what the result is going to be.

Correlation is not causation. This is a difficult topic because we very easily see causation everywhere, or find one that matches our theory.

Goodhart’s law adapted to normal people: “When a measure becomes a target, it ceases to be a good measure”. That’s so true. Think of your performance review at work, the GPU tests, etc.

Regarding the stats, it is important to pay attention to the axes (do they start at zero? do they use the same proportions/scales?), to be wary of 3D charts, and to watch for “ducks”: decoration that obscures the meaningful data.

If something seems too good (or too bad) to be true, it probably isn't true.

“Mathiness”: formulas and expressions that look like “good” maths but lack logical coherence and formal rigour. This is very typical for things that are not easy to quantify (e.g. healthcare quality management): how are things measured? In what units? etc.

One of my favourite examples is the paper about running fMRI on the brain of a dead (!) salmon while showing it pictures of people in different moods. It was important in showing that fMRI images may not be as reliable as you would expect. I assume that has improved nowadays…

Prosecutor’s fallacy: You need to prove your client is innocent although there is a DNA match in a database. The error rate is 1 in 10,000,000.

           Match   No Match
Guilty     1       0
Innocent   5       50,000,000

You are the defence lawyer and you want to focus on the left column (Match). It shows that there are 5 chances out of 6 (5+1) that your client is innocent despite having a DNA match.
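The arithmetic behind the 5-out-of-6 figure can be checked with a short Python sketch. It assumes, as the table suggests, roughly 50,000,000 innocent people in the database and exactly one guilty person who matches. The function name is made up for illustration.

```python
from fractions import Fraction

def prob_innocent_given_match(innocent_people, false_match_rate, guilty_matches=1):
    """Share of all DNA matches that belong to innocent people."""
    # Expected number of innocent people who match purely by chance
    false_matches = innocent_people * false_match_rate
    return false_matches / (false_matches + guilty_matches)

# Assumed numbers taken from the table: 50M innocent people, 1-in-10M error rate
p = prob_innocent_given_match(50_000_000, Fraction(1, 10_000_000))
print(p)  # 5/6
```

So a "1 in 10,000,000" error rate does not mean a 1 in 10,000,000 chance of innocence: the size of the database matters just as much.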

p-values: null hypothesis and alternative hypothesis. Most papers rely on a p-value threshold of 0.05 (and here Goodhart’s law kicks in).

Refuting Bullshit:

  • Use “reductio ad absurdum”
  • Be memorable (dead salmon example)
  • Find counterexamples (immune system theory vs trees)
  • Provide analogies (74M$ -> 2sec faster)
  • Redraw figures
  • Deploy a null model

I have left out a lot of things that I don't remember, but the book is worth reading (and more than once).

In summary, the goal is to be a “smart” sceptic and not believe everything thrown at us.

Chelsea and Swiss Buns

This recipe makes:

2x chelsea buns tins of 22cm diameter

6x swiss buns

Ingredients for the dough:

  • 850g strong white flour, plus extra for dusting
  • 25g dry yeast
  • 100g caster sugar
  • 475ml warm milk
  • 2 eggs, beaten
  • 100g unsalted butter, melted, plus extra for greasing
  • oil for greasing
  • 1 tsp sea salt

Filling ingredients:

  • 25g melted butter
  • 100g currants, chopped
  • 50g sultanas, chopped
  • 2 tsp mixed spice
  • 25g caster sugar

Glaze:

  • 200ml water
  • 200g caster sugar

Process:

  • Mix together the flour, yeast, sugar and salt. Make a well in the centre.
  • Warm the milk to body temperature.
  • Pour the milk, melted butter and beaten eggs into the flour. Mix everything together and then knead for 10 minutes until you have a smooth and elastic dough. It will be a bit sticky at the very beginning, but don't add any flour.
  • Shape the dough into a ball, add a bit of oil to the bowl, put the dough in, cover with cling film and let it prove until doubled in size (1h?).
  • Pre-heat the oven to 180C.
  • Prepare a tray with butter and a coating of flour so it is non-stick.
  • Prepare the Chelsea bun tins. Brush with a layer of melted butter, then add a piece of baking paper of the same diameter to cover the bottom. Brush again with a bit of melted butter. Try to cut another piece of baking paper for the side of the tin. Add a bit of butter again.

Swiss Buns:

  • Once the dough has doubled in size, put it on a lightly floured surface and knock the air out.
  • Cut 6 pieces of approx. 80g each. Cover the pieces and the remaining dough with the bowl and a towel so they don't dry out.
  • For each piece, make a ball, flatten it with your palm, apply pressure and make circles with your hand, then release the pressure gradually until you have a ball again.
  • Then roll it as if it were a “cigar”. Put it on the tray. Leave around 3cm between each “cigar”.
  • Prove again until doubled in size (25-30 min). Meanwhile, move on to the Chelsea Buns.
  • Then bake for approx. 6 minutes or until light golden brown.
  • Let them cool down before adding the sugar coating.
  • Optional: make white fondant icing. For each Swiss bun, dip only the top, then lift it, let the fondant drip off one side and wipe the drip away with a finger without leaving a finger mark.

Chelsea Buns:

  • Using a rolling pin, roll out the left-over dough into a big rectangle, at least 42x25cm.
  • With a brush, cover the dough with the melted butter.
  • Mix the sugar and spice, then spread it over the dough.
  • Spread the chopped currants and sultanas evenly.
  • With the rolling pin, press the dried fruit in a little.
  • Roll up the dough, starting from the “long” side. Be sure it is tight but don't tear it.
  • With a sharp knife, trim both ends until the roll is even. Then cut 12-14 pieces from the roll.
  • Put one piece in the middle of the tin, then arrange another 5 or 6 pieces around it as if it were a flower. Be sure the end of each piece points to the middle.
  • Leave the tins to prove again, approx. 25 minutes. The buns should grow until there is no space between the pieces and rise a bit too.
  • Bake at 180C until the top is golden brown, approx. 15-20 minutes.
  • Glaze with the sugar syrup as soon as the tins come out of the oven. Remove the buns from the tin!

This is the result!

Other Minds

This is a book recommended by a good friend. He had watched some documentaries about octopuses and was amazed. So I was curious about it and gave it a go.

The book is not just about octopuses and cuttlefish but about intelligence, based on the evolution of nervous systems. It seems the octopus developed its nervous system in a different way from mammals. And even within cephalopods it seems to have evolved in more than one way.

Another thing that surprised me is that the life span of an octopus is around 2 years! There is quite an interesting part of the book about ageing. Why are there organisms like sequoias that can live over 3000 years, while octopuses with a very advanced nervous system only last 2? I need to re-read it. As I understand it, this is related to evolution: we reap the benefits quickly, but there are always drawbacks that turn up later.

There were parts where I didn't engage enough, but I think it was worth it just for the two points above.

API scripts: CML and Vsphere

In the last few months I have been trying to put the knowledge from CCNA DevNet into practice and managed to write some very basic scripts using APIs.


CML is a simulation solution from Cisco that actually works (you have to pay and need hardware). There are nice docs out there:

CML sandbox

CML Starting guide + API examples (this saved me)

CML simulation lifecycle example (havent tried yet)

CML: I actually struggled with this one regarding how to authenticate… After checking some pages I worked it out. I was trying to use the theory from DevNet but no joy. The idea is to restore a lab after you have made many changes. The script logs into CML, deletes the lab and restores it from a provided backup.
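A minimal sketch of that flow, not the actual script. The endpoint paths (`/api/v0/authenticate`, `/labs/{id}/stop`, `/wipe`, `/import`), the bearer-token header and the JSON content type are assumptions based on the CML 2.x REST API and should be checked against the API documentation served by your own CML instance. The function names and arguments are hypothetical.

```python
import json
import ssl
import urllib.request

API = "/api/v0"  # assumed base path of the CML 2.x REST API

def restore_steps(lab_id):
    """Ordered (method, path) calls to remove a lab and import a backup."""
    return [
        ("PUT", f"{API}/labs/{lab_id}/stop"),   # assumed: a lab must be stopped...
        ("PUT", f"{API}/labs/{lab_id}/wipe"),   # ...and wiped before deletion
        ("DELETE", f"{API}/labs/{lab_id}"),
        ("POST", f"{API}/import"),              # body: the backup topology
    ]

def restore_lab(host, username, password, lab_id, backup_yaml):
    """Log in, delete the lab and re-import it from a backup string."""
    ctx = ssl.create_default_context()  # self-signed lab certs need extra handling

    def call(method, path, body=None, token=None):
        req = urllib.request.Request(f"https://{host}{path}", data=body, method=method)
        req.add_header("Content-Type", "application/json")  # assumption
        if token:
            req.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(req, context=ctx) as resp:
            raw = resp.read()
        return json.loads(raw) if raw else None

    creds = json.dumps({"username": username, "password": password}).encode()
    token = call("POST", f"{API}/authenticate", creds)  # assumed to return the token
    for method, path in restore_steps(lab_id):
        body = backup_yaml.encode() if method == "POST" else None
        call(method, path, body, token)
```

Keeping the list of steps separate from the HTTP plumbing makes it easy to print and review the plan before pointing it at a real server.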


Some docs for vsphere api:

Vsphere create session: (quite useful)

vcenter REST API authentication

vcenter API with Postman

Vsphere postman examples (need to take a look)

Vsphere: This is even simpler… at least with this one I didn't struggle with the auth, as the documentation was quite clear. But the original goal of restoring a VM from a snapshot does not appear to be available via the API.

OpenAI / ChatGPT

I guess this is a bit old news already, but I had wanted to test it for some weeks. I have asked just one question so far (and it failed a couple of times until I got an answer).

And my very first question was to list the most famous quotes from Aristotle, with references to his actual writings. Why? Because you read many quotes attributed to famous people that are not actually genuine, and it is very hard to track them down if you don't read all the books by that person. And to be honest, philosophy is not easy to digest.

Here are a few famous quotes from Aristotle that reference his actual writings:

"Happiness is the meaning and the purpose of life, the whole aim and end of human existence." - Nicomachean Ethics

"It is the mark of an educated mind to be able to entertain a thought without accepting it." - Metaphysics

"We are what we repeatedly do. Excellence, then, is not an act, but a habit." - Nicomachean Ethics

"The whole is greater than the sum of its parts." - Metaphysics

"Nature does nothing in vain." - Politics

Please note that these quotes should not be taken as direct verbatim from Aristotle's work but rather as a representation of his thoughts.

Apart from that, I wonder what the infrastructure of this company looks like. Is it on-premise or cloud? What is their DC network design? TCP? Checking one of the open positions, it seems they have a pretty big cluster, although they mention cloud platforms. And the company has some kind of agreement with MS (so Azure is going to be used).

Donuts

Dough Ingredients (12 units):

  • 660g strong flour
  • 75g sugar
  • 5g salt
  • 60g unsalted butter
  • 30g liquid egg
  • 20g dry yeast
  • 150g warm water
  • 150g milk

Filling/Coating

  • egg wash
  • 100ml boiling water + 100g sugar + cinnamon stick + star anise
  • jam
  • double cream: whipped + vanilla essence or 1tsp of coffee.

Process

  • Warm the water and milk to body temperature. Then add the yeast and mix.
  • In a bowl, mix the flour, salt and sugar. Then rub in the butter until the mix looks like breadcrumbs.
  • Add egg to the milk and mix.
  • Add liquid mix to the flour.
  • Mix all together until forming a ball.
  • Knead on the table for approx. 7-10 minutes until you have a smooth ball. When you poke it, it should spring back slowly.
  • Put the dough in the bowl with a bit of oil, cover with cling film and let it prove until doubled in size. This is the most difficult thing to do at home. Try putting it in a switched-on oven with a bit of boiling water in a small pan to simulate a proving machine (40C?). In a proving machine it takes 1h.
  • On a lightly floured surface, tip out your dough and roll it into a long log. Use a scale and cut 12 pieces. Each should be approx. 92g.
  • With each piece, make a ball: with the palm of one hand, press it against the table and start rolling, releasing the pressure until a ball forms. Put each ball on a tray lined with baking paper. Keep the pieces covered while you make the balls.
  • Prove the balls again until doubled in size, approx. 1h. Try the oven again with a bit of boiling water. You can also try donuts with a hole: with a floured finger, make a hole in the middle and spin the dough. Be sure the hole is quite big in diameter, as it may close up when the dough proves.
  • Second most difficult part: deep frying. Keeping the temperature steady is critical, so you will need a temperature probe and patience. Be sure the oil is at 170C. Add two balls (max) if you have a deep fryer. If you use a saucepan, fry one ball at a time. Fry for approx. 1 minute on each side. Ideally you should have a white line around the middle. If the oil is below 170C the donuts will get soggy; if it is too hot, they will burn. So again, be patient.
  • You can also bake them in the oven, approx. 10 minutes at 180C. Just give them an egg wash and bake until golden. Use a toothpick to check they are baked inside. Then give them a sugar-water coating immediately after taking them out of the oven.
  • For the deep-fried donuts, let them cool down. Then coat them with a mix of sugar and cinnamon.
  • Now you can add the fillings. With a knife, make a hole in the white line and pipe in your filling. For the oven ones, cut around 3/4 of the way through the middle and pipe in your filling.

This is my outcome:

I filled some with coffee cream, some with jam and some with plain cream.

They were really nice. I didn't expect this result. Again, the critical parts are the proving and deep frying at the correct temperature.

Building DCs with VXLAN BGP EVPN

VXLAN/EVPN is a technology that I have been trying to understand in more detail and depth since I started my current job. All my networking theory/knowledge comes from books, so this one is a good base. Keep in mind that it is a bit “old” as it was released in 2017. In the last few months I have built my confidence with VXLAN/EVPN via some issues and testing designs (Arista EVPN L3 Gateway).

As I used to do in the past, I made notes on the book and I will put them here too, so they serve as a good refresher.

1 INTRO

STP for DC issues:

  • Convergence: tree recalculation
  • Unused links: follow tree…
  • Suboptimal forwarding: follow tree…
  • No ECMP
  • Traffic storm (no TTL in L2)
  • Scale: only 4k vlans (12 bits tag)

Leaf/Spine improvements as per above (Clos Network):

  • Scalability
  • Simple
  • Resilience
  • Efficiency
  • No oversubscription
  • ECMP
  • Deterministic latency
  • scale out -> + leaves // scale up (+bw) -> + spines

BUM = Broadcast, Unknown Unicast, Multicast

Fabric Path = MAC in MAC (a technology that predates VXLAN). Proprietary to Cisco

VXLAN = Standard, MAC in IP/UDP, VNI = 24 bits -> 16M VLANs! Flood & Learn (F&L): each network has its L2VNI + multicast group (control-plane). BGP EVPN doesnt need F&L, so better control-plane

Border Leaf or Border Spine = for external connectivity.

Route-Reflector (RR) or RP (multicast) in Spine

VXLAN – dataplane / EVPN – control-plane

2 BASICS

In DC, most traffic is east-west.

Limit vlan 12 bits (4k) + multi-tenancy? -> overlay: (indirection) abstraction of existing network tech + extend classic network capabilities. (David Wheeler: problem -> indirection)

Underlay -> increase MTU! (overlay overhead)

Handle BUM in underlay? Multicast (PIM – VNI mapped to multicast group = dst IP outer header) or Ingress Replication (head-end replication)

VTEP = Edge device, encap/decap, build overlay

VNI – Virtual Network ID

VXLAN header: original inner 802.1q header of l2 frame is removed and mapped to a VNI, to complete vxlan header. UDP dst port = 4789, src port = based on inner header
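To make the header layout concrete, here is a small sketch that packs the 8-byte VXLAN header described in RFC 7348: 8 flag bits, 24 reserved bits, the 24-bit VNI and 8 more reserved bits. The function names and the example VNI 30001 are only for illustration.

```python
import struct

VXLAN_UDP_PORT = 4789   # destination port of the outer UDP header
FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid

def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits
    # Word 2: VNI (24 bits) + 8 reserved bits
    return struct.pack("!II", FLAG_VNI_VALID << 24, vni << 8)

def unpack_vni(header):
    """Read the VNI back out of a VXLAN header."""
    _, word2 = struct.unpack("!II", header)
    return word2 >> 8

hdr = pack_vxlan_header(30001)
print(hdr.hex())        # 0800000000753100
print(unpack_vni(hdr))  # 30001
```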

overhead = 50 bytes (14 (l2) +20 (l3) + 8 (l4) + 8 (vxlan) = 50). 54 bytes if optional 802.1q tag (4bytes) is added.
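The same sum as a quick Python check. The header sizes are the ones listed above (IPv4 outer header without options); the helper names are made up, and the MTU helper simply assumes that the underlay has to carry the host MTU plus the full overhead.

```python
# Bytes added in front of the original L2 frame by VXLAN encapsulation
OUTER_HEADERS = {
    "outer Ethernet": 14,
    "outer IPv4": 20,
    "outer UDP": 8,
    "VXLAN": 8,
}

def vxlan_overhead(outer_dot1q=False):
    """Total encapsulation overhead in bytes (+4 with an optional 802.1q tag)."""
    return sum(OUTER_HEADERS.values()) + (4 if outer_dot1q else 0)

def underlay_mtu_needed(host_mtu, outer_dot1q=False):
    """MTU the underlay must support so host frames are not fragmented."""
    return host_mtu + vxlan_overhead(outer_dot1q)

print(vxlan_overhead())           # 50
print(vxlan_overhead(True))       # 54
print(underlay_mtu_needed(1500))  # 1550
```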

ECMP: 5-tuple: src IP, dst IP, proto, src port, dst port -> but only src port changes in vxlan -> that's the entropy!
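A rough sketch of how the outer source port can carry that entropy. Real VTEPs do this hash in hardware with a vendor-specific algorithm, so the CRC32 used here is only a stand-in. The 49152-65535 range is the dynamic port range that RFC 7348 recommends for the source port.

```python
import zlib

def vxlan_source_port(src_ip, dst_ip, proto, src_port, dst_port):
    """Derive the outer UDP source port from the inner flow (illustrative hash)."""
    flow = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    # Map the hash into the dynamic port range 49152-65535 (16384 values)
    return 49152 + zlib.crc32(flow) % 16384

# Same inner flow -> same outer source port -> same ECMP path (no reordering)
port_a = vxlan_source_port("10.1.1.10", "10.1.2.20", 6, 51000, 443)
port_b = vxlan_source_port("10.1.1.10", "10.1.2.20", 6, 51000, 443)
print(port_a == port_b)  # True
```

Different inner flows between the same pair of VTEPs will usually get different source ports, which is what lets the underlay spread them over several equal-cost paths.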

F&L: Doesnt scale, no control-plane. Multicast replication has a limit. So Ingress replication (IR) . Every VTEP must be aware of other VTEPS in same VNI. Source VTEP has to replicate each packet to each VTEP.

BGP EVPN: solution for F&L. Eliminates unnecessary flooding. EVPN carries host MAC, IP, network, VRF and VTEP info. If the VTEP that detected a host doesnt send an EVPN update -> remote VTEPs dont age out the entry for that host.

IMPORTANT! Broadcast traffic (ARP, DHCP, etc) is still flooding!!!

Tenant += VRF

eBGP = implicit next-hop-self for originated

RD = 8 bytes

— type 0: 2 byte ASN + 4 byte value

— type 1: 4 byte IP + 2 byte value

— type 2: 4 byte ASN + 2 byte value
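As a sketch, the three encodings can be packed with `struct`. Following RFC 4364, the 8 bytes are a 2-byte type field followed by the 6 bytes listed above. The function name and example values are assumptions for illustration only.

```python
import ipaddress
import struct

def pack_rd(rd_type, admin, value):
    """Encode a Route Distinguisher as 8 bytes: 2-byte type + 6 bytes of value."""
    if rd_type == 0:    # 2-byte ASN : 4-byte value
        return struct.pack("!HHI", 0, admin, value)
    if rd_type == 1:    # 4-byte IPv4 address : 2-byte value
        return struct.pack("!H4sH", 1, ipaddress.IPv4Address(admin).packed, value)
    if rd_type == 2:    # 4-byte ASN : 2-byte value
        return struct.pack("!HIH", 2, admin, value)
    raise ValueError("RD type must be 0, 1 or 2")

print(pack_rd(0, 65000, 30001).hex())       # 0000fde800007531
print(pack_rd(1, "10.0.0.1", 32777).hex())  # 00010a0000018009
```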

RT: control import/export prefixes in VRF (auto derivation ASN:VNI)

VXLAN EVPN: RFC7432. Focus NVO (Network Virtualization Overlay) -> Route-Types:

— type 2: MAC/IP (host: /32 or /128). Sent once host is learnt. Info about IP is optional. Ext Community: RMAC = Router MAC = source VTEP.

— type 3: Inclusive multicast ethernet tag route -> create distribution list for IR. Generated and sent out immediately as a VNI is configured. Need ASIC support !!!

— type 5: IP prefix route (L3VNI)

“show bgp l2vpn evpn MAC” ->

[Route Type]:[Eth Segment ID]:[Eth Tag Id]:[MAC length]:[MAC]:[IP prefix]:[bit count]

bit count:

— 216: type2 only MAC

— 272: type2 IP/MAC

— 224: type5 IPv4

ExtCommunity: ENCAP:8 -> it is VXLAN

ARP request triggers an IP-MAC.

MAC learnt via BGP is not aged-out via normal process: only BGP delete message deletes the MAC

L3 learnt: depends on hw (FIB)

— HRT (Host Route Table): only for /32 or /128 (big)

— LPM (Longest Prefix Match): TCAM (small)

FIB: [Bridge Domain, RMAC] -> BD maps to L3VPN and RMAC maps to dst VTEP MAC

Type5:

— advertises first-hop-routing: prefix where VTEP is default gateway (IP anycast gateway)

— advertise prefixes from other protocols

Host detection:

ARP aging = 1500 sec -> If ARP request fails -> type2 deletes are sent. ** Even when ARP entry is deleted, MAC only type2 is still in BGP EVPN CP until MAC aging expires (1800 sec) (sent BGP withdraw)

ARP aging < MAC aging -> avoid unnecessary flooding

Host mobility: VM to send GARP (gratuitous ARP): Highest MAC mobility seq ext community => Best

3 FORWARDING

  • Handling BUM or multidestination traffic:

— MC replication in the underlay:

Use MC in UL => 1xL2VNI = 1xMC IP => problem: 2^24 VNI available -> is a stretch for MC IPs available, sw/hw limits (1000’s PIM, IGMP, etc) -> doesnt scale

How to manage VNI-MC mapping? VNI randomly assigned to MC or MC is localized for a set of VNIs.

— Ingress Replication (IR = HER = Head-End-Replication): Unicast mode. VTEP makes n-1 copies of BUM packet and send them as unicast to the n-1 VTEPs of that VNI

replication list? dynamic with BGP EVPN. type 3 (IMET). Replication list is updated when config of a L2VNI in a VTEP occurs –> Big overhead compared with MC.
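A toy model of ingress replication, assuming the type 3 (IMET) routes have already been learnt and are available as simple (VNI, VTEP IP) pairs. The function names and addresses are invented for the example.

```python
def replication_list(imet_routes, local_vtep, vni):
    """Remote VTEPs that must receive a copy of BUM traffic for this VNI."""
    return sorted({vtep for route_vni, vtep in imet_routes
                   if route_vni == vni and vtep != local_vtep})

def replicate(frame, imet_routes, local_vtep, vni):
    """Head-end replication: one unicast copy of the frame per remote VTEP."""
    return [(vtep, frame) for vtep in replication_list(imet_routes, local_vtep, vni)]

# Four VTEPs announce VNI 30001 via type 3 routes, one announces VNI 30002
routes = [(30001, "10.0.1.1"), (30001, "10.0.1.2"), (30001, "10.0.1.3"),
          (30001, "10.0.1.4"), (30002, "10.0.1.5")]
copies = replicate(b"arp-request", routes, "10.0.1.1", 30001)
print(len(copies))  # 3
```

With n = 4 VTEPs in the VNI, the source VTEP sends n-1 = 3 unicast copies, which is where the extra load compared with multicast comes from.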

  • ARP Suppression:

— Use ARP snooping. ARP request -> populates BGP EVPN CP. 1) If VTEP knows dst MAC, then it responds (this is ARP suppression). If not, using IR or MC, send ARP to all VTEPs. Egress VTEP that has the host connected receives the ARP reply, makes an EVPN type2 announcement to all VTEPs + sends the ARP reply (as unicast) to avoid any delay.

NX-OS uses MC for BUM by default = flood L2 locally and to all VTEPs in VNI.

MC group for overlay != MC group for underlay.

IGMP snooping (if supported) is an optional solution; it doesnt depend on hw, just sw.

  • Distributed IP Anycast Gateway: Implemented at each VTEP, reduces traffic transit. Anycast = one-to-the-nearest association.

Anycast GW VTEPs share the same MAC -> prevent black-holing for host-mobility (AGM = Anycast GW MAC address). Same AGM is used in all default gw IPs -> no hair-pinning.

  • Integrated Routing and Bridging (IRB)

— Asymmetric:

bridge-route-bridge at local VTEP

traffic egressing towards a remote VTEP uses a different VNI than the return traffic from the remote VTEP

requires consistent VNI config in all VTEPS

— Symmetric (NXOS):

bridge-route-route-bridge

egress and return use same L3VNI. L2VNI are not used for routing in symmetric IRB

Not all VNIs need to be configured in all VTEPS but for a VRF, L3VNI needs to be configured in all VTEPs.

Inter VRF routing -> route leaking -> external router or firewall.

  • End Point Mobility:

BGP extended community = MAC mobility seq. Higher wins. With each move, seq++

End point move triggered by (update via BGP EVPN CP)

— Reverse ARP: only advertises new MAC

— Gratuitous ARP: adverts new MAC/IP

VTEP verifies if endpoint has actually moved.
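The sequence-number rule can be sketched in a few lines. This only models the "higher sequence wins" comparison, not the full BGP best-path logic, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MacRoute:
    mac: str
    vtep: str
    seq: int = 0  # MAC mobility sequence number (extended community)

def advertise_move(current, new_vtep):
    """A VTEP that detects a moved endpoint re-advertises it with seq + 1."""
    return MacRoute(current.mac, new_vtep, current.seq + 1)

def best_route(routes):
    """Among routes for the same MAC, the highest sequence number wins."""
    return max(routes, key=lambda r: r.seq)

old = MacRoute("00:50:56:aa:bb:cc", "10.0.1.1")
new = advertise_move(old, "10.0.1.2")
print(best_route([old, new]).vtep)  # 10.0.1.2
```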

  • VPC: MCLAG + LACP: Cisco -> vPC: 2 devices: 1 peer link + 1 keepalive link.

— PIP: primary IP. individual per VPC member per VTEP

— VIP: secondary IP in nve interface. Virtual IP = anycast VTEP at VPC level. It is the next-hop used in EVPN type2/5. ** anycast VTEP != anycast gw

— orphans: blackhole if using VIP -> solution: “advertise-pip” VPC members use PIP instead of VIP for NH in originated EVPN type5 (type2 still uses VIP)

— Router MAC ext community in type2/5:

—- PIP uses switch RMAC

—- VIP uses local derived MAC based on VIP. Both VPC members derive the same MAC because they share the same VIP. As RMAC ext community is non-transitive and VIPs are unique, no issue

  • DHCP: discover, offer, request, ack. DHCP relay: configured on the default gw; the relay agent puts the default gw IP in the giaddr field of the DHCP payload. The DHCP server uses the giaddr field to find the correct scope and also uses it as the dst IP for the answer. Problem with anycast gw: all VTEPs use the same IP -> sol: each VTEP DHCP relay uses a unique, routable IP (loX). How to choose the scope? DHCP option 82.

4 UNDERLAY

  • Considerations

— Clos network = each port equidistant + consistent latency => multistage.

— MTU: vxlan -> avoid fragmentation. vxlan overhead = 50 bytes (14 outer MAC header + 20 outer IP header + 8 outer UDP header + 8 vxlan header; 4 extra if an 802.1q tag is added). Normal ethernet MTU 1500 -> Ethernet frame = 1518 (or 1522 with 802.1q); 18 = 6 MAC src + 6 MAC dst + 2 ether type + 4 FCS. With a 1500-byte underlay MTU, the overlay payload is limited to 1450. With 9000-byte jumbo frames in the overlay, the underlay needs 9050. Most network kit supports an MTU of up to 9216.

— IP Addressing: RID = lo. Use /31 or unnumbered (lo is used for RID) as much as possible. Lo0 (BGP) and Lo1 (VTEP) on IGP. Leaf = Lo0 + Lo1. Spine = only Lo0 because it is not vtep (if using multisite gateway need lo1 as it is vtep). Be sure your ip schema aggregates!!! -> reduce routing table (1x/24 all lo0, 1x/23 all p2p, etc)
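A small sketch of such an aggregating address plan using Python's `ipaddress` module. The blocks 10.0.0.0/24 (loopbacks) and 10.1.0.0/23 (point-to-point links) are made-up example values, as are the function names.

```python
import ipaddress

def p2p_links(block, count):
    """First `count` /31 subnets carved from the point-to-point block."""
    subnets = iter(ipaddress.ip_network(block).subnets(new_prefix=31))
    return [next(subnets) for _ in range(count)]

def loopbacks(block, count):
    """First `count` host addresses from the loopback block (used as /32s)."""
    hosts = iter(ipaddress.ip_network(block).hosts())
    return [next(hosts) for _ in range(count)]

# One /23 for all leaf-spine links, one /24 for all Lo0 addresses
for link in p2p_links("10.1.0.0/23", 2):
    print(link)   # 10.1.0.0/31 then 10.1.0.2/31
for lo0 in loopbacks("10.0.0.0/24", 2):
    print(lo0)    # 10.0.0.1 then 10.0.0.2
```

Because every link and loopback comes out of one parent block, the whole set can be summarised with a single prefix per block.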

  • Unicast Routing

— IGP is OSPF or BGP -> ECMP.

OSPF: use p2p type instead of broadcast -> only LSA-1 !!! low convergence time !!! and small LSDB. If ipv6 -> ospfv3 -> dual stack… two protocols!

ISIS: no IP, works on L2 (CLNS). SPF algo. TLV. NSAP addressing. IP independent.

BGP: path vector (no SPF) if eBGP -> next-hop unchanged (if spine not a vtep). underlay eBGP -> phy to phy // overlay eBGP -> lo to lo (multihop!). If eBGP “route reflector” => “retain route-target all”. If using “Two AS” design (if Spine no vtep) -> spine: ipv4 + evpn => “disable peer-as-check” // leaf: ipv4 + evpn => allowas-in

  • Multicast Routing: more efficient than unicast but needs one extra protocol

— BUM traffic: unicast mode = ingress replication in underlay // multicast mode = use multicast in underlay.

— Unicast: VTEP host to generate n-1 copies of packet. Replication of data traffic is data plane operation. VTEP-VNI membership distribution is dynamic via CP BGP EVPN or static via FnL (doesnt scale!).

— Multicast: PIM Any Source Multicast ASM (PIM SM) or PIM BiDir (depends on hw). Can’t mix PIM modes. RP in Spines!

— PIM ASM Anycast RP: in each spine. 1 IP for all spines -> load balancing. (S,G) at VTEP.

— PIM BiDIR: (*,G) at RP = Spines. Difference with Anycast, BiDir creates only a shared tree (*,G) on a per multicast group instead of creating a source tree (S,G) per VTEP per multicast group. Redundancy achieved with “phantom” RP that uses lo with different prefix length.

5 MULTITENANCY (L2-> vlan / L3 -> vrf)

  • Bridge Domain: Broadcast domain that represents the scope of a L2 network (vlan). Way of stretching a vlan -> vlan (12bits), vni (24 bits), switch.

  • VLANS in VXLAN: vlan local significant, vni is global significant (per switch, per port)

— L2VNI: RD -> RID: vlan+32767. RT -> autogenerate / AS:l2vni (RT+eBGP is manual at underlay)
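The auto-derivation rule for an L2VNI, written out as a sketch. It just applies the formulas from the note above (RD = router ID : VLAN + 32767, RT = AS : L2VNI). The function names and sample values are invented, and other platforms may derive these differently.

```python
def auto_rd_l2vni(router_id, vlan_id):
    """Auto-derived RD for an L2VNI: <router-id>:<vlan + 32767>."""
    return f"{router_id}:{vlan_id + 32767}"

def auto_rt_l2vni(asn, l2vni):
    """Auto-derived RT for an L2VNI: <AS>:<L2VNI>."""
    return f"{asn}:{l2vni}"

# Example: VLAN 10 mapped to VNI 30001 on a VTEP with router ID 10.0.0.1 in AS 65000
print(auto_rd_l2vni("10.0.0.1", 10))  # 10.0.0.1:32777
print(auto_rt_l2vni(65000, 30001))    # 65000:30001
```

The RT depends on the AS number, so it only matches automatically when all VTEPs share one AS, which is why it has to be set manually with an eBGP underlay.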

  • L2 Multitenancy:

— VLAN mode: restriction 4K to VNI mapping per switch.

vlan 10
  vn-segment 30001

— Bridge domain mode: BD is used instead of vlan-mode. BD implements a BDI instead of a SVI. No restrictions of 4k VNI mapping -> hw restriction:

  • VRF in VXLAN BGP EVPN: VRF-Lite doesnt scale. L3 at Leaf. EVPN -> scale CP -> RD+RT

  • L3 Multitenancy: L3VNI global scope, vrf name is local significant. Auto: RD= RID:VRF_ID / RT= AS:L3VNI (RT+eBGP is manual at underlay)

— Summary: 1) Associate L3VNI into VTEP interface 2) core-vlan associated with L3VNI 3) SVI created in VRF

router bgp X
 vrf VRF-A
  address-family ipv4 unicast
    advertise l2vpn evpn
---
interface nve1
  member vni 50001 associate-vrf
---
vlan 2501
  vn-segment 50001
---
interface vlan 2501
  vrf member VRF-A
  no shut
  mtu 9216
  ip forwarding
---
vrf context VRF-A
  vni 50001
  rd auto
  address-family ipv4 unicast
  route-target both auto
  route-target both auto evpn

6 UNICAST FORWARDING

  • Intra-Subnet Unicast Forwarding (Bridging) (Classic Ethernet)

— ARP suppression disabled: ARP request -> BUM mode => Multicast or IR -> BGP EVPN for source MAC

— ARP suppression enabled: ARP snooping -> source MAC -> generated EVPN type2. If dst MAC is known by the ingress VTEP then it generates the ARP reply (ARP proxy)

— commands:

show bgp l2vpn evpn vni-id 30001
show l2route evpn mac all        <--|-- verifies FIB is updated
show mac address-table vlan X    <--|

// Announce IP L3 GW manually
interface vlan 10
 vrf X
 ip address a/b tag 12345
---
route-map RM permit 10
 match tag 12345
---
router bgp Z
 vrf X
  address-family ipv4 unicast
     advertise l2vpn evpn
     redistribute direct route-map RM
  • Inter-subnet unicast forwarding (routing)

Symmetric IRB (bridge-routing-routing-bridge): VXLAN-router traffic uses same L3VNI in each direction. VRF -> l3vni -> mapping in all VTEPs.

— Distributed IP Anycast GW: anycast GW MAC (AGM) It is a VTEP. local routing in a VTEP -> no vxlan is used.

— Distributed behind remote VTEP (routing) -> vxlan > inner MAC header (SMAC = VTEP1 router MAC / DMAC = VTEP2 router MAC). RMAC is encoded in BGP EVPN NLRI as an extended community.

— Silent Hosts:

— Dest IP unknown + dst bridge domain is local to ingress VTEP => IP lookup hits LPM (ie /24) -> because L3 distribution IP Anycast GW -> chose local route (lowest AD) -> trigger ARP request for dst IP (because unknown) in different VNI !! -> BUM forwarding -> reach other VTEP.

— No L2 extension present:

show bgp l2vpn evpn vni-id X  <-- 1) verify BGP RIB
show bgp ip unicast vrf Y         2) verify RIB (RT worked fine)
show ip arp vrf Y                 3) verify FIB
  • Forwarding with dual-home endpoints: VPC -> anycast VTEP = VIP. Egress (outer src IP = VIP when traffic leaving ingress VTEP). Ingress (outer dst IP = VIP when return traffic leaves egress VTEP -> ECMP to either of VTEP behind VIP)

— orphan: traffic may cross VPC peer-link because NH=VIP. L2/L3 announcements in VPC -> NH=VIP. If routing needed between VTEP1 and VTEP2 (both belong to same VPC) -> BGP or VRF-lite or advertise type5 with “PIP” from each VTEP instead of VIP (preferred)

  • IPv6: Anycast GW MAC (AGM) is shared between ipv4/ipv6. Underlay only ipv4 -> overlay ipv6 communication => NH VTEP=ipv4.

7 MULTICAST FORWARDING

Handling MC in overlay.

EVPN Type3 -> (unicast is used to handle BUM) VTEP announces interest in a L2VNI

Initially not VXLAN L2 MC without IGMP snooping => L2 MC flooded to all VTEPs in that VNI even if not interested.

  • L2 MC forwarding = Intra-subnet MC. Same VNI = broadcast domain. In MC mode, underlay maps L2VNI to MC group.

— IGMP in VXLAN BGP EVPN:

— Classic IGMP snooping: Traffic is still flooded unconditionally as long as VTEPs are member of that VNI. MC is dropped at VTEPs egress.

— Improved IGMP snooping: “ip igmp snooping disable-nve-static-route-port” -> conditional addition of a VTEP to the Outgoing Interface List (OIL) for a given VNI.

  • L2 MC forwarding in VPC: one of the two peers of VPC -> elected DF (lowest cost to RP). Election process: Both VPC peers send PIM join to RP using Anycast VTEP IP (secondary IP in lo1). RP sends only 1 reply to anycast IP, this is hashed to one VPC peer -> the peer with the (S,G) is the DF (S=VTEP anycast IP, G=MC VNI mapping)
  • L3 MC forwarding = inter-subnet MC. Not much info, something expected in 2017.
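
To make the difference between classic and improved IGMP snooping more concrete, here is a small sketch of which VTEPs get a copy of an L2 multicast frame in each case. It is only a set-based toy model of the behaviour described in the notes; the function names and VTEP names are hypothetical.

```python
def deliver_classic(vni_members, interested, source):
    """Classic snooping: every VTEP in the VNI gets a copy.

    Returns (VTEPs that receive the frame, VTEPs that drop it at egress).
    """
    delivered = set(vni_members) - {source}
    dropped_at_egress = delivered - set(interested)
    return delivered, dropped_at_egress


def deliver_improved(vni_members, interested, source):
    """Improved snooping: only interested VTEPs are added to the OIL."""
    oil = (set(vni_members) & set(interested)) - {source}
    return oil, set()


members = {"vtep1", "vtep2", "vtep3", "vtep4"}
interested = {"vtep3"}

got, dropped = deliver_classic(members, interested, source="vtep1")
print(sorted(got), sorted(dropped))   # ['vtep2', 'vtep3', 'vtep4'] ['vtep2', 'vtep4']

got, dropped = deliver_improved(members, interested, source="vtep1")
print(sorted(got), sorted(dropped))   # ['vtep3'] []
```

In the classic case the underlay still carries the copies that end up being dropped, which is the waste the improved mode avoids.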

8 EXTERNAL CONNECTIVITY

  • Placement:

— Border Leaf: VTEP, few flows N-S. Extra hop. No end-points. SS doesn't need to be a VTEP.

— Border Spine: Spine becomes VTEP. Most flows N-S

— Extended L3 connectivity (L3 handoff):

— Wiring:

—- Full mesh: most resilient, no sync required between border nodes.

—- U-shape: sync link between border nodes.

— VRF-Lite/Inter-AS opt-A: BGP + redist + summarization, 802.1q. VRF-Lite-> SVI (needs BFD), subinterface (recommended) + ebgp

— Extended L2 connectivity: End-point mobility -> RARP (non-IP)

  • Classic Ethernet + VPC: VPC -> anycast VTEP IP (secondary IP in lo1) -> NH = anycast VIP (type2). “advertise pip” for type5 NH = VPC physical IP (primary lo1).

* BPDU not transported in VXLAN -> Use VPC between STP switch and VTEPs.

  • Extranet + Shared Services: Internet, DNS, DHCP, etc.

— VRF route-leaking: tenant VRF <-> shared VRF (dhcp, dns, etc) -> route leaking: CP leaking at ingress VTEP, DP leaking at egress VTEP. VXLAN uses VNI associated with source VRF for remote traffic. Problem: force consistent config in VTEP with leaking. Scalability (asymmetric IRB)

— Downstream VNI assignment: egress VTEP dictates the VNI to be used by ingress VTEP with downstream VNI assignment via CP
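
The two VNI selection behaviours for leaked routes can be sketched like this. It is only an illustration of the rule written in the notes; the function name and the VNI numbers are invented.

```python
def vni_for_leaked_route(source_vrf_vni: int, advertised_vni: int,
                         downstream_assignment: bool) -> int:
    """Pick the VNI the ingress VTEP puts in the VXLAN header.

    - Plain VRF route-leaking: VNI associated with the source VRF.
    - Downstream VNI assignment: VNI dictated by the egress VTEP,
      learnt via the control plane.
    """
    if downstream_assignment:
        return advertised_vni
    return source_vrf_vni


TENANT_VNI = 50001   # hypothetical L3VNI of the tenant VRF
SHARED_VNI = 50999   # hypothetical L3VNI advertised by the shared-services VTEP

print(vni_for_leaked_route(TENANT_VNI, SHARED_VNI, downstream_assignment=False))  # 50001
print(vni_for_leaked_route(TENANT_VNI, SHARED_VNI, downstream_assignment=True))   # 50999
```

In the first case the egress VTEP must also know the tenant VNI, which is why the config has to be consistent on every VTEP involved in the leaking.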

9 MULTIPOD, MULTIFABRIC, DCI

  • OTV vs VXLAN: VXLAN frame similar to OTV. OTV is transport agnostic IP-based solution.

— OTV includes CP and DP. VXLAN only DP (it needs BGP EVPN for CP)

— OTV provides multihoming (redundancy) using a DF per VLAN, doesn't need VPC. VXLAN needs VPC to provide multihoming.

— OTV has loop prevention. VXLAN needs BPDU guards + storm control.

— ARP suppression enabled in both. Unknown unicast is dropped in OTV. VXLAN+EVPN doesn't stop unknown unicast.

  • Multipod: LS-SS + super spine layer. Prefix scale MAC/IP? Spine or Super-Spine needs to be BGP RR. MC -> scale of the Output Interface list (OIF). Max 65k LS. Single DP extends pod to pod = single fabric.

  • Multifabric: Difference from multi-pod, complete segregation CP and DP -> interconnect at border -> stitching VNIs, -> DCI design.

  • Interpod / Interfabric: Broadcast storm in overlay reaches all pods if L2 extended to all pods.

— opt-1: Multipod, single DP end to end. problem: failure domain, no separation pods (vxlan encap end to end)

— opt-2: Multifabric: DCI at border of fabric using classic Ethernet (VRF-lite + 802.1q). Better scale, MAC/IP not spread across all VTEPs (VXLAN encap only inside fabric). VXLAN ends at border device. Problem: DCI is bottleneck.

— opt-3: Multisite: option2 + re-originate L3 routing info (MPLS L3EVPN) VXLAN ends at border fabric -> DCI encap in MPLS -> other end removes MPLS and then back to VXLAN.

— opt-4: Multisite L2: option 3 for L2. OTV or EVPN. VNI-VNI stitching.

* Multisite EVPN VXLAN using BGW -> IETF draft-sharma-multi-site-evpn 2016

10 L4-7 SERVICES INTEGRATION

  • Firewalls in VXLAN BGP EVPN:

— routing mode: use L3

— bridging mode: “bump in the wire”, VLAN stitching

— FW redundancy with static routing: ok if HA FW connected to same LS pair (VPC). If FW in different LS -> suboptimal routing -> 2 solutions: 1) static route tracking, 2) static route at remote LS -> static route in ALL LS that need to reach the FW -> LS will learn type2 of FW via active LS.

  • Inter-Tenant / Tenant-Edge FW: security enforcement at edge/exit of a tenant/VRF. VRF stitching located at Border LS.

— Intra-tenant FW: E-W firewall = FW inside VRF.

— deployment:

—- FW route mode + default GW for all VLANs => VXLAN only at L2 => no VRFs, no anycast gw.

—- FW bridge mode: all network belong to same subnet. VXLAN + distributed IP anycast gw. FW connected to distributed IP anycast GW LS.

—- PBR: Policy-Based Routing

— Mixing intra-tenant and inter-tenant:

— Intra-tenant:

—- L2 (E-W): FW is GW. LS only extends L2 -> vxlan only l2, no distributed IP anycast gw. BL trunk to FW to extend L2.

—- L3: LS uses distributed IP anycast gw.

— Inter-tenant: default route pointing to FW -> redistribute via BGP EVPN

  • Load Balancer: “stateful”

— one-arm source-NAT: LB connected with 1 link / PO to LS.

— Direct VIP subnet approach: LB VIP + LB physical IP in same range. VIP advertised via type2

— Indirect VIP subnet approach: needs static route (like FW example) -> type5.

— source-NAT -> client IP is hidden, servers return traffic to LB

— service chains: LB+FW: FW belongs to BL, LB belongs to Service Leaf. If 2-Arm LB -> VRF-transit between FW-LB. If 1-arm LB -> no transit-vrf, source NAT.

11 FABRIC MANAGEMENT

  • POAP: out-of-band (mgmt port) needs DHCP relay. In-band (front panel ports)
  • NRFU
  • OAM:
show mac address-table
show l2route evpn mac all
show vlan id X vn-segment
show bgp l2vpn evpn vni-id Z
show bgp l2vpn evpn MAC
show ip arp vrf Y
show forwarding vrf Y adj
show forwarding ip local-host-db vrf Y
show l2route evpn mac-ip all
show bgp l2vpn evpn IP
show ip route vrf Y IP
show nve internal bgp remote database
show nve peers detail

ping nve ip unknown vrf X payload ip DST SRC port SRC DST proto 6 payload-end vni 50000 verbose
traceroute ...
pathtrace ...

IPv6 BIG TCP / Replace TCP in DC: Homa

This week a colleague passed me this link about a Kubernetes cluster running on Cilium. The interesting point is that the high throughput is achieved with BIG TCP and IPv6!

The summary (copied) is:

TCP segments in the OS are up to 65K, NIC hardware does the segmentation – we do this now, but the 65K is a limitation of IPv4 addressing.  BIG TCP uses IPv6 and allows much larger TCP segments within the OS, currently 512K but theoretically higher.  End result – better perf (>20% higher in this video) and latency (2.2x faster through the OS).
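
A quick back-of-the-envelope calculation shows why bigger segments help: the stack handles fewer segments for the same amount of data. This is my own sketch, not from the video; I round the classic limit to 64 KiB and use the 512K figure from the summary, and the 1 GiB transfer size is arbitrary.

```python
def segments_needed(transfer_bytes: int, max_segment_bytes: int) -> int:
    """Number of segments the OS stack has to handle (ceiling division)."""
    return -(-transfer_bytes // max_segment_bytes)


TRANSFER = 1024 ** 3      # 1 GiB, arbitrary example size
CLASSIC = 64 * 1024       # ~65K limit, rounded to 64 KiB (assumption)
BIG_TCP = 512 * 1024      # current BIG TCP size quoted in the summary

classic_segments = segments_needed(TRANSFER, CLASSIC)
big_segments = segments_needed(TRANSFER, BIG_TCP)

print(classic_segments)                  # 16384
print(big_segments)                      # 2048
print(classic_segments // big_segments)  # 8
```

So for the same data the stack is traversed 8 times less often, which is where the per-packet CPU saving comes from. The actual gain quoted in the video (>20%) is measured, not derived from this ratio.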

Then I saw this other video from John Ousterhout. It covers a similar topic to the Kubernetes video above, as K8S is used mainly in datacenters.

High performance:
– data throughput: full link speed for large messages
– low tail latency: <10us for short messages? (DC)
– message throughput: 100M short messages per second? (DC)

TCP issues in DC:
1- stream oriented (no load balancing) -> message based
2- connection oriented (can break infiniband!, expensive) -> connectionless
3- fair scheduling (bw sharing) -> run to completion (SRPT)
4- sender-driven congestion control (based on buffer occupancy) -> receiver-driven congestion control
5- in-order delivery -> no ordering requirements
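
Point 3 is easier to see with numbers. The sketch below compares completion times under fair sharing and under run to completion for messages that all arrive at the same time on a link of 1 unit per time step (with simultaneous arrivals SRPT is just shortest-first). It is a simplified model of my own, not Homa code, and the message sizes are arbitrary.

```python
def completion_fair_share(sizes):
    """Completion times when all active messages share the link equally.

    Times are returned in order of increasing message size.
    """
    t = 0
    served = 0  # service each still-active message has received so far
    times = []
    ordered = sorted(sizes)
    for i, size in enumerate(ordered):
        active = len(ordered) - i
        # Every active message advances by (size - served), one share each.
        t += (size - served) * active
        served = size
        times.append(t)
    return times


def completion_srpt(sizes):
    """Completion times when the shortest remaining message runs first."""
    t = 0
    times = []
    for size in sorted(sizes):
        t += size
        times.append(t)
    return times


sizes = [1, 2, 10]
print(completion_fair_share(sizes))  # [3, 5, 13]
print(completion_srpt(sizes))        # [1, 3, 13]
```

The big message finishes at the same time in both cases, but the short ones finish much earlier with SRPT, which is what matters for tail latency of short messages.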

It is also important to move the transport to the NIC (as there is already a lot of NIC offloading).

His proposal for Homa looks very nice, but I like how he explains how difficult it is going to be for it to succeed. Still worth trying.