Chip wars

I was reading this news recently about Huawei's capability to build 14nm chips.

Today, the EDA (Electronic Design Automation) market is largely controlled by three companies: California-based Synopsys and Cadence, as well as Germany's Siemens. 

I wonder, are the export controls good long term? Maybe the solution is worse than the illness…

And based on that, I learned that ASML is one of the most valuable companies in Europe!

Also, there is a book about the “chip wars” that I want to read at some point.

Python: cycle and setattr

This week a colleague refactored a script that I wrote, and the end result was pretty different from what I did. The logic was the same but the code was very different. I learnt two things about Python:

setattr: This built-in function gives an alternative way to assign a value to an attribute of an object, in addition to constructors and methods. It comes in handy when we want to add a new attribute to an object and assign it a value, especially when the attribute name is only known at runtime.
cycle: This function (from itertools) takes an iterable as an argument and returns an infinite iterator over its values, going back to the beginning once the end of the iterable is reached.
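To make both ideas concrete, here is a minimal sketch. The `Device` class, the helper names and the sample values are made up for illustration; they are not from my colleague's script.

```python
from itertools import cycle


class Device:
    """Hypothetical empty container object, only for illustration."""


def build_device(settings):
    """Create a Device and attach each key/value pair as an attribute."""
    device = Device()
    for name, value in settings.items():
        # Equivalent to device.<name> = value, but the name is a string
        # that is only known at runtime.
        setattr(device, name, value)
    return device


def assign_round_robin(items, workers):
    """Pair each item with a worker, wrapping around the worker list."""
    # cycle() never ends on its own; zip() stops when items is exhausted.
    return list(zip(items, cycle(workers)))


device = build_device({"hostname": "leaf1", "asn": 65001})
print(device.hostname, device.asn)
print(assign_round_robin(["job1", "job2", "job3"], ["w1", "w2"]))
```

Since cycle never stops by itself, it needs something finite next to it (zip, islice, or an explicit break), otherwise the loop runs forever.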

CXL

In one meeting somebody mentioned CXL, and I had to look it up.

Interesting:

Eventually CXL is expected to be an all-encompassing cache-coherent interface for connecting any number of CPUs, memory, process accelerators (notably FPGAs and GPUs), and other peripherals.

BFD Multihop

BFD is a protocol that I assumed I knew “well” as it is quite straightforward… But after having to check how BFD multihop is configured and how it works, I noticed I actually had no idea. As usual, I need to read the RFC at some point.

From this link, I learned about the concept of Hellos and Echos… and that echo uses the same IP as source and destination… I really like the Wireshark captures.

Copy/Paste from the link

Packet Types

Control Packets

Control packets are used to establish BFD peerings. Essential information is included within these packets, including flags for things such as authentication, in addition to the timer negotiations.
These packets are sent via UDP to the far-side IP, using the bfd-control port, 3784.
Because these packets must actually be processed by the peer, they are sent less frequently than the actual BFD echos used for sub-second failure detection.


Echo Packets

BFD echo packets are essentially for local use. The sender uses its own IP as both source and destination, and the packets are destined for the UDP bfd-echo port, 3785. When an echo packet is received, because the destination IP is not that of the router receiving it, the router simply forwards it out of the appropriate interface, removing the need to punt it up to the processor.
Because the source and destination IP are those of the local router, BFD can be run asynchronously. That is, you can set up a single side to use BFD echo detection, while the other side merely maintains a BFD neighbor relationship through control packets.

And now about BFD multihop. It is a short read, and the main point is that the UDP port is 4784, compared with 3784 in single-hop.

Then, for the specific details of configuring BFD multihop in NX-OS, it is better to check the official documentation. That, for example, confirms “Echo mode is not supported for multihop BFD.”

Another thing to take into account is CoPP. You need to check whether your device OS matches BFD in its CoPP policies, as multihop goes to the CPU. Also check whether any other hardware configuration is required.
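To fix the port numbers and the control packet layout in my head, here is a rough sketch that builds and parses the mandatory 24-byte BFD control header as I understand it from RFC 5880 (no authentication section). The function names and the default timer values are my own choices for illustration, not something taken from the linked post or from any vendor implementation.

```python
import struct

BFD_SINGLE_HOP_PORT = 3784  # control packets, single-hop
BFD_ECHO_PORT = 3785        # echo packets, single-hop only
BFD_MULTIHOP_PORT = 4784    # control packets, multihop

# Vers/Diag, State/Flags, Detect Mult, Length, then five 32-bit fields.
_HEADER = struct.Struct("!BBBBIIIII")
STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def control_port(multihop):
    """Return the UDP destination port for BFD control packets."""
    return BFD_MULTIHOP_PORT if multihop else BFD_SINGLE_HOP_PORT


def build_control_packet(state, my_disc, your_disc,
                         detect_mult=3, tx_us=300_000, rx_us=300_000):
    """Pack a minimal control packet; timers are in microseconds.

    The defaults (3 x 300 ms) are illustrative assumptions.
    """
    byte0 = (1 << 5) | 0  # version 1, diagnostic 0 (no diagnostic)
    byte1 = state << 6    # 2-bit state, all flag bits (P F C A D M) cleared
    return _HEADER.pack(byte0, byte1, detect_mult, _HEADER.size,
                        my_disc, your_disc, tx_us, rx_us, 0)


def parse_control_packet(data):
    """Unpack the mandatory section of a control packet into a dict."""
    (byte0, byte1, mult, length,
     my_disc, your_disc, tx_us, rx_us, echo_rx_us) = _HEADER.unpack(
        data[:_HEADER.size])
    return {
        "version": byte0 >> 5,
        "diag": byte0 & 0x1F,
        "state": STATES[byte1 >> 6],
        "detect_mult": mult,
        "length": length,
        "my_disc": my_disc,
        "your_disc": your_disc,
        "desired_min_tx_us": tx_us,
        "required_min_rx_us": rx_us,
        "required_min_echo_rx_us": echo_rx_us,
    }


packet = build_control_packet(state=3, my_disc=1, your_disc=2)
print(control_port(multihop=True), parse_control_packet(packet))
```

The echo RX interval is left at 0 in the sketch, which signals that the sender does not support receiving echo packets and fits the multihop case where echo mode is not used.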

Another thing that bites me is that when testing this in a software lab, BFD is always down but at least the routing protocols come up.

NVIDIA GTC March 2023

This week I watched some interesting videos from NVIDIA GTC related to networking. And it is a pain that you need to use a “work” email to register…

  • S51839 – Designing the Next Generation of AI Systems:

— A quick summary: it seems any HPC network needs to use InfiniBand… NVIDIA has solutions for all sizes. They can provide a POD solution!!! All cloud providers provide their services.

  • S51112 – How to Design an AI Supercomputer for Fast Distributed Training, and its Use Cases:

— Very interesting talk from NEC Japan. They built a network based on Ethernet switches for HPC with GPUs (and not IB as seen in the other video). They are also heavy users of RDMA/RoCEv2. And it seems they have dedicated ports in the network for storage, management, etc. They are very happy with Cumulus Linux as NOS.

  • S51339 – Hit the Ground Running with Data Center Digital Twin Automation:

— Interesting tool, NVIDIA Air, for creating labs. I expected the demonstration to show off and build a huge network. “Digital Twin” looks like the new buzzword in the network automation world.

  • S51751 – Powering Telco Cloud Services with Open Accelerated Ethernet:

— This is from Comcast. It is very interesting how “big” SONiC seems to be becoming. And NVIDIA is the second contributor to SONiC after M$! I need to try SONiC at some point.

  • S51204 – Transforming Clouds to Cloud-Native Supercomputing Best Practices with Microsoft Azure:

— Obviously, building NVIDIA-based supercomputers in M$ Azure. Again, all InfiniBand.

And another thing, the Spectrum-4 switch looks insane.

AWS Networking Videos – March 2023

I watched some very interesting videos about AWS networking. They are high level, so they don't tell you the magic sauce you would like to know, but it is nice that this info is out in the public.

  • DKNOG – How AWS is evolving its peering-edge in 2023 and onwards link + event:

— Evolution from buying chassis to building your own devices: consume -> create (NOC-less, auto-remediation, active telemetry, etc) -> innovate (freedom to examine trade-offs, 1U devices). Clear use of “Clos” networks and their own Linux-based software.

— Delighted: low complexity + high innovation

— Simplicity Scales

— The view of a router/brick as a set of 1U devices is interesting (rack 102.8T – 200x400G ports for customers, non-blocking). And it is very good that they have pictures of the concept of “bricks” and “spines”.

— Challenges with cabling (SN connector — no patching rack needed) and 400G ZR+ (heating!)

— BGP peering is actually with a container.

— James Hamilton paper – link + pdf

  • AWS re:Invent 2022 – Dive deep on AWS networking infrastructure (NET402)– link

— Summary: this is “similar” to the DKNOG talk but longer and with some other details like:

— “We don't like chassis”. 1+ million devices

— SRD at NIC level, so one TCP flow is actually load-balanced across several paths

— Hybrid SDN approach: you have controllers to give you a big-picture view (I guess they provide the visibility to say “just send 70% of traffic to this device”, but not sure how) plus the device's own capability to deal with changes.

— Telemetry, continuous monitoring, triangulation: being able to detect which port/device is causing the problem.

  • AWS re:Invent 2022 – Leaping ahead: The power of cloud network innovation (NET211-L) – link:

— AWS Global Infrastructure: Backbone capacity

— Customer SW/HW

— Everything fails all the time

— GPS locations in fibers! + inject light in fiber to double check fault -> intelligent optical routing/failover -> better than BGP….

— Termite sheet fibers for Australia 🙂

— Nitro card = NIC (offload card)

— SRD: no need for in-order packet delivery as required by TCP. 25Gbps flows allowed now.

Dune 6: Chapterhouse

This is the last book of the original Dune series. Compared with the first books, it looks like a different world, although there are references to the beginning. The end is quite open, so you can imagine the story continuing in more books, as over time there is always a new plot/drama. There are many things I don't understand, though. The couple of “Face Dancers” in contact with Duncan? I like the references to Van Gogh paintings like “Thatched Cottages at Cordeville”.

I would never have thought that Duncan would be present in all the books and be such a critical character.

But in the end, it is all about love and the repression of it, as the Bene Gesserit do.

And the author's last words about his wife's death are quite moving.

Bolo Podre

This is a typical Southern Portugal cake that I tried two Christmases ago and quite liked, as it looks simple and tasty. So finally this weekend was the time to try this recipe.

Ingredients:

  • 2 medium eggs
  • 100g sugar
  • 100g honey
  • 75ml olive oil
  • 150g plain flour
  • 1/2 tbs cinnamon
  • 1/2 tbs baking powder
  • 1/2 ground anise grains.
  • A bit of butter

Process:

  • Preheat the oven to 180C
  • Whisk eggs and sugar until doubled in volume. It takes time doing it by hand!
  • Add honey and oil, bit by bit, and keep whisking.
  • Add flour, cinnamon, baking powder and anise. Mix all well
  • Pour the mix into a greased tin.
  • Bake for approx. 45 minutes. Use a toothpick or similar to check the inside is cooked before removing from the oven.

Result:

It rose more than I expected. It wasn’t as moist as I remembered but it was tasty. Based on the ingredients, it reminded me of a Greek olive oil cake I made some months ago.

I checked other recipes like this so maybe I will try again at some point.

Wafer-Scale

Somehow I came across this company that claims some crazy numbers in just one rack. Then again, nearly by coincidence, I saw this news in an email that mentioned “Cerebras” and wafer-scale, a term that I had never heard before. So I found some info in Wikipedia and it all looks amazing. Also, I learned about Gene Amdahl, as he was the first one to try wafer-scale integration, and about his law. I didn't know he was such a figure in the history of computer architecture.
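As a reminder to myself of what Amdahl's law says: if a fraction p of a job can be parallelized over n workers, the overall speedup is 1 / ((1 - p) + p / n). A small sketch, where the function name and the sample numbers are my own and not from the Wikipedia article:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Theoretical speedup when only part of a job can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative numbers: 95% of the work parallelizes.
for n in (8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

The point is that the serial part dominates quickly: with 95% parallel work the speedup can never go beyond 20x, no matter how many workers are added.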