Friction, Morning Routine, RoCe Meta Paper, AWS RNG, MRC, Slurm, Rail-Optimize, 800VDC, Phyllo, Approach

Manson AI: You need friction

AI Agent: narrow focus: goal, proof, steps

Bear Grylls’ Morning Routine: Cold (never get used to), bared foot, strength training, 30 minutes

RoCE networks for distributed AI training at scale: I have managed to read the paper ! Although in the AI word, two years is an eternity, I think it is still interesting.

1) Network Topology: backend network only for GPUs (RDMA nics), non-blocking. Frontend network: data ingestion, checkpointing, logging.

Pod = AI zone
 leaf = RTSW, DAC cables, shallow buffer
 spine = CTSW, deep buffers. fiber between leaf-spine.
SuperSpine = ATSW, oversubscribed, connect AI zones

intra-node -> nvlink
ROCE: cpu offloading, ethernet (standard)

collective communication library serves as the sw abstraction between training workloads and the NIC
                                 schedules verbs calls over QP (Queue Pairs)
 parallelism strategy determines collective: allreduce, allgather, alltoall
 choice logical topology:

------------------

2) Routing: work load. low entropy flows (few flows) -> ECMP bad (5-tuple udp: src/dst ip, src/dst port, protocol), burstiness, elephant flows
--

 RTSW uplinks 1:2 under-subscribed! -> expensive (short-term)
 1) QP scaling: use destination QP of Roce packet using the UDF capability in switch to increase entropy -> Enhanced ECMP -> short-term
 2) Central TE controller -> long-term: CP real-time topology end-to-end cluster, 
                                        flow matrix (flow bps) + CSPF (constrained SPF)
                                        write in switches dataplane
                                     DP: TE overrides default BGP routing policy in leaf. Use Exact Match table.
                           Not good with multiple link failures. Doesnt scale 
 3) Flowlet switching: try to improve 1 and 2. hw assistant schema. put packets in different ports in ECMP
     out-of-order: move packets only after 1/2 RTT
     load-aware path assignment: better than TE

------------------
                     
3) Transport: congestion management. Start with DCQCN. packet drops on ACK/NACK can cause prolonged Local ACK timeout (LAT)
--
 Tuning DCQCN not great (strict ECN -> minimize PFC (can lead to head-of-line blocking)

 200G, we stayed with relaxed ECN marking, allowing for buffer build up in the CTSW, while keeping default DCQCN settings.
 400G We proceeded without DCQCN. just PFC for flow control
 re-design collective library: two-stage copy

------------------

4) Operations:
 Change QoS priority of Clear to Send (CTS) messages. In RTSW ASIC, modify dsCP marking for ACK  messages
 Tuning VOQ in CTSW
 obeservability: OOS: out of seq.
                 Link flaps
                 Local ACK timeouts (LAT)
                 PFC watchdog: catch any long-duration PFC pause (>200ms)
                 buffer utilization RTSW
                 reachibility (pings)
                 constant latency monitoring loaded and unloaded (catch regressions)
                 base lines!!!

Perplexity: Hosting Qwen on Blackwel:

AWS RNG – Random Graph Network: The paper is totally out of my space, but the concept looks brutal. With an operations hat, how you troubleshot it? (ping, traceroute, link congestion, data flows patterns, etc)

MRC1 and MRC2 (OCI): Why we need planes (breakouts) and not just a big plane.

As SerDes speeds continue increasing, every microsecond of congestion creates much larger pressure inside the fabric. A 100G transport domain may be manageable. A 400G domain amplifies the same congestion into roughly 4x pressure. An 800G domain, and eventually a 1.6T domain, becomes much harder to coordinate.

This pressure appears as larger switch buffer requirements, larger congestion domains, harder retransmission coordination, larger cache pressure, larger synchronization storms, and harder thermal and power scaling inside ASICs.

At hyperscale, switch ASIC cache and transport coordination become fundamental scaling bottlenecks. Increasing switch buffer size is extremely difficult: high-speed SRAM is expensive, larger cache arrays consume significant power, thermal density rises quickly, die area scaling becomes inefficient, and routing complexity increases dramatically.

Splitting transport into many smaller lanes naturally reduces these pressures. Reliability improvements then emerge as a byproduct, because congestion, retransmission, and buffering become more distributed.

THE QUESTION: which breakout keeps the fabric at the shallowest practical Clos depth while keeping plane count and operations manageable? -> less hops, less switches, less latency

Slurm: I like the “Slurm vs. Kubernetes”

Slurm Workload Manager (short for Simple Linux Utility for Resource Management) has become a cornerstone of large-scale computing. Originally created in the early 2000s to support large-scale high-performance computing (HPC) environments, Slurm is now widely recognized as the de facto scheduler for HPC clusters. Today, it orchestrates jobs across thousands of servers and GPUs in some of the world鈥檚 most advanced computing environments. 

Interview Question: 512 GPU, non-blocking (full bisection) and 2xUFM! I really liked this. I think for once I understand the rail-optimize (fat-tree = leaf-spine). Just break one leaf-spine link, beautiful!!!

800VDC: Next step in electrical infra in DC space.

Phyllo by hand

Approach woman: curiosity and no performance. Practice. Be at peace with uncomfortable and akwardness. Rejection as learning

CPX, CXL, Tier1, Ironwood, BGP Origin, Temporal, Alibaba, Helios, F1 Croissant, Videos

CPX and Meta’s CPO report

CXL: fighting the memory wall

Tier1 Analysis

Ironwood TPUv7: just worth taking a look at the pictures of the racks

Rewrite BGP Origin: never thought about this action, for wining in the BGP selection process….. and having a lower router-id gives you too an edge!!! (fee people has access to 2/8! … and you dont have to own the router-id range… you can’t check it!!!) RIPE91

Resilient Network Automation: I saw temporal.io in Autcon3 but never paid attention. repo

Alibaba 1.6T OSFP: insane

META AMD Helios rack

F1 Croissant: I am pretty sure I tried that croissant the time I went to Melbourne. It was the strangest bakery ever…. you buy the things like you are dealing with the drug dealer… the croissant was very good, but I remember it was the most expensive croissant ever too! From the video, it is incredible that they don’t use any extra flour for dusting!

Videos

  • Morgan Housel – The Art of Spending Money: A lot of psychology about money.
  • Oz Pearlman: I was amazed how uncomfortable was the presenter when Oz read him.
  • Vinh Glang: The art of communication – I need to watch this several times (for interviews)

DeepSeek, AWS HPC SDR vs Multipath-TCP, OCSP death, AlphaChip, Visual AI agents, Ollama, Local AI, Bob Bowman

Nice analysis about DeepSeek without hype.

AWS HPC: Didn’t know AWS offered HPC services (articule from 2021). I liked to find more details about SDR: Multipath LB, Out of Order delivery, congestion control similar to BBR. I wonder, this is not the same as UltraEthernet consortium is trying to achieve?

Multipath-tcp: The above probably works in “close” networks (managed by one entity) but maybe it is not going to work in the Wild internet. Still this looks still quite far from production. I believe this like QUID. Somebody like google deploys it and the rest jump in the wagon (more or less)

OCSP death: “OCSP is not making anyone more secure. Browsers are either not checking it or are implementing it in a way that provides no security benefits.聽“

AlphaChip: As far as I have read, designing chip is one of the most complex things and getting help from AI can even increase the advances in chip design. I read that NVIDIA had something similar. And this should be applied to ASICs too so networking is benefited

Vision Agent AskUI: need to try

ByteDance UI Agent – UI-TARS: as above

Crawl4AI: Interesting for digestion your local knowledge base sites and using with your local LLM….

Run your locally AI: I tried this in my work MacBook and it worked! I want to create an AI agent for a work project (actually i am dreaming to be able to achieve it….)

Open Web UI + Ollama: I tested this too in my MacBook and works like magic! You can even use DeepSeek 馃檪

Bolt.diy + DeepSeek: I didnt manage to install bolt.diy ….

Training your AI: My idea is to get an open-source LLM trained with my data so I can use it to do my “job” But in the video there was too much publicity and I dont have access to a GPU… but I dont much data neither (or that’s what I think)

Bob Bowman (Michael Phelps coach): Show up, do the job.

AusNOG 2024

A bit late to review, but some interesting talks. Agenda.

Arista: Practical AI Networking Innovations:

rail optimized
all reduce

low entropy, 2-3 flows per nic, elephant, bursty

JCT job completion time
TSN time spent networking

overprovisioning 1:1.2

Nokia: NUTS python network testing:

It looks quite nice, it is based on pytest and nornir/napalm. It looks similar to batfish?

Measuring Starlink:

High jitter, each 15sec change satellite -> micro-loss. BBR (non-loss sensitive) is the best flow protocol protocol with Starlink. ECN.

Quantum AI Chip, InfraHub, python UV, SR controller, Kobe Bryant, Hell

Google Quantum AI: This looks remarkable.

python vu: replacement for pip, pyenv, etc. Need to try

InfraHub: As a network engineer interested in Automation. This looks interesting and I would like to go deeper to fully understand as it is the merge of the typical source of truth (DB) that you can’t get in git.

Segment Routing Controller: This is another thing I played with some years ago, but never found a controller to make TE. I dont see clearly this software is OSS but at least doesnt look like is a vendor-lock…

Kobe Bryant: venting, and it is ok.

Jordan B Peterson: Hell

LLM n C, 1.6nm, xz vul, turing 2024, let’s encrypt, chatdev, Ethernet vs IB, Slingshot, Tailscale ssh, videos, 42 rules, CNI, Cilium

Origins of deep learning: interesting post. At the beginning all was Matlab and CPU bounded. repo

LLM in C: post and repo.

A16: 1.6nm process for 2026. More frequency, less power.

xz vulnerability repo: Something I need to check in the VP

Turing Award 2024: zero-knowledge-proof.

Cloudflare and Let’s Encrypt’s certificate change: I haven’t heard of this until recently. I use Let’s Encrypt so as far as I can read, makes sense what they are doing. But didnt know 2% Cloudflare customer were using the “cert”

ChatDev: Communicate agents for software development. I am a not a developer but I would use this just as a starting point If I have any idea for a project. I would remove the C-suite agents, at least for low level projects.

IB vs Ethernet: A bit of bias here (the author is from Broadcom -> Ethernet). I have no hands-on experience with IB, but I have read the cables are not cheap… Let’s see when UltraEthernet gets into the market. Another view.

Slingshot and Juniper: A bit of bias again as HP bought Juniper. So how will these interconnects fade inside the company? As far as I know, most supercomputers use some “special” interconnect so not much ethernet there. But the money nowadays is in AI infra… Paper for slingshot (haven’t read it)

Tailscale SSH, wireguard throughput: These are things I should a spend a bit of time one day and consider if I should use them (I dont like it is not opensource though). This netmaker?

Videos:

Jocko Willink: Discipline = Freedom. Remember but not dwell. Good leader, delegate. Be a man -> take action, bonding (pick your activity)

Jimmy Carr: Imposter syndrome each 18 months, so you have to stand-up. People crave the success not the journey. Teaching comedy good for communicating.

Sam Altman – Stanford 2024: First time I see him talking. It has some funny moments. More powerful computers. I missed a question about opensource LLM and closed ones.

Find a girlfriend: I know just a little bit about the person (I want to read one of his books) from other books and videos. I would think he would have already a girlfriend or family. From the three methods, definitely, the face to face approach in the street looks so much better (and that’s what I would like to do)

Jordan Peterson original 42 rules

CNI performance: I have used kubernetes since I studied for CKAD but still I am interested in the networks side. I didn’t know about Kube-router and it did great! I am bit surprised with Calico as I have read more and more about Cilium.

Cilium for network engineers. I have to read this fully (worried that Cisco bought it…)

LaVague, S3, Stratego

LaVague: There are web services that dont have API so this could help me to automate the interaction with them? I need to test. Another question, i am not sure if lavague has an API itself!

S3: I had this in my to-read list for a long time… and I after reading today I was a bit surprised because it wasn’t really technical as I expected. The takeouts are: Durability reviews, lightweight formal verification and ownership.

Stratego: I have never played this game but I was surprised that is more “complex” than chess and go. And how DeepNash can bluff and do unexpected things.

Life, Love, Sex, Negative Beliefs, startup regrets, nanog90, Groq LPU, LLM from scratch, ssh3, eBFP BGP, RPKI, TIANHE-3

I hit rock bottom this week. I hope I finally closed one door in my life so I give myself the chance to open others. Made the wrong decision? It is easy when you look back. Do I regret it? The most annoying thing is these are failures so you can’t go back and recover. But I was so bloody newbie!!!…. At least after 5 years…

“For every reason it’s not possible, there are hundreds of people who have faced the same circumstances and succeeded.” Jack Canfield

Head down, crying, cursing, whatever, but forwards. As it has always been.

—-

Somehow managed to list to long videos, something I normally can’t manage (because lack of time, etc)

Negative Beliefs, avoid bitterness, aim for greatness (remarkable things), scape the darkness: Jordan B Peterson with Modern Wisdom: video, podcast.

Find and keep Love: video. 1st Get your shit together. Communication is critical. Be careful with your shopping list….

Good Sex: video. Communicate….

Orgasm: video. Haven’t seen it completely yet but very interesting. Use your tongue wisely.

— Other things:

Startup decisions and regrets: page. Interesting. I think most of things are very specific but still good to read.

Nanog90: agenda I didnt want the videos but I reviewed several pdfs and these ones look interesting:

Abstract Ponderings: A ten-year retrospective. Rob Shakir – Google: video

https://rob.sh/post/reimagining-network-devices/
https://rob.sh/post/coaching/
https://cdn.rob.sh/files/the-next-spring-forward_2018.pdf
https://research.google/research-areas/networking/

AI Data Center networks – Juniper – video

Using gNOI capabilities to simplify software upgrade use case: video – I had to idea about gNOI so looks interesting. It is crazy that still in XXI, automating a network device is so painful. Thanks to all vendors to make your life miserable.

Go lang for network engineers: video slides– I always thought that Golang had a massive potential for network automation but there was always lack of support and python is the king. So nice to see that Arista has things to offer.

PTP in Meta: video and blog.

There are more things, but havent had the chance to review them.

—-

It looks there is new chatbot that is not using the standard NVIDIA GPU. Groq uses LPU (Language Processing Unit). And they say it is better than a GPU. They have this paper but I can’t really see feature of that LPU.

Slurp’it: Show this blog, and the product looks interesting but although is free, it is not opensource and at the end of they you dont want a new vendor-lockin

Container lab in kubernetes: Clabernetes. I would like to play with this one day.

NetDev0x17: videos and sessions. link This is quite low details and most of the time beyond my knowledge. Again, something to take a look at some point.

LLM from scratch: repo. Looks very interesting. But the book it is going to take a long time to hit the market.

ssh3: repo. Interesting experiment.

eBFP and BGP: blog. Really interesting. Another thing that always wanted to play with.

Orange RPKI: old news but still interesting to see how much damaged can cause RPKI in the wrong hands…

China TIANHE-3 Supercomputer: Very interesting. Link.

AusNOG 2023

Nice NOG meeting:

Vendor Support API: Interesting how Telstra uses Juniper TAC API to handle power supplies replacement. I was surprised that they are able to get the RMA and just try to replace it. If they dont need it, they send it back… That saves time to Telstra for sure. The problem I can see here is when you need to open ticket for inbound/outbound deliveries in the datacenters, that dont have any API at all. If datacenters and big courier companies had API as 1st class citizends, incredible things could happens. Still just being able to have zero-touch replacement for power supplies is a start.

No Packet Behind – AWS: I think until pass the first 30 minutes, there is nothing new that hasnt been published in other NOG meeting between 2022 and 2023. At least the mention the name of the latest fabric, Final Cat. As well, they mention issues with IPv6 deployment.

There are other interesting talks but without video so the pdf only doesnt really give me much (like the AWS live premium talk)

Curl, Yaml, scalars, Elixir, git stash

I haven’t watched this video, but looks like the holly book of curl!!!

I'd recommend starting at ~34 minutes.

路You can specify multiple URLS with multiple output options in a single command. Doing this or using globbing (see below) to the same host will use persistent connections and greatly improve performance because the same L5 session is used

路trurl is also made by the project and allows you to programmatically manipulate URLs (change server, path, query parameters, etc.). Pretty neat: https://github.com/curl/trurl

路curl supports URL globbing: curl https://{ftp,www,test}.example.com/img[1-22].jpg -o "foo_#2_#1.jpg"

路By default, curl will resolve requests serially when multiple URLS or globbing is specified, but curl is capable of doing parallel transfers with the -Z or --parallel option. And can do anywhere from 2-300 transfers in parallel. This also has the potential to parallel-ize HTTP/3 transfers even from single URLs.

路You can do curl --help category to get a list of help categories for narrowing down options by categories like http or output

路 Long commands for curl can be specified in a file and given to curl either via stdin or -K / --config - These files are essentially just command lines in a file

路You can use the --trace option to provide tcpdump type output from curl. Saving the need to to start tcpdump in the background if you just want to see what's happening from curl

路You can use --connect-to to specify a different DNS name to go to (instead of the one specified in the URL) which is similar to the --resolve option, but doesn't require the user to lookup the IP address ahead of time

路You can override the DNS server that you use to resolve URLs via --dns-ipv4-addr 8.8.8.8 for example

路You can add --libcurl to any curl command and it will spit out C source-code that implements the same command line in C via the library libcurl

路You can set the environment variable SSLKEYLOGFILE to a file name and it will save the runtime TLS secrets to that file, and use that file in WireShark along with a dump of the traffic from tcpdump to see the contents of encrypted HTTP streams

路You can choose to only download files that have changed since the last time they were downloaded with curl via --etag-save <etag_file> and --etag-compare <etag_file>

路You can skip adding the extra -H "Content-Type: application/json" when getting or posting JSON data (with -d), by specifying --json instead of just -d

路You can create JSON easily from the command line with the tool jo: https://github.com/jpmens/jo (basically a reverse jq)

Rant about yaml. And something I learned about yaml some months ago and forgot about it: scalars for making multiline work in yaml.

Elixir: a programming language based on Erlang. Really impressive reports! But still I would like to learn golang (if I ever learn properly python 馃檪

git stash: I didnt know about this git command until last week, very handy.