Tesla TCP, Cerebras Inference, Leopold AGI race, Cursor + Sonnet, AI AWS Engineering Infra, NVLink HGX B200 and UALink, Netflix encoding challenges, Food waste snacks, career advice AWS, Thick Skin

Tesla TCP replacement: Instead of buying and spending a lot of money, build what you need. I assume there are very smart people around and real network engineering taking place. It is like a rewrite of TCP, but it doesn't break it, so your switches can still play with it. It seems the videos are not available on the Hot Chips webpage yet. And this link looks even better; it even mentions Arista as the switching vendor.

Cerebras Inference: From Hot Chips 2024. I am still blown away by the wafer-scale solution. Obviously, the presentation says its product is the best, but I wonder: can you install a “standard” Linux and run your LLM/inference that easily?

Leopold AGI race: Via LinkedIn, then the source. I read chapter 3, about the race to the Trillion-Dollar cluster. It all looks like sci-fi, but I think it may not be that far from reality.

Cursor + Sonnet: Replacement for Copilot? (original). I haven’t used Copilot, but at some point I would like to get on the bandwagon, try things and decide for myself.

AI AWS Engineering Infra: low-latency and large-scale networking (\o/), energy efficiency, security, AI chips.

NVLink HGX B200: To be honest, I always forget the concept of NVLink, and I tell myself it is an “in-server” switch to connect all GPUs in a rack. Still, this can help:

At a high level, the consortium’s goal (UltraEthernet/ UA) is to develop an open standard alternative to Nvidia’s NVLink that can be used for intra-server or inter-server high-speed connectivity between GPU/Accelerators to build scale-up AI/HPC systems. The plan is to use AMD’s interconnect (Infinity Fabric) as the baseline for this standard.

Netflix encoding challenges: From encoding per connection quality, to per-title, to per-shot. There are still challenges for live streaming. Amazon already does live streaming for sports; have they “solved” the problem? I don't use Netflix or similar, but still, the challenges and the engineering behind them are quite interesting.

Food Waste snacks: Indeed, we need more of this.

Some career advice from AWS: I “get” the point, but you still want to be up to speed (to a certain level) with new technologies. You don't want to become a dinosaur (ATM, Frame Relay, Pascal, etc).

Again, it’s not about how much you technically know but how you put into use what you know to generate amazing results for a value chain.

Get the data – be a data-driven nerd if you will – define a problem statement, demonstrate how your solution translates to real value, and fix it.

Thick Skin:

“Not taking things personally is a superpower.” –James Clear

Because “no” is normal.

Before John Paul DeJoria built his billion-dollar empire with Patrón and hair products, he hustled door-to-door selling encyclopedias. His wisdom shared at Stanford Business School on embracing rejection is pure gold (start clip at 5:06).

You see, life is a numbers game. Today’s winners often got rejected the most (but persevered). They kept taking smart shots on goal and, eventually, broke through.

Cloudflare backbone 2024, Cisco AI, Leetcode, Alibaba HPN, Altman UBI, xAI 100k GPU, Crowdstrike RCA, Github deleted data, DGX SuperPod, how ssh works, Grace Hopper Nvidia

Cloudflare backbone 2024: Everything is very high level. 500% backbone capacity increase since 2021. Use of MPLS + SR-TE. It would be interesting to see how they operate/automate that many PoPs.

Cisco AI: “three of the top four hyperscalers deploying our Ethernet AI fabric”. I assume it is Google, Microsoft and Meta? AWS is the fourth and biggest.

Huawei Cloud Monitor: I haven’t read the RD-Probe paper. I would expect a git repo with the code 🙂 It also refers to an AWS PDF and video.

Automated Leetcode: One day, I should have time to use it and learn more programming, although AI can solve them quicker than me 🙂

Alibaba Cloud HPN: linkedin, paper, AIDC material

LLM Traffic Pattern: periodic bursty flows, few flows (harder to load-balance)

Sensitive to failures: GPU, link, switch, etc

Limitations of Traditional Clos: ECMP (hash polarization) and SPOF in TORs

HPN goals:

-Scalability: up to 100k GPU

-Performance: low latency (minimum amount of hops) and maximum network utilization

-Reliability: Use two TORs with LACP from the host.

Tier1

– Use single-chip switch 51.2Tbps. They are more reliable. Dual TOR

– 1k GPUs in a segment (like nv-link) Rail-optimized network

Tier2: Eliminating load imbalance: Using dual plane. It has oversubscription

Tier3: connects several pods. Can reach 100k GPUs. Independent front-end network
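To make the "few flows" and "hash polarization" points above more concrete for myself, here is a minimal sketch in Python. It is not from the Alibaba paper: CRC32 only stands in for whatever hash a real switch ASIC uses, the 5-tuples are made up (only the source port changes), and the function names are hypothetical.

```python
import zlib
from collections import Counter


def ecmp_pick(flow, n_links, seed=0):
    """Pick an uplink by hashing the flow 5-tuple (CRC32 is an assumption)."""
    return zlib.crc32(repr(flow).encode(), seed) % n_links


def imbalance(flows, n_links, seed=0):
    """Busiest-link load divided by the ideal even share (1.0 = perfect)."""
    load = Counter()
    for flow, size in flows:
        load[ecmp_pick(flow, n_links, seed)] += size
    total = sum(size for _, size in flows)
    return max(load.values()) / (total / n_links)


def tier2_links_used(flows, n_links, tier1_seed, tier2_seed):
    """Links chosen at the tier-2 switch sitting behind tier-1 uplink 0."""
    arrived = [f for f, _ in flows if ecmp_pick(f, n_links, tier1_seed) == 0]
    return {ecmp_pick(f, n_links, tier2_seed) for f in arrived}


def make_flows(n, size):
    # Hypothetical 5-tuples: same hosts, UDP port 4791 (RoCEv2), varying source port.
    return [(("10.0.0.1", "10.0.1.1", 49152 + i, 4791, "udp"), size)
            for i in range(n)]


few_elephants = make_flows(4, 1000)  # LLM-like: few, large flows
many_mice = make_flows(4000, 1)      # classic DC traffic: many small flows

if __name__ == "__main__":
    print("imbalance, 4 big flows over 8 links:", imbalance(few_elephants, 8))
    print("imbalance, 4000 small flows over 8 links:", imbalance(many_mice, 8))
    # Same hash at both tiers: every flow that picked uplink 0 picks link 0 again.
    print("tier-2 links used, same seed:", tier2_links_used(many_mice, 4, 0, 0))
    # A different seed at tier 2 usually spreads the flows again.
    print("tier-2 links used, other seed:", tier2_links_used(many_mice, 4, 0, 7))
```

With 4 equal flows over 8 links, at least half of the links stay idle whatever the hash does, so the busiest link carries at least twice the ideal share. And when both tiers use the same hash and the same number of links, the second tier has no entropy left: that is the polarization problem, and as far as I understand it is part of why HPN goes for dual plane in Tier2.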

Altman Universal Basic Income Study: It doesn't fix all problems but, in my opinion, it helps, and it is a step in a good direction.

xAI 100k GPU cluster: 100k liquid-cooled H100s on a single RDMA fabric. It looks like Supermicro is involved for the servers and Juniper only for the front-end network. NVIDIA provides all the Ethernet switches with Spectrum-4. Very interesting. Confirmation from NVIDIA (Spectrum used = Ethernet). More details with a video.

Crowdstrike RCA:

Github access deleted data: Didn’t know about it. Interesting and scary.

Nvidia DGX SuperPod: reference architecture. video. 1 pod is 16 racks with 4 DGX each (128×8=1024 GPU per pod), 2xIB fabric: compute + storage, fat tree, rail-optimized, liquid cooling. 32k GPU fills a DC.

How SSH works: So powerful, and I am still so clueless about it

Chips and Cheese GH200: Nice analysis for Nvidia Grace CPU (ARM Neoverse) and Hopper H100 GPU

Find Love

I decided to read this book after watching this video some months ago, as I am not able to make a move in my dating life… and it has been nearly 6 years. I know I am not going to discover the holy grail of dating, but at least I can try to refresh ideas, find encouragement, you name it, to start the work.

The book is crystal clear. Get your shit together, know yourself, know what you want, know what you don't want, don't fall into certain traps, etc.

One of the things that I have picked up and that stays with me so far is the importance of having a “tribe”, a.k.a. a social network. And maybe this is not the most important point in the book. Still, I have a very small tribe; they are few, but they are the best. So I have to work on increasing my social network, and that is not just good for dating.

Something I have been doing in the last few months is going to Bachata social dancing on Wednesdays after class and on Saturday nights. It is hard for me. It is getting out of my comfort zone. But this is the only way to improve, and it is not just about improving my bachata skills. It is being comfortable being uncomfortable, knowing that you may be rejected when asking for a dance, or that after dancing once with a person, that person will not want to dance with you again (because I am not a good dancer). But step by step (literally) it is getting a bit better. There is still a long way to go, but I must carry on. Sometimes I talk to people, so that is good. I feel less weird in those moments because you are coming on your own (and I am not the only one) and it looks like everybody is having a good time and socialising.

Chapter 1

Identify and understand your attachment style: I am fearful-avoidant. But I would like to be Secure.

The village/tribe concept: Until not long ago (maybe when online dating started), your dating pool was around your social circle. It can be a tribe, a village, your neighbourhood, work, sport, etc. And the close members of that tribe will want the best for you.

Be clear about your goal in the relationship: short-term, long-term, family, etc

Chapter 2

You have to know yourself. That means working on yourself, going through your traumas, problems, etc., and healing. Then you can start dating properly, as you will have a better vision (less noise). Have an open mind and be a lifelong learner to be your best self.

Soulmates are made not found.

Chapter 3

1st Be happy with yourself + self awareness -> successful relationship

=>

Good relationship -> makes you happier/healthier

Chapter 4

We are living in a changing world, so we have to adapt and find the best approach to find our partner. And that includes online dating.

Chapter 5

This is basic statistics and probability. The more people you meet, the more chances you will have to find a partner. This is your job. And a way to increase your social reach is to use “those” weak connections (a friend you don't see often, a place, etc).

Chapter 6

Say what you want, be intentional. First impressions are important, so make a strong one (I am going to struggle here). Work on your “social capital”: identify the things in your life you are passionate about and work towards becoming exceptional at them. And the important thing is not the goal, it is the journey.

Chapter 7: Green Flags

What you want and what you need are not the same. This applies to dating too. So make rational decisions about the criteria for your partner. There is a reference to Gary Chapman’s “The 5 Love Languages”. It is important to identify them, and they don't have to match. But they are not everything.

Green Flags:

  • Emotional Fitness
  • Courageous Vision
  • Resilient Resourcefulness
  • Open-minded Understanding
  • Compassionate Support

Summary here: Choose someone who matches your values, who will challenge you to grow and who has the character to be there for the good and the bad times.

Chapter 8: Red Flags

Value your self-worth and refuse to settle for anything less than you deserve. That means you have to be very honest with yourself and do the work.

Red Flags: Narcissism, Psychopathy, Machiavellianism and Sadism

Chapter 9: Commitment

Four elements of commitment readiness: trust, effective conflict resolution, high relationship satisfaction and not thinking there are better options available (you will be forever in the “game”)

Three things that make a relationship work in the long term:

  • No Defensiveness, No Stonewalling, No Criticism, No Contempt
  • Relationship equity
  • Feeling you are becoming better, your best self

And you need to talk about all the above with your partner -> that means assertiveness and courage.