Infiniband Essentials

NVIDIA provides this course for free, although I am surprised that there is not much “free” documentation about this technology. I wish they followed the same path as most networking vendors, who want you to learn their technology without many barriers. And it is quite pathetic that you can’t really find books about it…

The course is very, very high level and very, very short. So I didn't become an Infiniband CCIE…

  • Intro to IB

— Elements of IB: IB switches, the Subnet Manager (it is like an SDN controller), hosts (clients), adapters (NICs), gateways (convert IB <> Ethernet) and IB routers.

  • Key features

— Simplified mgmt: because of the Subnet Manager

— High bw: up to 400G

— CPU offload: RDMA, bypassing the OS.

— Ultra low latency: 1us host to host.

— Network scale-out: 48k nodes in a single subnet. You can connect subnets using an IB router.

— QoS: achieve loss-less flows.

— Fabric resilience: Fast-ReRouting at switch level takes 1ms compared with 5s using Traffic Manager => Self-Healing

— Optimal load-balancing: using AR (adaptive routing). Rebalance packets and flows.

— MPI super performance (SHARP – scalable hierarchical aggregation and reduction protocol): offloads collective operations from CPU/GPU to the switches -> fewer retransmissions from end hosts -> less data sent. I don't really understand this.

— Variety of supported topologies: fat-tree, dragonfly+, torus, hypercube and hyperx.
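
To get my head around the SHARP idea, here is a toy sketch in Python. It is my own illustration, not NVIDIA's implementation or API. It assumes a made-up two-level tree (hosts -> leaf switches -> one spine) where each switch sums what it receives and forwards a single partial result up. The function names are hypothetical.

```python
# Toy model of in-network reduction (the idea behind SHARP), not a real API.
# Assumed topology: hosts -> leaf switches -> one spine switch.

def reduce_in_switch(values):
    """A switch sums whatever arrives on its ports and forwards ONE value up."""
    return sum(values)


def allreduce_sum(hosts_per_leaf):
    """hosts_per_leaf: one inner list of host values per leaf switch."""
    # Each leaf switch aggregates the contributions of its own hosts.
    partials = [reduce_in_switch(vals) for vals in hosts_per_leaf]
    # The spine aggregates one partial per leaf instead of one value per host.
    total = reduce_in_switch(partials)
    # Compare how many values cross the leaf -> spine uplinks in each case.
    uplink_values_with_aggregation = len(partials)
    uplink_values_without = sum(len(vals) for vals in hosts_per_leaf)
    return total, uplink_values_with_aggregation, uplink_values_without


if __name__ == "__main__":
    # 8 hosts behind 2 leaf switches
    print(allreduce_sum([[1, 2, 3, 4], [5, 6, 7, 8]]))  # (36, 2, 8)
```

With 8 hosts behind 2 leaves, only 2 values go up to the spine instead of 8, which is how I read the "less data sent" claim.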

  • Architecture:

— Similar layers to the OSI model: application, transport, network, link and physical.

— In IB, applications talk directly to the NIC, bypassing the OS.

— Upper layer protocols:

    — MPI: Message Passing Interface

    — NCCL: NVIDIA Collective Communication Library

    — iSER: iSCSI Extensions for RDMA (RDMA storage protocol).

    — IPoIB: IP over IB

— Transport Layer: different from TCP/IP, it creates an end-to-end virtual channel between applications (source and destination), bypassing the OS at both ends.

— Network Layer: This is mainly used by IB routers to connect IB subnets. Routers use the GID (global ID) as the identifier for source and destination.

— Link Layer: each node is identified by a LID (local ID), managed by the Subnet Manager. Each switch has a forwarding table with “port vs LID” <- generated by the Subnet Manager. Flow control provides loss-less connections.

— Physical Layer: Support for copper (DAC) and optical (AOC) cables.
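
The link layer description above can be sketched as a toy "LID -> port" table in Python. This is only my illustration of the idea: the `ToySwitch` class and its methods are hypothetical, and in a real fabric the Subnet Manager computes and programs the entries. The unicast LID range (0x0001 to 0xBFFF) is, as far as I know, what the IB spec defines, and it would explain the "48k nodes per subnet" figure.

```python
# Toy model of an IB switch forwarding table (destination LID -> output port).
# Hypothetical names; a real Subnet Manager programs this in the switch.

UNICAST_LID_MIN = 0x0001
UNICAST_LID_MAX = 0xBFFF  # 49151 unicast LIDs, roughly the "48k nodes" limit


class ToySwitch:
    def __init__(self, name):
        self.name = name
        self.lft = {}  # destination LID -> output port

    def program(self, lid, port):
        """What the Subnet Manager would do: install one LID -> port entry."""
        if not UNICAST_LID_MIN <= lid <= UNICAST_LID_MAX:
            raise ValueError(f"LID {lid:#06x} is outside the unicast range")
        self.lft[lid] = port

    def forward(self, dest_lid):
        """Look up the output port for a packet's destination LID."""
        return self.lft[dest_lid]


if __name__ == "__main__":
    leaf = ToySwitch("leaf1")
    leaf.program(lid=5, port=3)   # host with LID 5 sits behind port 3
    leaf.program(lid=9, port=12)  # host with LID 9 sits behind port 12
    print(leaf.forward(5))        # 3
```

The switch itself does no route computation here, it only does a lookup, which matches the "Subnet Manager is like an SDN controller" idea.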

AI Supercomputer – NVLink

So NVIDIA has an AI supercomputer via this. Meta, Google and MS are making comments about it. And based on this, it is a 24-rack setup using the 900GBps NVLink-C2C interface, so no ethernet and no infiniband. Here, there is a bit more info about NVLink:

NVLink Switch System forms a two-level, non-blocking, fat-tree NVLink fabric to fully connect 256 Grace Hopper Superchips in a DGX GH200 system. Every GPU in DGX GH200 can access the memory of other GPUs and extended GPU memory of all NVIDIA Grace CPUs at 900 GBps. 

This is the official page for NVLink, but with only the above I understood that this is like a “new” switching infrastructure.

But it looks like if you want to connect those supercomputers together, you need to use Infiniband. And again, power/cooling is an important subject.

Jamaican Rum Cake

I have been lucky to try some Jamaican Rum Cake brought from Jamaica, so I decided to see if I could make it myself. I found some recipes online like this (my main source) and this.

Ingredients:

  • 200g butter at room temperature + a bit for greasing
  • 1 cup of brown sugar
  • 4 eggs
  • 1 tbsp lime juice
  • 1 tbsp lime zest
  • 1 cup of blended fruits: raisins, cherries, mixed fruit, etc. Pre-soak the fruit in water and a bit of white rum.
  • 1 tsp vanilla paste
  • 1 tbsp almond liquor
  • 1/4 cup of white rum + a bit for brushing
  • 1/2 cup of Port wine (I don't have red label / sweet red wine)
  • 1 cup plain flour
  • 1 tbsp cinnamon
  • 1 tbsp mixed spice
  • 1 tbsp grated nutmeg
  • 1.5 tbsp baking powder
  • 1/2 cup bread crumbs
  • 3 tbsp black treacle (I don't have “browning liquid”)

Process:

  • Pre-heat the oven to 180C. Grease a cake tin.
  • Cream the butter and brown sugar in a bowl. Use a wooden spoon initially and then you can use this whisk. The video uses an electric whisk but I think I managed a decent mixture. You want something creamy and fluffy.
  • In another bowl, mix the eggs, lime juice and lime zest.
  • Add the egg mix to the butter mix bit by bit, whisking constantly.
  • In another bowl, mix the blended fruit, vanilla, almond liquor, rum and Port.
  • Add the fruit mix to the butter/egg mix bit by bit, whisking constantly.
  • Clean one of the bowls. Add the flour, cinnamon, mixed spice, nutmeg, baking powder and bread crumbs. Mix well.
  • Add the flour mix to the wet mix bit by bit, mixing constantly.
  • Finally, add the black treacle, which should give the dark color to the cake. Mix well.
  • Pour the cake mix into the tin. Shake until level.
  • Put a small bowl of water in the oven or spray the oven with water to create extra moisture.
  • Bake for approx. 1h 15m. Remove from the oven only when a skewer comes out clean from the center of the cake.
  • Once you take the cake from the oven, brush it with white rum while hot.
  • Leave it to cool down for 1h.

The real thing:

My thing:

To be honest, although my version doesn't look like the original one, it was tasty. I think these were my errors:

  • I didn't soak the dry fruits so they didn't blend properly. I need to find more info about how to prepare this part properly. I think this is the reason the cake is not as “dense” as the original.
  • I think I overbaked it. I lost track of time and I think it was in for 1h 30m.
  • The black treacle doesn't give the same black color as in the video. Or do I need to put in more?
  • Use more water in the oven. The first video didn't use any but the second did, so I thought the second version was more moist and I wanted that.
  • Although I used neither Jamaican rum nor Jamaican red wine, the taste was good.