Jericho3-AI is the new chip from Broadcom aimed at taking on NVIDIA InfiniBand. From that article, I don't really understand the “Ramon3” fabric. It seems it can support 18 ports at 800G (based on 144 SerDes at 100G each). It also has 160 SerDes (16 Tb/s) for uplink to Ramon3. The goal is to reduce the time the nodes wait on the network, so it is not just port-to-port latency. Based on Broadcom's testing, swapping a 200 Gb/s InfiniBand switch for a Jericho3 is about 10% better. As well, I don't understand what they mean by “perfect load balancing”, because the flow size matters (from my point of view), and “congestion free”. Getting this to work at scale… looks interesting…
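A quick back-of-the-envelope check of that SerDes math, just my own sketch with each lane rounded to 100G (the exact lane rate and the 8-lanes-per-port mapping are my assumptions, not from Broadcom's material):

```python
# Rough sanity check of the Jericho3-AI numbers quoted above.
# Assumption: ~100G per SerDes lane and 8 lanes per 800G port.
front_serdes = 144            # SerDes facing the network ports
fabric_serdes = 160           # SerDes facing the Ramon3 fabric
lane_gbps = 100               # per-SerDes rate, rounded

lanes_per_port = 800 // lane_gbps            # 8 lanes per 800G port
ports_800g = front_serdes // lanes_per_port  # 144 / 8 = 18 ports

fabric_tbps = fabric_serdes * lane_gbps / 1000  # 160 x 100G = 16 Tb/s uplink

print(ports_800g, fabric_tbps)  # -> 18 ports at 800G, 16.0 Tb/s toward Ramon3
```

So the 18 x 800G and 16 Tb/s figures at least add up, assuming that lane-to-port split.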
But then we have the answer from NVIDIA: Spectrum-X. It is Spectrum-4 switches with BlueField-3 DPUs and software optimization. This is an Ethernet platform. Spectrum-4 looks very impressive, definitely. But this sentence puzzles me: “The world's top hyperscalers are adopting NVIDIA Spectrum-X, including industry-leading cloud innovators.” Most of the links I have been reading lately say that Azure, Meta and Google are using InfiniBand. Now NVIDIA says the top hyperscalers are adopting Spectrum-X, when Spectrum-4 only started shipping this quarter?
And finally, why is NVIDIA pushing both Ethernet and InfiniBand? I think this is a good link for that. Based on NVIDIA's CEO, InfiniBand is great and nearly “free” if you build for a very specific application (supercomputers, etc.). But for multi-tenant, you want Ethernet. So that kind of explains why hyperscalers like AWS, GCP and Azure ultimately want Ethernet, at least for customer access. At the end of the day, if you have just one (commodity) network, it is cheaper and easier to run and maintain, and you don't have vendor lock-in like with IB.
We'll see what happens with all this crazy AI/LLM/ML stuff.