hw.dev
Signal · The Next Platform

Broadcom's 3.5D XDSiP Cuts Chiplet Interconnect Power 15-25x as Fujitsu Monaka Samples Return

Broadcom's 3.5D XDSiP collapses chiplet interconnect energy from 3-5 pJ/bit off-chip to under 0.2 pJ/bit, and Fujitsu is confirmed as a real customer: samples of its 144-core Monaka Arm server chip came back from Broadcom's packaging process in February.

#chiplets #ai-hardware #semiconductor

Broadcom's 3.5D Extreme Dimension System in Package (XDSiP) cuts die-to-die interconnect energy to under 0.2 picojoules per bit, against 3-5 pJ/bit for off-chip SerDes links over motherboard traces. At a given bandwidth, that is a 15-25x reduction in the power spent moving data between compute chiplets and HBM stacks. By stacking multiple compute dies vertically together with up to 12 HBM stacks, the technology collapses what was a 4-card compute cluster onto a single socket. First samples of Fujitsu's Monaka chip came back from Broadcom's packaging line in late February: 144 Arm cores, mixed 2nm and 5nm chiplets, targeting server workloads for a 2027 launch.
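The 15-25x figure follows directly from the quoted energy-per-bit numbers, and the same numbers convert to watts once a bandwidth is chosen. The sketch below works through that arithmetic. The function names and the 1 TB/s bandwidth are illustrative assumptions, not figures from Broadcom or Fujitsu, and 0.2 pJ/bit is treated as an upper bound because the source says "under".

```python
# Back-of-envelope link power from energy per bit.
# Energy figures come from the article; the bandwidth is a hypothetical
# value chosen only to make the arithmetic concrete.

OFF_CHIP_PJ_PER_BIT = (3.0, 5.0)   # off-chip SerDes over motherboard traces
STACKED_PJ_PER_BIT = 0.2           # upper bound quoted for 3.5D XDSiP


def reduction_factor(baseline_pj_per_bit: float, stacked_pj_per_bit: float) -> float:
    """Ratio of baseline to stacked energy per bit."""
    return baseline_pj_per_bit / stacked_pj_per_bit


def link_power_watts(pj_per_bit: float, bandwidth_bytes_per_s: float) -> float:
    """Power drawn by data movement: (joules per bit) * (bits per second)."""
    bits_per_s = bandwidth_bytes_per_s * 8
    return pj_per_bit * 1e-12 * bits_per_s


if __name__ == "__main__":
    assumed_bandwidth = 1e12  # 1 TB/s, hypothetical sustained traffic
    for baseline in OFF_CHIP_PJ_PER_BIT:
        factor = reduction_factor(baseline, STACKED_PJ_PER_BIT)
        watts = link_power_watts(baseline, assumed_bandwidth)
        print(f"{baseline} pJ/bit off-chip: {watts:.1f} W, {factor:.0f}x the stacked link")
    stacked_watts = link_power_watts(STACKED_PJ_PER_BIT, assumed_bandwidth)
    print(f"{STACKED_PJ_PER_BIT} pJ/bit stacked: {stacked_watts:.1f} W")
```

Under the assumed 1 TB/s of traffic, the off-chip link would burn 24-40 W on data movement alone, against at most 1.6 W for the stacked link. The ratio is independent of the bandwidth chosen; only the absolute wattage changes.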

The mechanism is geometry. Off-chip signals cross centimeters of lossy PCB trace and need SerDes encoding to survive the trip. Die-to-die links inside a 3.5D package cross microns with no serialization overhead. Shorter distance means less RC loss, less signal conditioning, less power. The same principle applies to latency: for workloads dominated by attention computation or KV cache accesses, shaving 50-100ns off every HBM round-trip compounds across millions of tokens.
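To see how a per-access saving compounds, the sketch below multiplies the quoted 50-100ns saving by a count of round-trips. It assumes the round-trips are serialized (none of the latency is hidden by overlap or prefetching), which overstates the benefit for well-pipelined workloads. The token count and the round-trips-per-token value are hypothetical placeholders, not measurements.

```python
# Cumulative latency saved when every HBM round-trip gets shorter.
# Assumes fully serialized accesses; real workloads overlap many of them,
# so treat the result as an upper bound.


def total_time_saved_s(saving_ns: float, tokens: int, round_trips_per_token: int) -> float:
    """Seconds saved across all tokens for a fixed per-round-trip saving."""
    return saving_ns * 1e-9 * tokens * round_trips_per_token


if __name__ == "__main__":
    tokens = 1_000_000            # hypothetical workload size
    round_trips_per_token = 1000  # hypothetical dependent HBM accesses per token
    for saving_ns in (50.0, 100.0):
        saved = total_time_saved_s(saving_ns, tokens, round_trips_per_token)
        print(f"{saving_ns:.0f} ns per round-trip: {saved:.0f} s saved in total")
```

With these placeholder values the saving is 50-100 seconds of serialized wait time per million tokens. The absolute number depends entirely on the assumed access count, but the linear scaling is the point: a fixed per-access saving grows with every token processed.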

Vendors shipping flat 2D chiplet designs on a silicon interposer will face a 15-25x interconnect power disadvantage on data movement relative to a 3.5D stacked design at the same compute density. That gap does not kill 2D chiplets for all applications, but it draws a hard line: any workload whose ceiling is memory bandwidth utilization belongs in a 3.5D socket within three years. Monaka in 2027 is the first credible production proof point that this packaging approach works at 144-core server scale.