hw.dev

SiFive P570 Gen 3 Closes the AI Performance Gap for Open RISC-V Silicon

SiFive's third-generation P570 shows 21x AI workload gains over its Gen 1 baseline using a mandatory 128-bit VLEN vector pipeline under RVA23, pushing open RISC-V silicon into territory that previously required proprietary ISA compute blocks.

#risc-v #ai-hardware #embedded #tools

SiFive's Performance P570 Gen 3 benchmarks a 21x AI workload gain over the first-generation baseline, and the mechanism is not a bolted-on accelerator. It is the 128-bit VLEN vector pipeline running under mandatory RVA23 ISA semantics instead of optional V-extension probing. That distinction matters more than the number: optional extensions mean software has to detect, branch, and carry fallback paths, while RVA23 pins the profile, so Linux distributions, compilers, and ML frameworks can target VLEN=128 as a baseline without runtime dispatch. The Gen 3 also delivers a 7-13% scalar SPECint improvement and a 13% power reduction over the P550 Gen 1, though those are secondary to the portability shift.

The 21x gain comes from removing what constrained first-generation RISC-V vector workloads: per-element scalar fallbacks and inconsistent VLEN across implementations. P570 Gen 3 pairs the mandatory RVA23 profile with the Zvkng, Zvksg, Zicfilp, and Zicfiss extensions, making cryptographic acceleration and control-flow integrity first-class primitives rather than options the toolchain has to probe for. An ML runtime targeting RVA23 with these extensions runs on the P570 and on every other conformant RVA23 implementation without a recompile. That portability removes the "proprietary ISA for compute, RISC-V for control" compromise that still drives most heterogeneous SoC partitioning decisions today.

The assumption that collapses here is that RISC-V compute performance requires a closed AI accelerator subsystem alongside it. Hardware teams sourcing SoC IP in 2026 should run the math: if the open vector compute block delivers 21x AI throughput against its own baseline, the incremental cost of the closed accelerator subsystem (licensing, integration, toolchain fragmentation, long-term vendor dependency) is now the variable to justify. The teams still paying for a proprietary ML cluster on top of RISC-V control cores have 12-18 months before the RVA23 ecosystem proves or disproves the crossover at production scale.