Signal | IEEE Spectrum

Tensordyne Tapes Out a Logarithmic Number System AI Chip, Claims 10x Lower Power Than GPU Inference

Tensordyne taped out its first inference chip, which uses logarithmic number system arithmetic instead of IEEE floating-point, and claims an order-of-magnitude power reduction versus GPU inference -- a bet that the dominant compute primitive for AI is the wrong one.

#ai-hardware #semiconductor

Tensordyne just taped out an inference chip that does not use floating-point arithmetic. It uses a Logarithmic Number System (LNS), in which each value is stored as the logarithm of its magnitude, so multiplications become integer additions and the hardware area and power budget required to implement a multiply-accumulate unit drop dramatically. The company claims an order-of-magnitude improvement in power per token compared with leading GPU inference alternatives.
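To make the "multiplication becomes addition" point concrete, here is a minimal Python sketch of LNS arithmetic. It is an illustration under stated assumptions, not Tensordyne's format: the 8 fractional bits, the `(sign, code)` representation, and every function name are hypothetical. It also shows the usual catch in LNS designs, which is that addition needs a correction term that hardware typically takes from a small lookup table or an approximation.

```python
import math
from typing import Tuple

# Assumption for illustration: log2 magnitudes kept as fixed-point integers
# with 8 fractional bits. The source does not describe Tensordyne's format.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

LNS = Tuple[int, int]  # (sign, fixed-point log2 of the magnitude)


def lns_encode(x: float) -> LNS:
    """Encode a nonzero real number as (sign, round(log2|x| * SCALE))."""
    if x == 0.0:
        raise ValueError("zero needs a dedicated flag in LNS; omitted in this sketch")
    sign = -1 if x < 0 else 1
    return sign, round(math.log2(abs(x)) * SCALE)


def lns_decode(v: LNS) -> float:
    """Convert an LNS value back to an ordinary float."""
    sign, code = v
    return sign * 2.0 ** (code / SCALE)


def lns_mul(a: LNS, b: LNS) -> LNS:
    """Multiply: signs multiply, log codes add as plain integers."""
    return a[0] * b[0], a[1] + b[1]


def lns_add_same_sign(a: LNS, b: LNS) -> LNS:
    """Add two values of equal sign using
    log2(2^p + 2^q) = max(p, q) + log2(1 + 2^-(|p - q|)).
    The correction term is computed with math.log2 here; hardware would
    typically use a lookup table or an approximation instead.
    """
    if a[0] != b[0]:
        raise ValueError("mixed-sign addition (subtraction) is omitted in this sketch")
    hi, lo = max(a[1], b[1]), min(a[1], b[1])
    d = hi - lo
    correction = round(math.log2(1.0 + 2.0 ** (-d / SCALE)) * SCALE)
    return a[0], hi + correction


if __name__ == "__main__":
    product = lns_mul(lns_encode(8.0), lns_encode(0.5))
    print(lns_decode(product))  # exactly 4.0: both operands are powers of two

    approx = lns_mul(lns_encode(3.0), lns_encode(-5.0))
    print(lns_decode(approx))   # close to -15.0, with a small quantization error
```

The multiply path is a single integer add, which is where the area and power argument comes from. The accumulate path is the harder part, so real savings depend on how cheaply the correction term is implemented.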

The constraint being removed is not just power consumption. It is the assumption that IEEE 754 floating-point is the right primitive for AI inference at all. Tensordyne's argument, which has theoretical grounding going back decades in signal processing and is increasingly supported by reduced-precision results from quantization research, is that modern transformers do not need the dynamic range or precision that floating-point provides. The company's case is that LNS handles the extreme value distributions in attention weights natively, without the costly exception logic that inflates GPU die area. If working silicon validates the claims, LNS moves from a research curiosity to a credible threat to the standard inference stack.
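One way to see why a logarithmic format copes with values spread over many orders of magnitude: rounding in the log domain gives a relative error bound that does not depend on the size of the value. The sketch below checks this numerically. The 8 fractional bits and the function names are assumptions chosen for illustration, not details of Tensordyne's design.

```python
import math

# Assumption for illustration: log2 magnitudes rounded to 8 fractional bits.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

# Rounding log2(x) to the nearest 1/SCALE moves it by at most 0.5/SCALE,
# so the relative error of the decoded value is at most 2^(0.5/SCALE) - 1.
WORST_CASE = 2.0 ** (0.5 / SCALE) - 1.0


def lns_roundtrip(x: float) -> float:
    """Quantize a positive value in the log domain and decode it again."""
    code = round(math.log2(x) * SCALE)
    return 2.0 ** (code / SCALE)


def relative_error(x: float) -> float:
    return abs(lns_roundtrip(x) - x) / x


# Sample magnitudes spanning twelve orders of magnitude.
samples = [1e-6, 1e-3, 1.0, 1e3, 1e6]
errors = {x: relative_error(x) for x in samples}

if __name__ == "__main__":
    print(f"bound: {WORST_CASE:.6f}")
    for x, e in errors.items():
        print(f"x={x:g}  relative error={e:.6f}")
```

Each sample stays within the same bound regardless of magnitude, so the precision is set by the number of fractional bits and the range by the number of integer bits.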

The immediate questions are silicon yield and software compatibility. LNS arithmetic requires a different compiler path, and today's inference stacks (vLLM, TGI, TensorRT-LLM) assume floating-point or integer-quantized compute. Tensordyne will need to ship an abstraction layer that makes the hardware look like existing deployment targets, or attract workloads with enough inference-per-watt pressure -- at the edge or in co-location deployments where power costs dominate -- to justify a toolchain port. If the company clears silicon in 2026, expect the GPU inference pricing model to face a new reference point within 18 months.