Tensordyne Napier AI Processor Announced with Logarithmic Math
ServeTheHome — tensordyne Napier is a new AI accelerator targeting the math behind AI inference and its own 72 accelerator architecture The post Tensordyne Napier AI Processor Announced with Logarithmic Math
For now, Napier is still a taped-out chip and 2027 system roadmap, so the big question is whether the performance and software claims survive contact with real deployments. Tensordyne is positioning Napier as a way to attack both the speed and the cost of AI inference. Instead of building only around more conventional matrix-multiply resources, the company says its logarithmic math approach turns multiplication operations into additions. Adders are smaller and generally lower-power than multipliers, so the promise is more useful silicon area for memory and better system balance. To that end, it is announcing an ecosystem to not just have a chip, but a cluster architecture. That matters because a lot of today’s AI infrastructure discussion is no longer just about peak accelerator TOPS or FLOPS. Long-context inference, agentic workflows, and mixture-of-experts models can become constrained by memory, interconnect, decode throughput, rack power, and cooling. Tensordyne’s argument is that a more balanced chip and rack design can deliver more tokens per rack and more tokens per megawatt than current high-end alternatives. Tensordyne compares its TDN72 cluster against larger multi-rack configurations for two-trillion-parameter GPT MoE models. In that comparison, the company says one 120kW TDN72 rack can reach 1,300 tokens per second per user, while NVIDIA and Groq require nine racks and 1.5MW, and AWS plus Cerebras require fourteen racks and 800kW. Those comparisons are attention-grabbing, but Tensordyne is announcing a product at this point. A full TDN72 system is designed around 72 accelerators, 68 petaflops of total compute, and 42TB of HBM. Tensordyne says its capacity is aimed at models with up to 10 trillion to 20 trillion parameters, where the memory footprint and expert routing become major system-level challenges. This is also where rack-scale design matters, since simply adding accelerators does not help if the interconnect, memory, power, or cooling infrastructure becomes the limiting factor. Napier itself is a 3nm TSMC chip with 138 billion transistors.