The big picture: Nvidia holds anywhere from 75 to 90 percent of the AI chip market. Despite, or rather because of, this market dominance, big tech rivals and hyperscalers have been developing their own hardware and accelerators to chip away at the company's AI empire. Microsoft is now ready to show the details of its first custom AI chip.
Microsoft unveiled its first AI accelerator, Maia 100, at this year's Hot Chips conference. It features an architecture that uses custom server boards, racks, and software to deliver cost-effective, enhanced capabilities and performance for AI workloads. Redmond designed the custom accelerator to run OpenAI models in its own Azure data centers.
The chips are built on TSMC's 5nm process node and are provisioned as 500W parts, but they can support up to a 700W TDP.
Maia's design can deliver high levels of overall performance while efficiently managing the total power draw of its targeted workloads. The accelerator also features 64GB of HBM2E, a step down from the Nvidia H100's 80GB and the B200's 192GB of HBM3E.
According to Microsoft, the Maia 100 SoC architecture features a high-speed tensor unit (16xRx16) offering rapid processing for training and inferencing while supporting a wide range of data types, including low-precision types such as Microsoft's MX format.
It has a loosely coupled superscalar engine (vector processor) built with a custom ISA to support data types including FP32 and BF16, a Direct Memory Access engine supporting different tensor sharding schemes, and hardware semaphores that enable asynchronous programming.
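Microsoft has not detailed MX internals here, but the general microscaling idea behind such formats (as described in OCP's open MX specification, which Microsoft co-authored) is that a small block of tensor values shares a single power-of-two scale, while each element is stored in a narrow low-precision format. The stdlib-only Python sketch below illustrates that block-scaling concept only; the block size, the integer-mantissa stand-in for a narrow float element, and all names are illustrative assumptions, not Maia's actual format:

```python
import math

BLOCK_SIZE = 32  # the OCP MX spec groups elements into blocks of 32

def quantize_block(values, mant_max=127):
    """Quantize one block with a single shared power-of-two scale.

    This mimics microscaling: one scale per block, narrow elements.
    A signed integer mantissa in [-mant_max, mant_max] stands in for
    a real FP8/FP6/FP4 element format for clarity.
    """
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0] * len(values)
    # Pick a power-of-two exponent so the largest |value| fits the
    # mantissa range; clamp below handles the rounding edge case.
    shared_exp = math.floor(math.log2(amax)) - math.floor(math.log2(mant_max))
    scale = 2.0 ** shared_exp
    q = [max(-mant_max, min(mant_max, round(v / scale))) for v in values]
    return shared_exp, q

def dequantize_block(shared_exp, q):
    """Reconstruct approximate values from the shared exponent and block."""
    scale = 2.0 ** shared_exp
    return [x * scale for x in q]
```

Because the shared scale is a pure power of two, values that are already small dyadic multiples survive the round trip exactly; in general, the scheme trades per-element precision for a much smaller memory footprint per block.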
The Maia 100 AI accelerator also provides developers with the Maia SDK. The kit includes tools that enable AI developers to quickly port models previously written in PyTorch and Triton.
The SDK includes framework integration, developer tools, two programming models, and compilers. It also has optimized compute and communication kernels, the Maia Host/Device Runtime, and a hardware abstraction layer supporting memory allocation, kernel launches, scheduling, and device management.
Microsoft has provided more information on the SDK, Maia's backend network protocol, and optimization in its Inside Maia 100 blog post. It makes for a worthwhile read for developers and AI enthusiasts.