Microsoft has unveiled Maia 200, the company’s latest in-house AI chip, designed to speed up AI inference and improve efficiency in its data centers. Maia 200 is aimed at powering Microsoft services more efficiently while also reducing the company’s dependence on Nvidia’s hardware and software ecosystem.
Maia 200 is focused on inference, meaning it is built for running AI models in real-world applications after they are trained. Microsoft says the chip’s design combines high compute performance with a newly designed memory system and a scalable networking approach intended to support large clusters.
Where Maia 200 is being deployed
Microsoft says Maia 200 will initially be deployed in U.S. regions of the Microsoft Azure cloud. CNBC reported that Microsoft is equipping its Central U.S. data center region with Maia 200 chips and plans to bring them next to the U.S. West 3 region, followed by additional sites.
Bloomberg reported that the chip is making its way to Microsoft data centers in Iowa, with deployments headed to the Phoenix area next. MarketScreener likewise said the chip is already up and running at an Iowa data center, with a second deployment site planned in Arizona in the coming months.
Performance and architecture details
Microsoft’s EMEA news release lists performance targets of more than 10 petaFLOPS at 4-bit precision (FP4) and more than 5 petaFLOPS at 8-bit precision (FP8), built on a 3-nanometer process. The same release says the networking design scales over standard Ethernet to clusters of up to 6,144 AI accelerators.
CNBC also reported that the chips use Taiwan Semiconductor Manufacturing Co.’s 3-nanometer process and that four chips are connected together inside each server. Times of India similarly reported four chips per server, adding that the system relies on Ethernet cables rather than the InfiniBand standard.
Microsoft describes Maia 200 as designed for compute-intensive AI inference and says it integrates into Microsoft Azure. Microsoft also says a single Maia 200 can run today’s largest AI models while leaving headroom for larger models in the future.
Cost, efficiency, and Microsoft’s software push
Microsoft has positioned Maia 200 as a more cost-effective option for inference at scale. Microsoft’s EMEA release says Maia 200 is its most efficient inference system deployed so far and claims 30% better performance per dollar than the latest-generation hardware in its fleet.
CNBC reported that Microsoft said the chip delivers 30% greater performance than alternatives at the same price point. Times of India also reported that Microsoft’s executive vice president for cloud and AI, Scott Guthrie, described Maia 200 as the most efficient inference system Microsoft has deployed.
Beyond the chip itself, MarketScreener reported that Microsoft introduced Maia 200 alongside a software suite designed to compete with Nvidia’s CUDA environment, which has been a major advantage for Nvidia with developers. Bloomberg reported that Microsoft invited developers to start using Maia’s control software, while noting it was not clear when Azure cloud service users would be able to use servers running on the chip.
How Microsoft plans to use Maia 200
Microsoft says Maia 200 will be used for AI models from the Microsoft Superintelligence team. CNBC reported that Microsoft’s superintelligence team is headed by Mustafa Suleyman and that Maia 200 will support Microsoft 365 Copilot as well as the Microsoft Foundry service focused on improving AI models.
Microsoft’s EMEA announcement also says Maia 200 will accelerate projects including Azure AI Foundry, which it describes as an integrated and interoperable platform for developing AI applications and agents, and it will support Microsoft 365 Copilot. CNBC further reported that developers, academics, AI labs, and people contributing to open-source AI projects can request access to a preview of a software development kit.
Times of India likewise reported that those groups can apply for the software development kit preview. CNBC also noted that Microsoft’s earlier Maia 100 chip was never made available for cloud customers to lease, while a Microsoft blog update said the new chip would eventually be made more broadly accessible to customers.
