Analysis of Microsoft Datacenter Silicon announcements at Ignite 2023

Microsoft kicked off its annual Ignite event by announcing two silicon products that will be deployed in its cloud platform, Azure, and across some of its SaaS properties. The Azure Maia 100 AI Accelerator is an application-specific integrated circuit (ASIC) targeting AI training and inference workloads. The Azure Cobalt 100 CPU is a general-purpose cloud chip built on the Arm architecture. Both chips extend the trend of in-house silicon design, which lets cloud providers tailor chips to their own data centers and technology stacks.

This article was co-written by Patrick Moorhead, CEO and Chief Analyst at Moor Insights & Strategy. Let’s dive into the details of Microsoft’s announcement in the next few sections.

Full disclosure: Microsoft has a paid analytics engagement with Moor Insights & Strategy, as do Amazon Web Services, Google Cloud, Oracle Cloud, IBM Cloud, Intel, Nvidia, and AMD.

Is custom silicon part of a larger cloud trend?

Custom silicon is not the future of the cloud; it is the present. Like rivals AWS and Google Cloud, Microsoft is trying to offer AI-as-a-service that is faster and cheaper than its competitors', or at least to reach parity. We've believed for some time that Azure, and thus Microsoft SaaS services like Microsoft 365 and Dynamics 365, were at a strategic disadvantage without efficient, performance-driven custom silicon.

Microsoft needed better control over the entire stack, from racks and servers to silicon, operating system, and software. The company has already made several improvements across that stack; data center silicon was probably the last major optimization knob to turn.

Microsoft's Arm journey began a long time ago

Although significant, the Ignite announcements shouldn't come as a shock to anyone who follows Microsoft, the cloud, or silicon in general, for all the reasons outlined above. In particular, the company has been courting Arm for some time in both the client and server segments and has spent significant resources getting the Windows operating system to run well on Arm chips.

While the open source community has long supported the Arm architecture, two events accelerated its adoption: the launch of Arm's Neoverse as a datacenter-focused architecture and AWS's acquisition of Annapurna Labs. The first-generation Neoverse was important because it demonstrated Arm's commitment to the data center. AWS then harnessed the Annapurna technology to launch Nitro as a way to offload functionality that would otherwise steal valuable resources from the expensive host processor. AWS's launch of the Graviton CPU, also developed with Annapurna expertise, elevated Arm to first-class-citizen status as a general-purpose CPU in the open source community. When the largest cloud provider deploys an architecture at scale, independent software vendors and contributors to open source projects take notice.

Graviton's success has certainly motivated Microsoft to accelerate its Arm strategy. In 2022, the company announced a partnership with Ampere, the Arm CPU vendor, to deploy Ampere's CPUs in Azure to support the scaling of cloud-native workloads. Fifteen months later, it began deploying its own silicon.

Maia 100 deep dive

Maia is an ASIC designed to support training and inference of large language models. It is built on TSMC's 5nm process and supports sub-8-bit data types based on the open MX standard, which Microsoft developed in partnership with other silicon players (Nvidia, AMD, Intel, Qualcomm, Arm, and Meta) through the Open Compute Project. That collaboration should enable faster hardware development and faster AI training and inference.
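
To make the sub-8-bit MX idea more concrete, here is a minimal sketch of block-wise microscaling in Python. It is a simplification rather than Maia's actual implementation: the real MX formats (such as MXFP8 and MXFP4) pair an 8-bit shared exponent with narrow floating-point elements, whereas this sketch uses signed integers for brevity.

```python
import numpy as np

def mx_quantize(block: np.ndarray, elem_bits: int = 8):
    """Quantize one block to a shared power-of-two scale plus narrow
    integer elements -- a simplified stand-in for MX microscaling."""
    max_int = 2 ** (elem_bits - 1) - 1          # e.g., 127 for 8 bits
    amax = np.abs(block).max()
    # Shared scale: smallest power of two that makes the largest value fit.
    shared_exp = int(np.ceil(np.log2(amax / max_int))) if amax > 0 else 0
    elems = np.clip(np.round(block / 2.0 ** shared_exp),
                    -max_int, max_int).astype(np.int8)
    return shared_exp, elems

def mx_dequantize(shared_exp: int, elems: np.ndarray) -> np.ndarray:
    """Recover approximate float values from the shared exponent and elements."""
    return elems.astype(np.float32) * 2.0 ** shared_exp

# A 32-element block, matching the block size the MX specification uses.
block = np.random.randn(32).astype(np.float32)
exp, q = mx_quantize(block)
print("shared exponent:", exp)
print("max abs error:", np.abs(block - mx_dequantize(exp, q)).max())
```

The design point worth noting is that only one exponent is stored per block, so the per-element storage can shrink well below 8 bits while the shared scale preserves dynamic range.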

Maia has been tested using OpenAI's GPT-3.5 and is currently being trialed with Bing Chat and GitHub Copilot. Although performance numbers have not been released, the company said it is focused on delivering compelling benefits in terms of performance per dollar and total cost of ownership.

Microsoft's competitors have already deployed their own ASIC hardware for AI. AWS launched its Trainium chip in 2022 after rolling out the second version of Inferentia, an AI inference chip it first introduced in 2018. Meanwhile, Google's Tensor Processing Unit (TPU) has been available to customers since 2018.

Azure can deploy up to four Maia accelerators in a server. Although we don't yet have a performance baseline, this appears to be a significant footprint. To support this configuration, Microsoft developed Sidekicks, a liquid cooling solution that can be installed quickly on existing racks without what Microsoft considers a major retrofit.

Cobalt 100 deep dive

Microsoft's Cobalt 100 CPU is a 128-core, single-threaded chip that implements the Arm instruction set and is designed specifically for cloud-native and other workloads running in Azure. As of the announcement, Cobalt already powers Teams, Azure SQL, and other services running on first-party (1P) servers. Like Maia, the chip is built on TSMC's 5nm process and is designed to deliver the best performance per dollar. Microsoft claims up to 40% better performance per dollar compared to its current Arm deployment with Ampere; note that the comparison is with the first-generation Ampere part, not the newer AmpereOne product.
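
To illustrate what a performance-per-dollar claim of this kind means, here is a back-of-the-envelope calculation in Python. The throughput and price figures are entirely hypothetical, chosen only to show the arithmetic behind a 40% figure; they are not Microsoft's numbers.

```python
# Hypothetical VM figures chosen only to illustrate the arithmetic;
# these are not Microsoft's published numbers.
baseline = {"requests_per_sec": 1000, "cost_per_hour": 1.00}  # incumbent Arm VM
cobalt = {"requests_per_sec": 1260, "cost_per_hour": 0.90}    # hypothetical Cobalt VM

def perf_per_dollar(vm: dict) -> float:
    # Work done per dollar spent: throughput divided by hourly price.
    return vm["requests_per_sec"] / vm["cost_per_hour"]

gain = perf_per_dollar(cobalt) / perf_per_dollar(baseline) - 1
print(f"Performance-per-dollar gain: {gain:.0%}")  # prints 40%
```

Note that a gain like this can come from higher throughput, a lower price, or both, which is why such claims say little about raw performance on their own.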

We expect Microsoft to deploy Cobalt widely and quickly. Because the company used Arm's Neoverse Compute Subsystems (CSS), it could develop Cobalt faster while remaining confident that the software ecosystem would be supported. CSS is a program through which companies take pre-validated Neoverse N2 silicon and modify it to suit their specific purposes and environments. In the case of a general-purpose CPU like Cobalt, those modifications focus on increasing power efficiency.

What this means for merchant silicon providers

Both Maia and Cobalt will likely have an impact on Microsoft's silicon partners if the current trajectory and speed hold. As expected, and as is the case with Google, AWS, and Oracle, Microsoft partners with CPU and GPU makers such as AMD, Intel, and Nvidia as merchant silicon providers to give its customers choice. On the CPU front, Azure uses AMD, Intel, and Ampere. While AMD and Intel will continue to power HPC services, SAP database deployments, and other specific workloads, I can't see a scenario where Ampere remains an Azure partner over the long term. I suspect Microsoft deployed Ampere knowing that its own part would emerge in the not-too-distant future, so the Ampere footprint is probably not that big.

Like Cobalt, Maia will be quickly deployed at scale across Azure. I expect Microsoft to continue pushing performance improvements. While it is difficult to predict how much customers will demand and adopt Maia for their training purposes, Microsoft’s support for OpenAI is certainly helping with market acceptance.

Concluding thoughts

Microsoft’s entry into the silicon market should be a win for Azure. Designing entire AI and cloud stacks to deliver the best performance at the lowest cost naturally benefits both customers and stakeholders.

However, does this move give Azure a significant competitive advantage over AWS or Google Cloud? Regarding Cobalt, I don't think so. While the customer experience should improve, the real winner is Azure itself, because the chip will deliver measurable cost savings.

Maia is a little more interesting. While Azure in many ways achieves parity with its competitors simply by offering an ASIC for AI training, OpenAI's support also helps this first-generation piece of silicon mature and gain adoption. While I don't think Maia lands any kind of knockout blow on the competition, it certainly benefits Azure customers.

Regardless of the competitive situation, cloud providers that design and deploy custom silicon for optimal performance and reduced costs can deliver a significant benefit to the market, as long as these custom solutions don't create cloud provider lock-in.
