Intel® Xeon® 6 Processors Power AI Everywhere

October 2, 2024

Georganne Benesch

Product image of the front and back of Intel Xeon 6 Processor

Organizations worldwide deploy AI to increase operational efficiencies and strengthen their competitive standing in the market. We talk to Intel’s Mike Masci, Vice President of Product Management, Network & Edge Group, and Ryan Tabrah, Vice President & General Manager, Intel Xeon and Compute Products, about the new Intel® Xeon® 6 Processors. Mike and Ryan discuss key advancements that power the seamless and scalable infrastructure required for running AI everywhere, from the data center to the edge, in a more sustainable way.

Why is the launch of the Intel Xeon 6 Processors so important to Intel, your partners, and customers?

Ryan Tabrah: The launch is a culmination of many things, including getting back to our roots of delivering technology starting from the fabrication process to enable the AI data center of the future. I think Intel Xeon 6 hits at a perfect time for our customers to continue to innovate with their solutions and build out their data centers in a way that wasn’t possible before. With Intel Xeon 6 processors, E-cores are optimized for the best performance per watt, while the P-cores bring the best per-core performance for compute-intensive workloads that are pervasive in the data centers of today.

Mike Masci: We see Xeon 6 not just as another upgrade, but as a necessity for the AI-driven compute infrastructure. The existing data center does not have the performance-per-watt characteristics that allow data to scale for the needs of an AI-driven era. So when networks need to process huge amounts of data, from edge AI to cloud AI, these processors do so in a more efficient and performant way. And within a data center, it enables the infrastructure to support the performance needs of AI while being able to scale linearly.

The consistency of the Xeon 6 platform from edge to cloud and the fact that it can really scale from the very high end to more cost- and power-focused, lower-end products is what developers want. They want an extremely seamless experience where there is no need to mix and match different architectures and systems, because anything that slows them down or creates friction effectively means less time spent on developing AI technology.

Intel Xeon 6 is the first Intel Xeon with efficient cores and performance cores. What are some examples of their different workloads and relevant use cases?

Mike Masci: First, efficient cores are designed and built for data center class workloads and are highly performant at optimized density and power levels. This is a huge advantage for our customers in terms of composability and the ability to partition the right product for the right workload in the right location without having to incur the complexity and expense of both managing and deploying.

It’s becoming the norm to deploy the same type of workloads at the network edge that are running deep into the data center. People want the same infrastructure back and forth, so it enables them to deploy faster and easier, and save money in the long run.

The most important workloads are cloud native. And that’s where the Intel Xeon 6 E-cores shine. As we think about use cases that take advantage of that, on the network and edge side, the 5G wireless core is one of our most important segments. Where prior generations ran on fixed-function, proprietary hardware, these companies have adopted the principles behind NFV (Network Functions Virtualization) and SDN (Software Defined Networking) and are now moving toward cloud-native technology. This is where the multithreaded, performance-per-watt-optimized design of Intel Xeon 6 processors is extremely important.

As we look at Intel Xeon 6 with P-cores for other edge applications, customers are very excited about Intel® Advanced Matrix Extensions (Intel® AMX) technology. Specifically, its specialized vector ISA instructions, inherent in the performance cores, allow them to do lightweight inference on the edge where you might not have the power budget for large-scale GPUs that are typical of training clusters. And the beauty of AMX is it’s seamless from a software developer standpoint, and with tools like OpenVINO™ and our AI Suites, they can take advantage of AMX without having to know how to program to a specific ISA.

Ryan Tabrah: The reality is that, especially at the edge, customers can’t put in some of the more power-hungry or space-hungry accelerators, and so you fall back on the more dense solutions that are already integrated into the Intel Xeon 6 performance core family.

Video is another good use case example. You don’t make money until you can effortlessly scale and pull videos out and push them to the end user. That’s one reason why we focused on the rack consolidation ability in taking a video workload. It’s something like three-to-one rack consolidation over previous generations for the same amount of videos that you can stream at the same performance. It’s better performance at a better energy efficiency in your data center, being able to serve more clients with fewer machines and greater density. And that same infrastructure can then be pushed out to your 5G networks, to the edge of your network where you’re caching videos and deploying them to end users.

Can you talk about the Intel Xeon 6 in the context of a specific vertical and use case?

Mike Masci: Take healthcare, where you need a massive amount of data to train medical image models. In order to have actionable data and insights, you need to train the model in the cloud and run it effectively at the edge. You need to run things like RAG (Retrieval Augmented Generation) to make sure the model is doing what it’s supposed to do, especially in the domain of assisting with diagnosis, for example. So what happens when you need to retrain the model? Edge machines will send more data to the cloud, where it gets retrained, and then has to get proliferated back to those edge machines. That whole process for a developer in DevOps and MLOps is an entire discipline, and it’s probably the most important discipline of AI today.

We think that the real value of AI is going to be meaningfully unlocked when you can have trained models, then you can deploy them at the edge, you can then have the edge refeed the models to get trained in the cloud. And having them on a scalable system matters a lot to developers.

Ryan Tabrah: Also, healthcare facilities around the world have a lot of older code, older applications running on kernels that they don’t want to upgrade or do any work on. They want to be able to move those workloads, maybe even containerize them, put them on a system they know will just run and they don’t have to touch a thing. We enable them with open-source tools to update parts of their infrastructure and to build new data centers that bring the future into, and connect with, their older application base.

And that’s where the magic really happens, that someone doesn’t fundamentally have to start from ground zero. Healthcare institutions have all this old data, old applications, and then they’re being pushed to go do all these new things. And that’s back to Mike’s earlier comment: with a consistent platform underneath, from your edge to the actual cloud where you’re doing your development and even to your PC, they just don’t have to worry about it.

What are the sustainability aspects that Xeon 6 can bring to your customers?

Mike Masci: The performance-per-watt improvements across some of our network and edge workloads are clear. It’s 3x performance per watt versus 2nd Gen Intel® Xeon® Scalable Processors. Simply translated, if you get 3x performance per watt, effectively you can reduce the number of servers that you need to one third. That doesn’t just save you CPU power, but it saves you the power of the entire system, whether it be the switches or the power supply of the rack itself or any of the peripherals around it.
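
As a back-of-the-envelope check on that consolidation math, the short sketch below works through the arithmetic. It assumes, purely for illustration, that each server draws about the same power before and after the upgrade and that the total workload stays fixed; the function name and the 300-server fleet are hypothetical and do not come from Intel.

```python
import math


def servers_needed(baseline_servers: int, perf_per_watt_gain: float) -> int:
    """Servers required for the same workload after a perf-per-watt gain.

    Assumption (not from the interview): per-server power is unchanged,
    so each new server does `perf_per_watt_gain` times the work of an
    old one. Real deployments depend on workload mix and utilization.
    """
    if baseline_servers < 0 or perf_per_watt_gain <= 0:
        raise ValueError("need a non-negative fleet size and a positive gain")
    return math.ceil(baseline_servers / perf_per_watt_gain)


old_fleet = 300  # illustrative fleet size
new_fleet = servers_needed(old_fleet, 3.0)
saved_fraction = 1 - new_fleet / old_fleet
print(new_fleet, round(saved_fraction, 2))
```

Under these assumptions, 300 older servers shrink to 100, so roughly two thirds of the machines (and the switches, power supplies, and peripherals attached to them) are no longer needed.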

And it’s our mandate as Intel to drive that type of sustainability mechanism, because in large part the CPU performance per watt dictates the choices that people make in terms of deploying overall hardware.

A great example is the work we’ve done with Ericsson, a leading ISV provider in the 5G core. In their own real-world testing on UPF, which is the user plane function of the 5G Core, they had 2.7x better performance per watt versus the previous generation. Even more, in the control plane with 2 million subscribers, Ericsson supported the same number of subscribers with 60% less power versus prior generation. This comes back to the performance per watt and sustainability. But it is also about significant OpEx saving and doing a lot of good for the world as well. With Ericsson, we are proving it’s not just possible, but it’s happening in reality today.
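
To relate a performance-per-watt multiplier to a percentage power saving, a simple conversion helps. The sketch below assumes the workload is held constant, so power scales inversely with performance per watt; the helper names are illustrative. The 2.7x figure (UPF) and the 60% figure (control plane) are separate measurements on different workloads, so the code converts each one independently rather than comparing them.

```python
def power_fraction(perf_per_watt_gain: float) -> float:
    """Fraction of the original power needed for the same throughput."""
    if perf_per_watt_gain <= 0:
        raise ValueError("gain must be positive")
    return 1.0 / perf_per_watt_gain


def equivalent_gain(power_reduction: float) -> float:
    """Perf-per-watt gain implied by serving the same load with less power."""
    if not 0 <= power_reduction < 1:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction)


# UPF: 2.7x performance per watt, same throughput assumed.
print(f"UPF power needed: {power_fraction(2.7):.0%} of the previous generation")

# Control plane: same subscribers at 60% less power.
print(f"Control-plane equivalent gain: {equivalent_gain(0.60):.1f}x")
```

Read this way, 2.7x performance per watt means about 37% of the earlier power for the same traffic, and a 60% power reduction at a fixed subscriber count corresponds to roughly 2.5x performance per watt.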

In this domain we have our infrastructure power manager, which allows for dynamically programming the CPU power and performance based on actual usage. For example, when the load is low, the CPUs power themselves down. And underlying that, the entire product line has huge improvements in what we would call load line performance. Most servers today are not run at full utilization all the time. Intel CPUs like the Intel Xeon 6 do a great job of powering down to align with lower utilization scenarios, which again lowers overall power need and improves platform sustainability.

This seems fundamental, but it’s harder to do than you would think. You need to optimize at an operating-system level to be able to take advantage of those power states. You need to make sure that you have the right quality of service, SLA, and uptime, which is a huge deal.
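
The idea of matching CPU power to actual load while protecting service levels can be illustrated with a toy model. The sketch below is not the Intel infrastructure power manager or any real API; the power states, capacities, wattages, and the 20% headroom are all hypothetical values chosen to show the trade-off between saving power and keeping enough capacity in reserve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    name: str
    capacity: float  # fraction of peak throughput available in this state
    watts: float     # hypothetical package power in this state


# Hypothetical states, ordered from lowest to highest power.
STATES = [
    PowerState("low", 0.30, 90.0),
    PowerState("medium", 0.60, 160.0),
    PowerState("high", 1.00, 250.0),
]


def pick_state(utilization: float, headroom: float = 0.2) -> PowerState:
    """Pick the lowest-power state that covers the load plus SLA headroom."""
    needed = min(1.0, utilization * (1.0 + headroom))
    for state in STATES:
        if state.capacity >= needed:
            return state
    return STATES[-1]


def average_watts(trace: list[float], headroom: float = 0.2) -> float:
    """Mean power over a utilization trace when states track the load."""
    return sum(pick_state(u, headroom).watts for u in trace) / len(trace)


trace = [0.1, 0.4, 0.9]  # illustrative utilization samples
print([pick_state(u).name for u in trace])
print(round(average_watts(trace), 1), "W vs", STATES[-1].watts, "W at full power")
```

The headroom parameter stands in for the quality-of-service concern raised above: a larger value keeps more spare capacity for bursts but moves the CPU into higher-power states sooner.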

Ryan Tabrah: The efforts we make across the board, in our fabrication, our validation labs, and our supply chain that feeds all our manufacturing, demonstrate our leadership in sustainability. When a customer knows they’re using Intel silicon, they know that when it was born or tested or validated or created, it was done in the most sustainable way. We’re also continuing to drive leadership in different parts of the world around reuse of water and other things that give back to the environment as we build products.

Intel Xeon 6 offers our customers the opportunity to meet their sustainability goals as well. With the high core counts and efficiency that Intel Xeon 6 brings, our customers can look to replace aging servers in their data center and consolidate to fewer servers that require less energy and floor space.

Let’s touch on data security and Intel Xeon 6 enhancements that make it easier for developers to build more secure solutions.

Mike Masci: As we look at security enhancements, which are paramount, especially on the network and edge, bringing our SGX and TDX technologies was a big addition. But the maturity of the technology and its security ecosystem is extremely important for customers, especially in an AI-driven world. You need to have model security. You need to be able to have secure enclaves if you’re going to run multi-tenancy, for example, which is becoming extremely important in a cloud-native-driven world. And overall, we really see that maturity of security technologies on Intel Xeon 6 being a differentiator.

Ryan Tabrah: We built Intel Xeon 6 and the platform with security as the foundational building block from the ground up. It’s what we’ve been doing for several generations of Xeon, and we’re making confidential computing as easy and fundamental as possible in the partner ecosystem. With Intel Xeon 6 we are introducing new advances in quantum resistance and platform hardening to enable customers to meet their business goals with security, privacy, and compliance.

Is there anything that you’d like to add in closing?

Mike Masci: Intel Xeon 6 is in a position that’s necessary for AI at the edge and in the network. And we think the idea of making an easy, frictionless platform that also serves multiple workloads easily with composability is a home run. To me that is the key message of Intel Xeon 6. It’s seamless and scalable, so you can have the same application running on the edge that you have in the data center without worrying about what hardware it’s running on.

Ryan Tabrah: I agree. Especially in different environments and areas where people are just fundamentally running out of power in their data centers, whether it’s just because they can’t build them fast enough or there are new restrictions and clean energy requirements. We have the solutions in place from their edge to their data centers that just make it super easy for them to see the benefits.

And the best validation, I think, is the feedback from the customers. They want more of it. They want to do more with us. They want to help us not only ramp up the processors as quickly as possible, but then build the next generation as quickly as possible, too. Because they’re excited that Intel is taking a leadership position in key critical parts of telco, edge buildout, infrastructure buildout, and data center, and we are excited to be leading with them.

Edited by Christina Cardoza, Editorial Director for insight.tech.