Do NPUs Mean the Death of GPUs for AI?


AMD, Intel and Qualcomm are building NPUs (neural processing units) into their smartphone and PC solutions. Microsoft has already created its own NPU, and OpenAI is talking about building one as well. NPUs are processors designed specifically to handle AI workloads. So, does that mean GPU (graphics processing unit) solutions are, or soon will be, obsolete?

Well, just like integrated graphics didn’t kill off discrete graphics, NPUs (with the possible exception of those created specifically for certain AI engines, like the one OpenAI is thinking about building) are unlikely to displace GPUs as the AI engine of choice.

Let me explain.

NPU Advantage

The advantage an NPU has over a GPU is one of focus and efficiency. The NPU is designed to provide a base level of AI support extremely efficiently but is generally limited to small, persistent workloads like AI assistants. This makes NPUs very useful in smartphones and laptops, which will increasingly rely on AI interfaces but still need long battery life.

Much like integrated graphics, they may be more than adequate for many uses, and they will always have an advantage in energy efficiency, but they aren’t currently high-performing. This means larger models or more advanced AIs will probably run poorly, if at all, on them, much as CAD applications and other graphics-intensive programs require GPUs to operate.

At least for the near future, expect them to be focused mostly on inference applications using relatively small models.

GPU Advantage

GPUs have far more headroom, but they come with energy penalties. Far better suited to training and large language models, GPUs aren’t going anywhere and will continue to be favored where performance matters more than energy use, particularly when you need to train a model in a reasonable amount of time.

You can even imagine both technologies in the same hardware, with the NPU handling light AI loads, particularly those that need to be persistent and always available. The GPU would be called up, when available, for projects that use larger, more complex models, particularly when moving from inference to training.

In effect, GPUs and NPUs can be used together to create a more scalable solution with a far greater range of capabilities. If that solution is well designed and uses the NPU most of the time and the GPU only when needed, the battery life or energy use of the solution can be minimized, making for a far more sustainable result.
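The NPU-first, GPU-fallback arrangement described above amounts to a simple dispatch policy. A minimal sketch of how such routing might look (a hypothetical illustration; the device names, the parameter threshold, and the `Workload` fields are all assumptions, not a real API):

```python
# Hypothetical sketch of an NPU-first, GPU-fallback dispatch policy.
# Device names, thresholds, and workload fields are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    model_params: int   # model size in parameters
    is_training: bool   # training jobs need the GPU's headroom
    persistent: bool    # always-on tasks (e.g., an assistant) favor the NPU

# Assume, for illustration, the NPU comfortably handles models up to ~3B params.
NPU_PARAM_LIMIT = 3_000_000_000

def pick_device(w: Workload, gpu_available: bool) -> str:
    """Route light, persistent inference to the NPU; fall back to the GPU
    for training or large models when one is available."""
    if not w.is_training and w.model_params <= NPU_PARAM_LIMIT:
        return "npu"    # efficient path: small, always-on inference
    if gpu_available:
        return "gpu"    # heavy path: training or large models
    return "npu"        # degrade gracefully if no GPU is present

assistant = Workload("assistant", 1_000_000_000, is_training=False, persistent=True)
finetune = Workload("fine-tune", 70_000_000_000, is_training=True, persistent=False)

print(pick_device(assistant, gpu_available=True))   # npu
print(pick_device(finetune, gpu_available=True))    # gpu
```

The design point is simply that the default path is the efficient one, and the power-hungry device is engaged only when the workload demands it, which is what would minimize battery drain in a combined solution.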

OpenAI’s NPU Might Be an Exception

However, NPUs developed by AI vendors may march to a different drummer. While the processor vendors are focused on creating highly efficient, low-powered NPUs, AI companies may have different goals, more focused on ensuring the adequate performance of their AI models. OpenAI’s ChatGPT is a massive model that would typically require a GPU for training and inferencing. For OpenAI, the goal of an NPU may be more holistic. One of its stated goals is to combat the GPU shortage, which suggests a higher level of performance: OpenAI may want a better-scaling alternative to the NPU/GPU pairing the chip makers will build around their processors.

From companies like Apple and IBM (mainframe), we’ve seen that when hardware is created in parallel with software in the same company, unique advantages may result, and those advantages could break the existing NPU/GPU partnership and create a very different dynamic.

Wrapping up

Near-term, GPUs are safe. Their far higher levels of performance suggest they’ll initially be as safe as discrete graphics cards, because NPUs will be focused more on efficiency than on performance. However, as companies like OpenAI create their own NPUs with the stated goal of displacing GPUs, that dynamic could change. Even so, a third-party NPU tied to a specific AI model could be a non-starter: as large models from other vendors emerge, those vendors are unlikely to support OpenAI’s chip for competitive reasons.

The AI market is fast-moving and far from stable, making any prediction about the future of related hardware and software risky. However, right now, it appears that GPUs are safe. Whether they remain that way will likely depend on whether an AI vendor like OpenAI can create a part, or better yet, work with an independent party to create a part, that other AI vendors will choose to use. This seems unlikely now, but recalling that a working generative AI solution was once thought to be years away, unlikely things happen far too often in this market to treat that position as absolute.

I expect it is only a matter of time before someone will create a high-performance NPU that is software-independent, maybe even NVIDIA.

Rob Enderle: As President and Principal Analyst of the Enderle Group, Rob provides regional and global companies with guidance in how to create credible dialogue with the market, target customer needs, create new business opportunities, anticipate technology changes, select vendors and products, and practice zero dollar marketing. For over 20 years Rob has worked for and with companies like Microsoft, HP, IBM, Dell, Toshiba, Gateway, Sony, USAA, Texas Instruments, AMD, Intel, Credit Suisse First Boston, ROLM, and Siemens.
