
Artificial intelligence is not a single technology, but a set of distinct processes. Understanding this is key to grasping where the market is headed.

Broadly speaking, AI divides into two phases: training and inference. They are two distinct worlds, and increasingly separated at the technological level as well.

Training AI models is the most demanding part. It requires enormous amounts of data and immense computing power for days or even weeks. High-performance GPUs, led by companies like NVIDIA, clearly dominate here.

But once the model is trained, the phase truly visible to the user begins: inference. That is, when you ask an AI a question and get an answer. And here the rules of the game change.

In inference, what matters most is not raw power so much as response speed (latency) and energy efficiency. You don't need to retrain or update the model; you just need to run it in an optimized way. This has opened the door to a new generation of specialized chips.
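To make the contrast concrete, here is a minimal sketch in PyTorch; the toy model, its sizes, and the optimizer settings are illustrative assumptions, not anything from the article:

```python
import torch
import torch.nn as nn

# Toy model: architecture and sizes are illustrative assumptions.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# --- Training step: forward AND backward pass, weights get updated ---
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()              # gradient computation: the expensive part,
optimizer.step()             # repeated for days or weeks on GPU clusters

# --- Inference: a single forward pass, weights frozen ---
model.eval()
with torch.inference_mode():  # no gradients tracked; faster, less memory
    answer = model(torch.randn(1, 128))
```

Training repeats the forward and backward pass millions of times to adjust the weights; inference is one forward pass over frozen weights, which is why latency and energy per pass dominate its economics.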

Companies like Groq quickly embraced this approach, developing processors designed specifically to run trained models fast and efficiently.

NVIDIA makes its move… again. The most striking thing has been NVIDIA's rapid reaction: spurred by specialists like Groq, it took barely a couple of months to launch new inference-oriented solutions of its own.

These kinds of moves reveal an obvious reality: no one wants to be left out of this new phase of the market. NVIDIA dominates training, but knows that inference is where the real volume of use—and business—will be.

But NVIDIA isn't alone. Giants like Amazon (with Inferentia) and Google (with its TPUs) are also developing their own AI chips, designed specifically for their cloud services.

This is driving a significant shift: more and more providers offer remote access to optimized AI infrastructure, so customers no longer need large hardware investments of their own. Instead of buying hardware, companies rent inference capacity on demand.
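In practice, "renting inference" usually boils down to a single HTTPS call to a provider's endpoint. A minimal sketch follows; the URL, model name, payload shape, and response field are hypothetical placeholders, not any specific provider's API:

```python
import requests

# Hypothetical endpoint and payload: real providers differ in the details,
# but the shape is the same: send a prompt, pay per call or per token.
API_URL = "https://api.example-inference.com/v1/generate"  # placeholder
payload = {"model": "some-hosted-model", "prompt": "Summarize this contract: ..."}
headers = {"Authorization": "Bearer YOUR_API_KEY"}

resp = requests.post(API_URL, json=payload, headers=headers, timeout=30)
resp.raise_for_status()
print(resp.json()["text"])  # the response field name is also an assumption
```

The customer pays per request; the provider worries about which chips sit behind the endpoint.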

A market in full transformation. We are entering a new stage. If training was the first major AI race, inference is the next.

And it is probably the most important. Why? Because it's where real-world use happens: virtual assistants, chatbots, recommendation engines, machine translation… everything relies on inference.

Moreover, growth is exponential. Every interaction with an AI generates an inference operation, and that's millions… or hundreds of millions a day.

New inference chips aim to optimize three key variables: faster responses, lower energy consumption, and lower cost per operation.

This is essential for AI to be viable at scale, because it's not enough for it to work: it also has to be profitable.
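A back-of-envelope illustration of why cost per operation dominates at scale; every figure below is a made-up assumption chosen just to make the arithmetic visible, not a real price:

```python
# Hypothetical numbers, purely illustrative.
queries_per_day = 100_000_000         # e.g., a popular assistant (assumed)
cost_per_query_gpu = 0.002            # USD on general-purpose hardware (assumed)
cost_per_query_asic = 0.0005          # USD on an inference-optimized chip (assumed)

daily_gpu = queries_per_day * cost_per_query_gpu    # $200,000/day
daily_asic = queries_per_day * cost_per_query_asic  # $50,000/day
print(f"GPU:  ${daily_gpu:,.0f}/day")
print(f"ASIC: ${daily_asic:,.0f}/day")
print(f"Yearly savings: ${(daily_gpu - daily_asic) * 365:,.0f}")  # ~$54.8M
```

Even a fraction of a cent per query, multiplied by hundreds of millions of daily queries, swings tens of millions of dollars a year.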

The AI chip market is booming. New players, new architectures, and new strategies are constantly emerging. The feeling is clear: there's no room for stagnation here. As the Spanish saying goes, in this industry, whoever doesn't run, flies.

And in the case of artificial intelligence, it also calculates in milliseconds.

By Amador Palacios

Reflections by Amador Palacios on social and technological news; opinions different from mine are welcome.
