The commoditization of AI
Earlier this week, a Chinese company called DeepSeek released a new AI model that shook the big tech giants trying to own AI. Seeing this brings back some memories from the past.
In the 1990s, there was a tech revolution called “The Internet”. According to Our World in Data, in 1990 there were about 2.6 million internet users worldwide. By the end of the decade, that number had grown by more than 15,000%, to 416.2 million. From that moment on, the growth was almost exponential. One of the companies that rode this trend was Cisco: around 1997 its stock started to climb, and in March 2000 its price reached $82 per share, putting its total market cap at $536B. This was the highest stock price, and market cap, that Cisco has ever reached. A year later, in March 2001, the stock had crashed back to about $15 per share. Today, it trades at around $60, and the company's market cap is about $230B. What caused this fast climb and sharp decline, you might ask?
The turn of the millennium was an interesting time, when the entire tech world lost its mind, which eventually led to the dot-com bubble bursting. Seeing this new shiny thing called “the internet” become accessible to more and more people led investors to seek the next unicorn to put their money in. Cisco was an established networking equipment company, and it continued to expand its offerings by acquiring other companies doing networking, voice over IP (VoIP), digital video, LAN switching, etc. And no matter how hard tech bros and investors sell you on “we want a better world for everyone”, what they truly want, at least as priority number one, is profit for themselves. Hence, they were looking for someone to “own” the internet. What they failed to account for is that the internet would become a commodity, which it did. But what does this have to do with DeepSeek and AI?
You see, the entire AI tech bubble is built on the assumption that doing AI is hard and requires billions of dollars of investment. Just a few days ago, companies like OpenAI, Oracle, and SoftBank announced Project Stargate, which calls for a $500B investment to build power plants and data centers and to acquire hardware for training AI models. AI seemed to be out of reach for common folks, forever in the hands of big corporations with deep pockets, who would profit massively from it. But then came DeepSeek, which released a model that is at least as capable as the top model from OpenAI, if not more so. They not only set a new bar for AI models, but also released the model as open source, and claim to have trained it on older hardware for less than $6M, compared to the billions that US tech leaders are pouring into AI.
This caused Nvidia to lose almost $600B of market value, a 17% drop, and one of the biggest single-day losses of market value in history. There is no simple or logical explanation for why this happened, because Nvidia still seems to be the major provider of GPUs for AI purposes, and it's not like AI is going anywhere. Nvidia's sharp climb began in mid-2023, just six months after ChatGPT was released, so this might be the market correcting itself. Or, as I've always said, nobody knows what the stock market is or why things happen the way they do. And while DeepSeek's claim of using older hardware is plausible, there is no real way to validate it: China is not supposed to have access to modern GPU chips due to export bans, so DeepSeek might have an incentive to hide such information.
Back in the early 1990s, the internet was not accessible to most people. Some businesses invested in equipment to get internet access, and a select few members of the public visited Internet Cafés, public places with computers connected to the internet. Personal computers were often sold without the expensive Ethernet card that connected the computer to a modem. Moreover, adjusted for inflation, the cost of a modem ranged anywhere from $600 to $1,000, and that's on top of having an internet-capable computer. In short, this technology was out of reach for most people. Fast-forward to today: Ethernet is built into every single computer, and you can get a router for as little as $50.
Just like the internet, AI is going to transform: from being tightly controlled by a small group of tech companies and run on expensive hardware, to being freely available and running on consumer-grade hardware. There are already people who can run the most capable DeepSeek model at home. Of course, it is not affordable for most yet: some use a cluster of six to eight Mac Minis, which can easily cost $10K, while those who run it on PCs need server-grade motherboards that support hundreds of GB of RAM, as well as Nvidia GPUs, whose prices are hiked at the moment.
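You don't need that much hardware to get a taste of local AI, though. As a minimal sketch, assuming you have installed the llama-cpp-python package and downloaded a GGUF quantization of one of the smaller distilled DeepSeek-R1 models (the file name below is hypothetical), something like this already runs on an ordinary laptop:

```python
# A rough sketch of running a small quantized model locally with
# llama-cpp-python (pip install llama-cpp-python). The model file
# name is hypothetical; substitute whatever GGUF file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-r1-distill-qwen-7b-q4.gguf",  # hypothetical path
    n_ctx=4096,  # context window size
)

result = llm(
    "Explain in one sentence what it means for a technology to become a commodity.",
    max_tokens=64,
)
print(result["choices"][0]["text"])
```

The smaller distilled models are nowhere near the full-size one, but the point stands: the barrier to entry is already a download, not a data center.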
But make no mistake, we will eventually arrive at a future where AI runs locally on your devices. From Apple investing in the Neural Engine in its M-series chips to dedicated AI boxes from Nvidia and AMD, we will get there. And just like with the internet, the AI bubble will burst, and AI will become a commodity. You probably won't be able to train an AI model at home, as that will still require money and hardware, but just like with the internet, you are not required to lay your own undersea cables to talk to your friends on the other side of the world.
But what about AGI?
AGI, or Artificial General Intelligence, is something many of these AI companies claim to be moving towards. The first one to reach AGI will be the first one to set the rules. It's like the race for the atomic bomb: either you get it first and set the rules, or someone else does. AGI, if achieved, would be a big change in the way the world works. And so many of these tech companies seem to bake the promise of reaching AGI into their valuations. A promise that, we keep hearing, will come true next year… every year. And yet there are no signs of AGI, and I don't think we will see it, at least not as a byproduct of LLMs.
When ChatGPT came out, I, like many others, saw it as this magical thing. I didn't really know how to treat it or how it works. My view of LLMs swung all the way from “they are bad and stupid” to “they are good and useful”, and with every new model and every new use case, my opinion is still shifting. But it really clicked for me when I decided to learn, at least at a high level, what GPTs are and how they work. I highly recommend this video by 3Blue1Brown on what LLMs are and how they work.
And after viewing this video, I now see LLMs for what they are: a mathematical model that predicts the next most probable word/token. This not only broadened my knowledge, but also allowed me to use and prompt LLMs more effectively, as I now understand their strengths and limitations. But what does this have to do with AGI?
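To make that concrete, here is a toy sketch of the core loop. None of it comes from a real model; the scoring function below is a stand-in for the billions of learned weights in an actual transformer, but the shape of the computation is the same: context in, probability distribution over the vocabulary out, most probable token appended.

```python
import numpy as np

VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def toy_logits(context):
    # Stand-in for a real transformer: deterministically score every
    # vocabulary token given the context. A real model computes these
    # scores with billions of learned weights.
    rng = np.random.default_rng(sum(len(word) for word in context))
    return rng.normal(size=len(VOCAB))

def next_token(context):
    logits = toy_logits(context)
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax -> probabilities
    return VOCAB[int(np.argmax(probs))]            # greedy: pick most probable

context = ["the", "cat"]
for _ in range(4):
    context.append(next_token(context))
print(" ".join(context))
```

However fancy the architecture, nothing in this loop is more than arithmetic on the input.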
For as long as I can remember, I have been bothered by one of humanity's hardest questions: what is free choice, and is it truly free, or is it preselected/preprogrammed by the environment? In short, can we predict certain human actions and interactions by observing and learning about people's environment? I don't have the answer, but it is an interesting thought experiment nevertheless, and it aligns perfectly with LLMs and AGI. You see, LLMs are predictable. I tried both ChatGPT and DeepSeek over the past couple of days, and both output similar answers to some of the questions. Sure, some models perform better at specific tasks like coding, but the general gist is that they can't think in the sense a human can, nor can they step outside their mathematical model.
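A tiny sketch of why that is, with made-up numbers rather than anything from a real model: under greedy decoding the output is a pure function of the input, and even sampling with a temperature only redistributes the same fixed probabilities.

```python
import numpy as np

# Hypothetical next-token probabilities for the context
# "The capital of France is" -- invented for illustration.
tokens = ["Paris", "London", "nice", "a"]
probs = np.array([0.86, 0.02, 0.07, 0.05])

# Greedy decoding is a pure function of the context:
# the same prompt yields the same token, every single time.
print(tokens[int(np.argmax(probs))])  # always "Paris"

# Temperature sampling adds variety, but every draw still comes
# from the same fixed distribution; the model never steps outside it.
temperature = 1.5
scaled = np.exp(np.log(probs) / temperature)
rng = np.random.default_rng()
print(rng.choice(tokens, p=scaled / scaled.sum()))
```

The variety you see between runs of a chatbot comes from that second mode, not from the model deciding to try something new.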
Humans are predictable as well, but the difference between human behavior and a mathematical model is that we can break the rules. The other day I was sparring with my boxing coach. He threw a jab followed by a right hook, which I blocked. Then he repeated the combination, which I blocked. He did it again, and I blocked again. And then he threw a jab, followed by a right hook (which I blocked), followed by a left hook that sent me to the ropes. In a sense, he conditioned me with his behavior (input data) to a certain reaction (output data), which can be seen as training an LLM. For the same input, he got the same output, albeit slightly different, as I was anticipating the combination and thus performing better or different defensive sequences. However, being an experienced fighter, he was able to see and read my reaction and take advantage of my situation. Had I reacted differently, or been a more experienced fighter, he would probably have failed to send me to the ropes.
But boxing is not a mathematical function; it requires coordination, reading the environment, and fast reactions. And there is always an element of surprise, or irrationality. This can look like randomness to an outside observer, but it's not. It's something in the thought process of the brain that lets people break out of the box that LLMs are constrained to. Moreover, it's something more than thinking. I don't like to talk in meta terms (meta the word, not Meta the company), but intelligence is not only the processes that happen inside our brain; it's also the effect of the environment: reading and understanding a situation, gut feeling, and so on. Think about a time when you felt embarrassed to do or say something, or when you reacted instinctively to avoid danger. I've seen it many times throughout history: creative approaches to job searching, advantages gained on the battlefield, the way certain political figures act, etc. And I don't think the current state of LLMs will lead us to AGI.
Don’t get me wrong, GPTs are an amazing piece of technology and research, but once you understand how they work, they are still a mathematical function. There is no magic or thinking behind it. And they are certainly not able to break out of the constraints of that function.