Yes, artificial intelligence appears to be an emergent concept. Before 2017, generative AI models performed poorly. But once the scale of the models (training data and number of parameters) was increased sufficiently, something resembling a phase transition occurred. More complex interactions between artificial neurons then emerged, giving rise to sophisticated cognitive abilities that we now describe as "intelligent," all from very simple mathematical components.
A neural network's parameters are the internal variables of an AI model. These parameters are adjusted automatically during training on input data.
For example, OpenAI's GPT-3 model has 175 billion parameters. OpenAI's original DALL-E model has 12 billion parameters. Google's PaLM model has 540 billion parameters; the same figure is sometimes attributed to Gemini Ultra, but Google has not published a parameter count for that model.
The number of parameters depends on the network's structure, i.e., the number of layers, the number of neurons per layer, and the type of connection between layers.
For a fully connected network with a single hidden layer, the number of parameters is given by the following formula: P = (d+1)h + (h+1)o, where d is the number of input neurons, h is the number of hidden neurons, and o is the number of output neurons. The +1 term corresponds to the bias, an additional trainable constant attached to each neuron that shifts its weighted sum. Despite its name, it has nothing to do with human bias or the tendency to favor certain results.
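To make the formula concrete, here is a minimal sketch that counts the parameters of a fully connected network. The function name `count_parameters` and the layer sizes (784 inputs, 128 hidden neurons, 10 outputs) are illustrative assumptions, not values from any particular model.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons in each layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Each of the n_out neurons has n_in weights plus 1 bias.
        total += (n_in + 1) * n_out
    return total


# Hypothetical network: d = 784 inputs, h = 128 hidden neurons, o = 10 outputs.
# This matches P = (d+1)h + (h+1)o = 785*128 + 129*10.
print(count_parameters([784, 128, 10]))  # prints 101770
```

Passing a longer list, such as `[784, 128, 64, 10]`, applies the same rule to each pair of consecutive layers, which is how the count grows with the number of layers.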
The number of parameters influences a neural network's learning capacity and thus the performance and behavior of the model. In general, the more parameters a model has, the more complex the patterns it can learn, and the more correct and consistent its results can be. However, there is a limit!
A phenomenon called "overfitting" degrades the system when the number of parameters is too large relative to the amount of available data: the model memorizes its training examples instead of learning patterns that generalize to new data.
If you want to increase the number of parameters in a neural network, you must also increase the training data.
This helps explain why AI operators' appetite for our data is insatiable.
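The overfitting effect described above can be illustrated with a toy example rather than a real neural network. The sketch below compares a straight line (2 parameters) with a polynomial that has as many coefficients as there are data points. All data values and function names are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares straight line: 2 parameters (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def interpolate(xs, ys, x):
    """Lagrange polynomial through every training point (one coefficient per point)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_squared_error(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical noisy measurements of the trend y = 2x.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]
# Hypothetical unseen points that follow the same trend.
test_x = [0.5, 2.5, 4.5]
test_y = [1.0, 5.0, 9.0]

slope, intercept = fit_line(train_x, train_y)


def line(x):
    return slope * x + intercept


def poly(x):
    return interpolate(train_x, train_y, x)


for name, model in (("line", line), ("polynomial", poly)):
    print(name,
          "train error:", mean_squared_error(model, train_x, train_y),
          "test error:", mean_squared_error(model, test_x, test_y))
```

The polynomial reproduces every training point exactly, noise included, so its training error is zero by construction. That is memorization, not learning: a perfect training score says nothing about how the model behaves on points it has never seen, and only more data can constrain the extra parameters.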
In artificial neural networks (ANNs), each neuron computes a weighted sum of its inputs, adds its bias, then applies a non-linear activation function (for example a threshold, a sigmoid, or a ReLU) to determine its output, which it passes on to the next layer.
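The calculation performed by a single neuron can be sketched in a few lines. This is a minimal illustration that assumes a sigmoid activation; the function names and the example weights are hypothetical.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)


def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical layer: 2 inputs feeding 3 neurons.
outputs = layer(
    [1.0, 2.0],
    weight_rows=[[0.5, -0.25], [1.0, 1.0], [-1.0, 0.0]],
    biases=[0.0, -3.0, 1.0],
)
print(outputs)  # three values, one per neuron, handed to the next layer
```

Stacking such layers, with the outputs of one serving as the inputs of the next, is all a feed-forward network does at inference time.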
From this simple mathematical process, something remarkable emerges.
The network only predicts the next word, or more precisely the next "token" (a piece of a word), that will follow in the sentence. And yet an ordered, rational, and coherent sentence emerges, even though it comes from a probabilistic process.
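The probabilistic step can be sketched as follows. This is a simplified illustration, not how any specific model is implemented: it assumes the network has already produced a score for each candidate token, converts the scores into probabilities with a softmax, and draws one token at random. The candidate tokens and their scores are invented for the example.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    # Subtracting the maximum keeps exp() from overflowing.
    exps = {tok: math.exp(s - highest) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def next_token(scores, rng):
    """Sample one token in proportion to its probability."""
    probs = softmax(scores)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical scores for continuing "The cat sat on the ...".
scores = {"mat": 3.0, "sofa": 2.0, "moon": -1.0}
rng = random.Random(0)  # fixed seed so the draw is repeatable

print(softmax(scores))
print(next_token(scores, rng))
```

Nothing in this procedure checks whether the chosen word is true, only whether it is likely. Repeating the draw token after token is what produces a full sentence.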
For this language magician who juggles words without caring about their meaning, the notion of truth is not relevant. The system does not seek to provide accurate answers, but rather probable sentences.
In other words, a system that has no connection to our reality, devoid of meaning and knowledge, and which does not distinguish between "true" and "false," can provide an "intelligent" answer.
It is thanks to its immense training corpus that AI gives the impression of understanding the context of the sentence, the author's intention, and the nuances of language.
There is something deeply troubling about this manifestation!
How can a phenomenon as complex and sophisticated as intelligence emerge in a virtual environment?
Examples of Emergent Concepts
- Just after the Big Bang, the universe was extremely hot and dense. In this extreme environment, matter emerged from pure energy, in accordance with Einstein's equation, E=mc². Thus, elementary particles such as quarks, electrons, and neutrinos, which did not exist before, emerged from the primordial universe.
- Life is an emergent phenomenon; it results from the interaction of simpler components, such as the chemical molecules that constitute it. Yet it exhibits new properties that are irreducible to those components. Beyond a certain level of molecular organization, it appears in an environment where it did not exist before.
An emergent concept arises from a more fundamental concept while remaining new and irreducible to it. In other words, new properties appear with the emergent concept from an environment where it was not previously present. These new properties seem to be a natural response to the specific physical conditions of an environment.
AI models before 2017 were trained on much smaller datasets than those used today. They were far from perfect; generative AIs did not work very well.
As the data available for learning increased, data scientists intuitively increased the number of parameters. Beyond a certain, almost miraculous threshold, they observed a significant improvement in results.
The groundwork was laid in 2017 with the introduction of the Transformer architecture. The GPT-2 model (Generative Pre-trained Transformer 2), released in 2019, then marked a turning point in the field of text generation by demonstrating its ability to produce texts that often read as if written by a human.
What happened?
Before 2017, the scale of models (training data and neural architectures) was increasing, but nothing much was happening; performance was poor and stagnant. Then suddenly, when the scale reached a threshold, there was a phase transition. In other words, by analogy with physics, an abrupt change in the state of the system, brought about by the diversity of data and the number of parameters.
Suddenly, richer, deeper, and more complex interactions between neurons appeared.
The remarkable fact in this miraculous evolution is the emergence of more sophisticated cognitive abilities that now seem "intelligent" to us.
Scientists struggle to explain this phase transition. Yet an "intelligence" has indeed emerged mathematically from the interaction of very simple components, such as data, algorithms, models, and parameters!
What does this emergence from a machine tell us about the nature of intelligence itself?
Machine learning is a non-linear process, meaning that small changes can lead to significant changes in the model's behavior. For now, we do not fully understand how models arrive at their decisions, which makes it difficult to predict their future behavior.
The field of AI is evolving very rapidly, with new technologies and architectures constantly appearing. From the increasing complexity of models, other unexpected properties could emerge, such as creativity, art, understanding of reality, and even consciousness.
AI was originally designed to imitate the capabilities of the human brain. To do this, it drew inspiration from models of biological neurons to create artificial neural networks.
It is likely that in the future, AI and brain research will mutually enrich each other.
By using AI and letting it evolve on its own, it is possible that it will provide us with the keys to unravel the mysteries of the human brain.
"Chance is the god of inventors." - Pierre Dac (1893-1975), French humorist.
Parameters are the internal variables of an AI model, automatically adjusted during learning. They include synaptic weights and biases. Their number depends on the network's structure (number of layers and neurons). For example, GPT-3 has 175 billion parameters. The more parameters there are (with enough data), the more the system can produce correct and consistent results.
A language model only predicts the next most probable word (or "token") according to a statistical process. It does not seek to provide "true" answers, but probable sentences, without any connection to our reality or distinction between true and false. It is thanks to its immense training corpus that it gives the impression of understanding the context, intention, and nuances, even though it is devoid of meaning and knowledge.
Before 2017, increasing the scale of models (data and parameters) improved performance only slightly. Then, with Transformer-based models such as GPT-2 (released in 2019), a critical threshold was reached, causing a phase transition (by analogy with physics, an abrupt change in the system's state). Richer and more complex interactions between neurons appeared, giving rise to sophisticated cognitive abilities. Scientists still struggle to fully explain this phenomenon.