The biological roots of AI research

2021.04.07

As a biologist who has switched careers, I have always been interested in the biological background of artificial intelligence research. A defining point in this story was an article published in 1943 by Walter Pitts, a 20-year-old university student brought up in hardship, and the neurophysiologist Warren Sturgis McCulloch, who was in his forties: ‘A logical calculus of the ideas immanent in nervous activity’. Pitts and McCulloch summarized what was known about neurons at the time, and went on to argue that the functioning of the brain closely resembles what logic, and the then embryonic field of computing, call digital logic gates (AND, OR, XOR, NOT, NAND, etc.).

The ‘output’ of a neuron, the so-called axon, unlike the vast majority of biological systems, operates not in analogue mode but digitally: it either emits a signal or it does not (compare FALSE and TRUE in logic, or 0 and 1 in computing). The signal emitted by the cell is triggered by the set of signals arriving as ‘inputs’ from the axons of the other neurons connected to it. If their sum reaches a threshold value, the cell emits a signal on its output; if it stays below the threshold, it does not. The branched axon of a neuron can connect to several other neurons, or connect to the same neuron with multiple nerve endings. In the latter case, the ‘weight’ of the connection is greater. In the diagrams of the original paper, at least two input signals are needed to trigger an output signal, but a negative (inhibitory) contact, marked with an open circle, is able to block it.
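The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration, not code from the 1943 paper: the function name `mp_neuron` and its parameters are made up here, and the rule that a single active inhibitory contact blocks the output is taken from the description of the diagrams.

```python
def mp_neuron(excitatory, inhibitory=(), threshold=2):
    """Return 1 if the model neuron fires, otherwise 0.

    excitatory: 0/1 signals arriving on ordinary (positive) contacts
    inhibitory: 0/1 signals arriving on negative contacts
    threshold:  number of active excitatory signals needed to fire
    (All names are hypothetical, chosen for this sketch.)
    """
    if any(inhibitory):
        return 0  # one active negative contact blocks the output
    return 1 if sum(excitatory) >= threshold else 0


print(mp_neuron([1, 1]))       # prints 1: two signals reach the threshold
print(mp_neuron([1, 0]))       # prints 0: one signal is not enough
print(mp_neuron([1, 1], [1]))  # prints 0: the negative contact blocks it
```

A connection made with two nerve endings can be modelled by listing the same signal twice, which is one way to read the ‘weight’ of a connection.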

According to the two authors, such model neurons can be assembled into neural networks which, when correctly configured, can perform many types of logical operations, similarly to the way our brain makes decisions.
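As a rough illustration of how logical operations can be built from such units, the sketch below wires threshold neurons into AND, OR, NOT and XOR. The helper `fires` and the gate functions are hypothetical names, and the constant ‘always on’ input used for NOT is an assumption of this sketch rather than something stated in the article.

```python
def fires(excitatory, inhibitory=(), threshold=1):
    """Threshold unit: 1 if enough excitatory inputs are active and none inhibit."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def AND(a, b):
    return fires([a, b], threshold=2)  # both inputs are needed


def OR(a, b):
    return fires([a, b], threshold=1)  # one input is enough


def NOT(a):
    # Assumption: a constant "always on" input that the signal a can block.
    return fires([1], inhibitory=[a], threshold=1)


def XOR(a, b):
    # Two layers: the output unit is driven by OR and blocked by AND.
    return fires([OR(a, b)], inhibitory=[AND(a, b)], threshold=1)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

XOR is the interesting case: it cannot be produced by a single unit of this kind, so the sketch feeds the outputs of two units into a third.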

In the years and decades after this paper was published, researchers began to study, on this principle, the behaviour of neural networks made up of layers of simulated neurons with huge numbers of connections between the layers. Initially they worked with just one or two layers, but it turned out that more abstract problems could only be solved using intermediate, so-called hidden, layers. These layers are not connected directly to the inputs and outputs of the network as a whole, but only to the cells of the preceding and following layers.

In the case of extremely simple neural networks, a specialist could even configure the connections by hand, at that time using all kinds of switches and resistors. This is how Frank Rosenblatt’s Perceptron, demonstrated in 1958, was ‘teachable’; it was capable of categorizing simple figures seen by its 20 x 20-pixel camera. However, configuring, that is, teaching, bigger and more complex multi-layer neural networks could not realistically be done in this manner. Today this work is carried out with the backpropagation algorithm. While the network is being taught, the algorithm works backwards from the magnitude of the error and improves the parameters of the network in small steps, until the network is sufficiently accurate or further modifications no longer improve its performance.
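To make the idea of working backwards from the error more concrete, here is a minimal sketch of backpropagation with gradient descent on a tiny network (one input, a few hidden tanh units, one linear output). Everything in it is an assumption made for illustration: the network shape, the toy task of fitting y = x^2 at nine points, the learning rate and the stopping thresholds. It is not meant as a reference implementation.

```python
import math
import random


def init_params(hidden_units, rng):
    """Small random starting weights: (w1, b1, w2, b2)."""
    def small():
        return rng.uniform(-0.5, 0.5)
    return ([small() for _ in range(hidden_units)],
            [small() for _ in range(hidden_units)],
            [small() for _ in range(hidden_units)],
            small())


def forward(params, x):
    """Input -> hidden tanh layer -> linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return hidden, sum(v * h for v, h in zip(w2, hidden)) + b2


def loss(params, data):
    """Mean of half squared errors over the training patterns."""
    return sum(0.5 * (forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)


def gradients(params, data):
    """Backpropagation: carry the output error back to every parameter."""
    w1, b1, w2, b2 = params
    n = len(w1)
    g_w1, g_b1, g_w2, g_b2 = [0.0] * n, [0.0] * n, [0.0] * n, 0.0
    for x, t in data:
        hidden, y = forward(params, x)
        err = (y - t) / len(data)  # derivative of the loss w.r.t. the output
        g_b2 += err
        for j in range(n):
            g_w2[j] += err * hidden[j]
            # Error reaching hidden unit j; the derivative of tanh is 1 - tanh^2.
            back = err * w2[j] * (1.0 - hidden[j] ** 2)
            g_w1[j] += back * x
            g_b1[j] += back
    return g_w1, g_b1, g_w2, g_b2


def train(params, data, rate=0.1, max_steps=2000, good_enough=1e-4, min_gain=1e-12):
    """Small steps until the error is low enough or stops improving."""
    current = loss(params, data)
    for _ in range(max_steps):
        g_w1, g_b1, g_w2, g_b2 = gradients(params, data)
        w1, b1, w2, b2 = params
        candidate = ([w - rate * g for w, g in zip(w1, g_w1)],
                     [b - rate * g for b, g in zip(b1, g_b1)],
                     [w - rate * g for w, g in zip(w2, g_w2)],
                     b2 - rate * g_b2)
        new = loss(candidate, data)
        if current - new < min_gain:
            break  # no further improvement: keep the previous parameters
        params, current = candidate, new
        if current < good_enough:
            break
    return params, current


rng = random.Random(0)
data = [(k / 4.0, (k / 4.0) ** 2) for k in range(-4, 5)]  # toy task: y = x^2
start = init_params(3, rng)
trained, final_loss = train(start, data)
print("loss before:", loss(start, data), "after:", final_loss)
```

The stopping rule mirrors the two conditions in the text: training ends when the error is small enough, or when another step no longer reduces it.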

However, this mathematical solution was only fully worked out a good few decades after the discovery of neural networks. Until then another approach was needed, and AI researchers once again turned to biology, in this case to the theory of evolution, or more precisely to the algorithm of evolution.

The natural selection algorithm of evolution can be generalized as follows: if certain objects exist (in the biological case, organisms), which

are capable of reproduction,
descendants inherit their characteristics,
these characteristics can, randomly, be modified to a small extent, and
the characteristics influence their reproductive success,

then these objects change from generation to generation in such a way that they become increasingly well adapted to their environment.
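The four conditions above translate almost directly into a loop. The sketch below is a deliberately simple version in which an ‘object’ is a list of numbers and the ‘environment’ is a target list it should come close to. The function `evolve`, its parameters and the target values are all hypothetical, chosen only to show the mechanism.

```python
import random


def evolve(parent, fitness, generations=200, offspring=20, step=0.1, rng=None):
    """Copy, mutate, select; return the fittest individual found."""
    rng = rng or random.Random()
    best, best_fit = list(parent), fitness(parent)
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            child = list(best)                 # reproduction with inheritance
            i = rng.randrange(len(child))
            child[i] += rng.gauss(0.0, step)   # small random modification
            children.append(child)
        champion = max(children, key=fitness)  # characteristics decide success
        champion_fit = fitness(champion)
        if champion_fit > best_fit:            # selection: keep only improvements
            best, best_fit = champion, champion_fit
    return best


# Hypothetical environment: the closer to these values, the fitter the object.
environment = [1.0, -2.0, 0.5]


def fitness(genes):
    return -sum((g - e) ** 2 for g, e in zip(genes, environment))


adapted = evolve([0.0, 0.0, 0.0], fitness, rng=random.Random(1))
print(adapted, fitness(adapted))
```

Because a child replaces the parent only when it is fitter, the fitness in this sketch can never decrease from one generation to the next.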

In the case of neural networks, the connections, structure and parameters of the network are the characteristics, and adaptation means how well the network performs in recognition and classification. Researchers started from a virtually randomly parameterized neural network (the object) and made a large number of copies of it on the computer (reproduction), changing a few randomly selected parameters of each ‘descendant’ network at random along the way (mutation). These new networks were tested on training patterns, and the poor performers were rejected (selection). Slightly imperfect copies were then made again from a few of the best, and so on. Through these highly automatable processes, one random (mutation) and one decidedly not random (selection), the neural networks changed over a sufficient number of generations until they performed the task to be learned to a high standard. For example, they recognized the licence plates of cars driving into a garage with suitable accuracy even in poor light, in rain, or when the licence plate was dirty. The success of the evolutionary algorithm was an important result in the teaching of neural networks in its own right, but it was also a confirmation for biology: the process of natural selection proposed by Charles Darwin and Alfred Russel Wallace in the middle of the 19th century, and confirmed time and again since, is capable of producing astoundingly complex and well-functioning systems even outside biology.
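The procedure described above can be sketched for a very small network. In this illustration the task is XOR rather than licence plate recognition, the network has two hidden units, and the population size, number of survivors and mutation size are arbitrary assumptions. All function names are hypothetical.

```python
import math
import random

XOR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def squash(z):
    """Logistic function, written with tanh so that it cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def predict(p, a, b):
    """2-2-1 network; p holds 9 parameters (weights and biases)."""
    h1 = squash(p[0] * a + p[1] * b + p[2])
    h2 = squash(p[3] * a + p[4] * b + p[5])
    return squash(p[6] * h1 + p[7] * h2 + p[8])


def error(p):
    """Total squared error on the teaching patterns (lower is better)."""
    return sum((predict(p, a, b) - t) ** 2 for (a, b), t in XOR_CASES)


def mutate(p, rng, count=2, step=0.5):
    """Copy a network and change a few randomly chosen parameters."""
    child = list(p)
    for i in rng.sample(range(len(child)), count):
        child[i] += rng.gauss(0.0, step)
    return child


def neuroevolve(rng, generations=300, population=30, survivors=5):
    """Return (best network found, the random starting network)."""
    start = [rng.uniform(-1.0, 1.0) for _ in range(9)]
    pool = [start] + [mutate(start, rng) for _ in range(population - 1)]
    for _ in range(generations):
        pool.sort(key=error)
        best = pool[:survivors]  # poor performers are rejected
        # The survivors stay in the pool; the rest are imperfect copies of them.
        pool = best + [mutate(rng.choice(best), rng)
                       for _ in range(population - survivors)]
    return min(pool, key=error), start


best, start = neuroevolve(random.Random(42))
print("error before:", error(start), "after:", error(best))
```

Keeping the survivors unchanged in the next generation is a design choice of this sketch; it ensures that the best network found so far is never lost to an unlucky mutation.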

Over the past few decades, evolutionary teaching algorithms have been overshadowed by the success of the backpropagation-based mathematical solutions mentioned above. This is precisely why, when preparing to write this article, I was surprised to discover a few studies predicting a potential renaissance of this old method. Members of the Uber AI R&D laboratory have reported on the results of several research projects in which they were able to teach rather extensive neural networks successfully with evolutionary algorithms, in some cases more efficiently than with backpropagation. They improved the algorithm at several points, and the revival of the old method is helped by the fact that today the ‘reproduction’ of networks and the testing of ‘organisms’ are supported by new parallel programming methods and by special hardware designed for them. It is evident that AI research owes a great deal to the achievements of biology.

In the next part of the series I will cover what AI has to say about the errors and distortions of the human mind, our bad habits and our addictions. It is as if certain AI solutions are starting to display these characteristics themselves, which is perhaps not so surprising given their biological heritage.

Hraskó Gábor

Head of CCS Division, INNObyte