Information, Language and Intelligence

December 30, 2015

Oak

Why probability, non-determinism, time and causality may not be universal notions, why it is hard to write in functional programming languages, why more sophisticated and less generally applicable theories are needed to get more insight into a particular problem, why modern computers and “artificial intelligence” are not intelligent, why philosophy rarely uses the language of mathematics, what is common between art, literature, natural languages and computer programs, what intelligence may really mean and the central place of language in understanding it.

Information

We will define information quite informally because, although it is something quite intuitive and commonplace, it is hard to give an exact definition without imposing unnecessarily strict limitations on what the notion of information can encompass. Let’s also note that information is quite often confused with a particular way to represent it: for example, zeros and ones in a computer memory are just a representation of information and are not information themselves. At the same time, one of the most important properties of information is that it can be encoded and represented by different means. Information is a very generic notion, as everything perceivable by a human is just an encoding of some information. It is also a somewhat elusive notion, as we do not perceive information in its original form, if there is any such form, but only through various means in which traces of information can be observed. In short, information is something that can be encoded and interpreted.

Language

By language we understand a particular way to encode information. The notion of language as we define it is much wider and more universal than that of a human language, a programming language, or mathematical notation. For example, music and various art forms can also be viewed as languages for expressing information. In general, everything that is perceivable by an agent is information encoded by means of multiple languages that the agent can understand, and human speech and executable machine instructions are just two particular classes of the languages through which we can perceive information.

Simplicity of a language is determined by the ability required from an agent trying to perceive the information encoded in this language.

For example, zeros and ones stored in the computer memory may not be a simple enough language for humans to understand without some help from a computer. In the case of humans, an example of a simpler language is the language of numbers when it is used to describe the quantity of some type of object.

Simplicity is therefore a relative and subjective notion for a particular agent or group of agents, as what can seem like a simple language to one agent can be prohibitively difficult to another agent.

Formality is the quality of preserving the original information being encoded. For example, it is intuitively clear that the symbolic language of propositional logic is more formal than any spoken human language when describing properties of objects and relationships between them. Incidentally, the language of propositional logic is also simpler for this particular type of information.

But how formal are symbolic languages really? It may at first be a bit surprising that even such a formal way to represent information as mathematical notation may not be “formal enough”, as some information may be lost when we try to encode it as symbols. In fact, this is what one of the most important results in the field of mathematical logic, Gödel’s theorem about the incompleteness of axiomatic arithmetic, shows: the language of general relational logic is not formal enough to axiomatically build arithmetic (see Gödel’s incompleteness theorems).

Applicability of a language can be judged from its ability to encode information in general. For example, a spoken human language can be used to encode poetry or write the present essay, while the language of mathematics much less so; in this sense the spoken language is more generally applicable than the language of mathematics.

Information encoded in one particular language can then be encoded again in another language, and we can then talk about translation from the first language to the second one. Languages can also be used to define other languages and we can reason about relationships between different languages as well. But in order to stay focused we will largely omit this discussion in the present essay.

Formality and applicability as we defined them are intrinsic properties of the language itself and do not change depending on the agent trying to interpret the language, but simplicity is a subjective quality attributed to the language by a particular agent.

Intelligence

Informally, intelligence is the ability of a given agent to interpret information. In order to do that the agent should be able to construct and understand different encodings of information (different languages). These languages should be simple enough for the given agent and at the same time remain formal and applicable enough for the given subset of information. Intelligence can then be defined as the ability to construct and understand languages, and it is fully defined by the set of all such possible languages.

It is also quite interesting to note that the given definition of intelligence does not include any mention of agents, and is done purely in terms of languages.

In the case of human intelligence, having the ability to build new sufficiently simple, formal and applicable languages leads to the ability to categorize things, abstract away non-essential details and reason about the same set of information at different levels. So-called abstract thinking is therefore one of the indications of human intelligence.

In the context of intelligence, language simplicity can also be understood as the amount of memory and computational resources required to retrieve the encoded information. For example, in the case of human intelligence, the language of normal form games in Game Theory may not be simple enough in all cases, as a large amount of storage may be required. Another example is determining the satisfiability of expressions in propositional logic: it can require exponential time, and so propositional logic is also not a simple enough language for humans in some cases.
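To make the satisfiability example concrete, here is a minimal sketch of the most naive way to decide it: try every truth assignment. The function name `brute_force_sat` and the clause encoding (lists of signed integers) are assumptions made for this illustration only. The sketch shows that this particular method visits up to 2**n assignments for n variables; it does not show that every possible method must do so, which remains an open question.

```python
from itertools import product

def brute_force_sat(clauses, num_vars):
    """Try every truth assignment; return (assignment or None, assignments tried).

    Each clause is a list of non-zero integers: k means "variable k is true",
    -k means "variable k is false". A formula is a conjunction of clauses.
    """
    tried = 0
    for bits in product([False, True], repeat=num_vars):
        tried += 1
        # The formula holds if every clause has at least one satisfied literal.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits, tried
    return None, tried

# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
satisfiable = [[1, 2], [-1, 2], [-2, 3]]

# x1 and (not x1): no assignment works, so the whole space is searched.
unsatisfiable = [[1], [-1]]

if __name__ == "__main__":
    print(brute_force_sat(satisfiable, 3))
    for n in (5, 10, 15):
        print(n, brute_force_sat(unsatisfiable, n)[1])
```

For the unsatisfiable formula the number of assignments tried doubles with every added variable, which is the sense in which the language stops being "simple enough" for an agent with limited resources.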

Let’s also remark, as we did when discussing languages in general, that we can further consider groups of agents, agents created by other agents in some language, relationships between different agents, etc., and talk about intelligence in every one of these cases, but this will remain out of the scope of this short essay. We could also discuss whether the intelligence of a population of agents is greater than the intelligence of every individual agent, or whether we can define the average intelligence of a given population. In what follows, however, by human intelligence we simply mean the average intelligence attainable by a human agent, and we do not talk about society or discuss how intelligence is affected by the development of technology and many other things. These are all separate interesting topics on which separate research can be done.

Hypothesis 1. (connection between applicability, simplicity and formality)

For a given intelligence the corresponding set of interpretable languages is such that as the applicability tends to increase, the formality and perceived simplicity tend to decrease and vice versa.

Examples:

In the case of mathematics, propositional logic is less applicable but more formal and simpler in its resolution rules than the general relational logic.
Philosophical texts tend to use spoken language as opposed to strict symbolic notation, as it allows them to be more applicable and less narrow. Unfortunately this also means less formal and less simple. In fact, this was likely well understood by Plato, as he deliberately chose dialogues as the form of his philosophical writing, which makes it even less formal than the average plain philosophical text but also increases applicability and allows one to reason about things in a more general way.
Computing integrals and derivatives requires a simple and formal language of mathematical analysis to be developed. Yet this language is highly specialized and is not generic enough to write general purpose novels in it or to use it for communication between people.
Domain specific languages in software engineering are well suited for solving some particular problem and are built using a general purpose programming language. Being as formal as the general purpose programming language they are simpler and less applicable in general.
Many art forms such as painting or music allow for different interpretations in a human language (i.e. different translations) and present an example of more applicable but less simple and less formal languages.
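The domain specific language example above can be illustrated with a minimal sketch of a tiny filtering language embedded in Python. The names `Field` and `where` are hypothetical and exist only for this illustration. The embedded language is exactly as formal as Python itself, it is simpler to read for its one task, and it is far less applicable, since it can express nothing but record filters.

```python
class Field:
    """A term of a hypothetical mini-language: Field("age") > 30 builds a predicate."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        # Returns a function of a record, not a boolean.
        return lambda record: record[self.name] > value

    def __lt__(self, value):
        return lambda record: record[self.name] < value


def where(records, *predicates):
    """Keep the records for which every predicate holds."""
    return [r for r in records if all(p(r) for p in predicates)]


people = [
    {"name": "Ann", "age": 41},
    {"name": "Bob", "age": 25},
    {"name": "Eve", "age": 67},
]

if __name__ == "__main__":
    # Reads almost like a sentence, but can only say "filter by these fields".
    print(where(people, Field("age") > 30, Field("age") < 60))
```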

From this hypothesis it follows that at a given intelligence level trying to use simpler and more formal languages will lead to these languages becoming narrower. This in fact can be observed in many fields of science, where more sophisticated and formal theories are built to describe the ever growing number of specialized fields. The main reason for this apparent fragmentation and specialization of science is the limitation of human intelligence and its inability to build sufficiently applicable and at the same time formal and simple enough languages.

Human intelligence and its limitations

One particular case of intelligence, probably most familiar to us but still not understood well enough, is human intelligence. As we discussed before like any other intelligence it is also defined by the set of languages that can be constructed and understood, in this case, by an average human.

From observing a number of still not fully solved or well understood problems in the perceivable world in general and mathematics in particular we can guess that the human intelligence is actually quite limited by its nature. For example, this intelligence is apparently not able to build a universally applicable formal and simple language.

Moreover, many of the very core concepts and notions that humans routinely reason with turn out, at a closer look, to be not all that universal and to be specific only to some of the languages that are constructed and understood by humans. These core concepts are seemingly not inherent in the information itself but are just artefacts of the human mind and its perception of the world. Let’s discuss a few examples.

Thinking in terms of objects, actions and changes is a common human way to imagine the world as if it indeed consists of objects that can have some state and can change after some action has been applied to them. Trees, animals, cars, houses and many other things are all perceived as objects: an object can be both a source of action for other objects or can experience an action from other objects itself. The effect of a particular action is then some change of the state of this object.

But do we really always need the concepts of ‘object’, ‘action’ and ‘change’ to encode information? The answer we get from pure mathematics, geometry and functional programming languages is clearly ‘no’. There are multiple examples showing that it is possible to do just fine with an abstract symbolic notation that is not concerned with objects, actions or changes, and at the same time is formal and simple. Unfortunately it turns out that such a symbolic notation will not always be applicable widely enough and might be too specialized for solving some particular problem, but this has nothing to do with the notation itself; rather, it indicates the limitations of human intelligence. Symbolic notation is usually also perceived by humans as being ‘more difficult’. For example, software engineers know well that for many algorithms it is possible to create purely functional stateless versions, although it is much harder to do so. Even mathematics itself is often perceived as being ‘difficult’.
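The contrast between the two styles can be sketched on a deliberately small task. The class `Account` and the function `final_balance` are hypothetical names chosen for this illustration. The first version speaks of an object, actions applied to it and the resulting changes of state; the second version mentions none of these and simply relates inputs to a result.

```python
from functools import reduce


class Account:
    """Object / action / change style: an object whose state is mutated by actions."""

    def __init__(self):
        self.balance = 0

    def apply(self, amount):
        # The "action" changes the state of the object in place.
        self.balance += amount


def final_balance(amounts):
    """Stateless style: the result is a pure function of the inputs."""
    return reduce(lambda total, amount: total + amount, amounts, 0)


if __name__ == "__main__":
    amounts = [5, -2, 10]

    account = Account()
    for amount in amounts:
        account.apply(amount)

    # Both descriptions encode the same information.
    print(account.balance, final_balance(amounts))
```

This example is simple enough that the stateless form is no harder to write; the difficulty mentioned in the text tends to show up with larger algorithms, which are beyond the scope of this sketch.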

Similar reasoning can be applied to time, non-determinism and causality: it is possible to build formal and simple languages that do not involve any of these concepts. An interesting example of how time and non-determinism are artificial and non-essential is provided by Game Theory: every normal form game can be rewritten as an imperfect information extensive form game which, unlike the original game, operates with the concept of time and sequences of actions and may potentially simplify the subsequent analysis. Yet another example can be found in statistical methods and the theory of probability: when it is ‘difficult’ to compute some values for a given model, we can usually make a few assumptions and then deal with a less formal ‘non-deterministic’ model that is now much easier to analyze, although we still try to analyze the resulting simplified model in a relatively formal way.
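The Game Theory example can be sketched as follows. The payoff numbers, the dictionary layout and the function names `to_extensive_form` and `play` are assumptions made for this illustration. The same payoffs are first written as a timeless table and then as a two-step tree in which the second player moves "later" but without observing the first move, which is expressed by placing all of the second player's decision nodes in one information set.

```python
# Normal form: a timeless table from pairs of choices to pairs of payoffs.
# The numbers are illustrative prisoner's-dilemma-style values.
normal_form = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}


def to_extensive_form(game):
    """Rewrite the table as a tree: player 1 moves first, player 2 moves second.

    Every node of player 2 carries the same information set label, meaning
    player 2 cannot tell which move player 1 actually made.
    """
    tree = {}
    for (first, second), payoff in game.items():
        node = tree.setdefault(first, {"info_set": "P2-all", "moves": {}})
        node["moves"][second] = payoff
    return tree


def play(tree, first_move, second_move):
    """Walk the tree as a sequence of two moves and return the payoffs."""
    return tree[first_move]["moves"][second_move]


if __name__ == "__main__":
    tree = to_extensive_form(normal_form)
    # The sequential description yields the same payoffs as the table.
    print(all(play(tree, a, b) == payoff
              for (a, b), payoff in normal_form.items()))
```

The notion of "first" and "second" in the tree is introduced only by the representation; the encoded information is the same as in the table.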

It seems that such concepts as time, non-determinism, causality, object, action and change are merely mental devices that humans tend to incorporate in most of the languages understood by them and are not essential for representing information and perceiving the world.

Having discovered this, it is interesting to recall Plato’s famous dialogue Euthyphro, which discusses causality and shows that often we cannot easily determine which of several concepts is the primary one causing and defining the other concepts (in the dialogue the discussion is about how piety can be defined). Now we understand why we cannot always make such a determination: the concept of causality is deeply flawed, is not universal and is not applicable in this and many other cases. So it is not always correct to ask to define one thing through the others or to ask which concept is the primary one. The same, by the way, can be observed in many axiomatic systems in mathematics: there are always concepts that are not caused by or do not follow from other concepts, which is yet another indication that causality is not universal.

Hypothesis 2. (non-causal determinism)

The concepts of time, non-determinism, causality, object, action and change are specific to a certain class of languages that can be understood by humans and are not essential for encoding information and perceiving the world. The information and the world themselves are non-causal and deterministic.

Interestingly enough, many of the things commonly observable by humans seem to be cyclical in nature: these are either simple cycles or superpositions of many cycles. According to the last hypothesis the cyclical nature of things is just a peculiarity of human perception and an illusion, as in reality things are not cyclical, although they are often perceived as such due to the limitations of human intelligence.

While discussing cycles we can also recall the modern view on the dual nature of particles in physics: according to this view it can be convenient to imagine particles as waves. Based on the hypothesis we should not be surprised if the further progress of particle physics eventually shows that modeling particles as waves or even imagining particles themselves is just an oversimplification and a mere mental device, and that the reality is different, non-changing and deterministic. In this particular case, though, the true understanding of the nature of things might also be beyond any human intelligence.

How can the human intelligence be expanded or improved? In order to be able to construct more languages humans may need to perceive more, have better measurement tools and information gathering devices and this is one way to improve the human intelligence.

Another way that follows directly from the definition of intelligence is trying to make the constructed languages simple enough by using appropriate tools. It can be that some particular language is already formal and applicable in the desired domain but at the same time is not simple enough and requires a lot of computational resources or memory that an average human does not possess. Historically, for this purpose humans have used computers of different kinds, starting from the abacus a few thousand years ago and ending at present with the digital machines built according to the von Neumann architecture.

Beyond human intelligence

The ability to understand languages is not unique to humans as there are other intelligences. There are much simpler ones created by humans themselves such as, for example, an executable computer program that can understand only a very limited set of inputs. Every such program can also be viewed as just a piece of information encoded by humans using some of the languages that humans understand but the resulting computer program obviously does not.

Higher intelligence

We can also notice that there are languages that humans cannot interpret well. This can be an indication of far more advanced intelligences to which humans may be related in much the same way as a computer program is related to the person who wrote it. Due to the already discussed limitations of human intelligence we, unfortunately, may not be able to reason well about the intelligences that are more advanced than our own, as we cannot by definition fully understand all the languages understood by them. However, as Hypothesis 1 (connection between applicability, simplicity and formality) suggests, it is possible to try to get a short glimpse by trying to create more formal and simple languages. Metaphorically it may be a bit similar to observing what is happening outside through a very narrow keyhole, which significantly limits the perspective. We can only guess that these higher languages that we cannot fully understand may be similar in their precision to the formal languages of the many specialized fields of mathematics and physics, with the difference being that they are much more generic and applicable. Still, it may be quite hard to understand higher languages, just as it is impossible for a program written by an engineer to understand that it was written by this engineer.

Hypothesis 3. (language continuum)

For any set of information a formal language can be found that will be simple and formal for some intelligence.

Inability to understand or find such a language may merely indicate the limitations of a certain intelligence, and, as we noted before, perceiving a formal applicable language for a given set of information as being ‘difficult’ is subjective and is not a property of the language itself.

Therefore a simple solution can be found for any problem once an appropriate formal and simple language is found. According to this hypothesis there are no inherently difficult problems, only intelligences that are not capable enough to understand them.

Hypothesis 4. (absolute intelligence)

There exists an absolute intelligence which is capable of encoding any information in a language that is formal and, for that intelligence, simple.

It may be that all the possible information can be encoded using a simple and formal language which in this case we will call universal language. The intelligence able to understand and use this language is absolute.

Other languages for other subsets of information and the corresponding intelligences themselves may then just be represented in the same universal language and fully understood by this absolute intelligence. Between any given intelligence, such as the human one, and the absolute intelligence there can exist many other higher intelligences which will be somewhat similar to the human one in the sense that they are not absolute.

Turing test

The Turing test is one of the suggested criteria for determining whether an intelligence created by humans can itself be judged to have human intelligence. In the test a human communicates with either a created intelligence or another human and tries to judge whether he is communicating with another human or with the created intelligence. If no distinction can be made, then the created intelligence passes the Turing test and is considered ‘intelligent enough’.

If some intelligence passes the test it means that it can construct and understand at least as many sufficiently simple, formal and applicable languages as an average human can. Then according to the definition given in the present essay this constructed intelligence is comparable to the average human intelligence. Therefore, it is easy to see that the linguistic definition of human intelligence given in the present essay is equivalent to the definition of intelligence suggested earlier by Turing. In this sense the present essay expands on the ideas of Alan Turing, as the validity of the Turing test follows directly from our definitions.

Modern “Artificial Intelligence” and “Machine Learning”

Artificial intelligence is usually understood as any intelligence that has been created by human intelligence.

Especially lately there has been a lot of talk of artificial intelligence and machine learning in the software industry. At a closer look, however, given the linguistic definition of intelligence we notice that in this particular case the term ‘intelligence’ may be a bit misplaced.

Modern adaptive and categorization algorithms are not able to construct or understand languages, and are limited to solving very specific problems. Based on our definition of intelligence it may therefore be more appropriate to talk about adaptive algorithms or finding solutions to optimization problems rather than about ‘intelligence’. Unfortunately the linguistic nature of intelligence remains little understood, and Turing’s research is often misinterpreted as well, so for such algorithms and techniques we continue to use the terms ‘intelligence’ and ‘machine learning’, which only causes further confusion and misunderstanding.

Another interesting observation is that the modern computer architecture itself may be limiting the applicability of the languages that can potentially be constructed by programs running on such computers. One such indication seems to be the observable ‘difficulty’ of NP-complete problems, for which no method is known that solves them in a reasonable amount of time on such architectures. And yet these are precisely the problems that are often encountered in modern artificial intelligence research, which in itself does not look like a mere ‘coincidence’.

Singularity

In general, a singularity is an abrupt qualitative change after which the usual laws and assumptions stop working and new ones are needed.

In the context of artificial intelligence, singularity means a potential breakthrough that would allow the creation of an intelligence that will equal or exceed that of the average human and will potentially be able to create an even more advanced intelligence or replicate itself. It is also implied that the process of creating such a breakthrough intelligence should be fully understood and controlled by human intelligence.

Is singularity possible or even inevitable? It is hard to find a definite answer. Let’s nonetheless try to define a few necessary conditions for reaching the singularity. Note, however, that, like the Turing test itself, these conditions are not sufficient to judge whether the created intelligence is already advanced to such a degree that the singularity has been reached. The very fact that the singularity has been reached can remain largely unnoticed.

At the point of singularity the created intelligence will have to be able to:

pass the Turing test
create new languages and categories for encoding information that it had not yet encountered when it was created
produce artefacts in visual art, literature, and music comparable to those made by humans
learn and proficiently use any human language
attain perfect knowledge of self so that it can create a new intelligence comparable to that of a human or replicate itself

An intelligence is considered to be ‘greater’ than some other intelligence if all the languages understood by the other intelligence are also understood by it. Most likely the following hypothesis also holds true:

Hypothesis 5. (impossibility to construct greater intelligence)

Intelligence cannot construct a greater intelligence that will be able to construct a richer set of languages.

Given our definition of intelligence intuitively it is quite clear why this hypothesis may be true. Indeed, if some intelligence is able to construct another intelligence, then indirectly the original intelligence is also able to construct all the languages that the created intelligence is able to construct and the constructed intelligence is not greater than the original one.

Note that at the same time this hypothesis allows for constructing an intelligence that will be exactly the same as the intelligence that constructed it.

Another aspect that is often omitted or misunderstood when talking about singularity is that human intelligence as we perceive it ‘at present’ may be underestimated, and ‘in the future’ humans may be able to construct much more advanced languages with the help of computing devices or other not yet discovered technologies. Even if it is not possible for humans to create a greater or equal intelligence, it is still possible to try to develop human intelligence itself further. In fact the boundaries of human intelligence and the types of languages that can be constructed by it are still not understood well enough by modern science.

Directions for further research

The present essay tries to give a linguistic definition of intelligence and expand on the ideas first suggested by Alan Turing.

However, we have barely touched the surface on many of the topics and the discussion itself was quite informal and far from rigorous. As we noted in the essay, some of the concepts we constantly operate with in human languages such as objects, actions and causality may not be universal and yet, paradoxically enough, these very concepts are heavily used by the essay itself to reason about more abstract and universal concepts such as information, language and intelligence. It might be that the language of the essay and any further following research can be revised and refined in this regard.

Can we use a formal symbolic notation to try to define the concepts discussed and to build a formal mathematical theory that will model them? This can also be done, but we should be mindful of Hypothesis 1 (connection between applicability, simplicity and formality) and understand that our discussion may be unnecessarily narrowed down because of this. Also, when striving for ultimate formality and strictness we should remember Gödel’s theorem, which shows that even the language of general relational logic is not formal enough, so ultimate formality likely cannot be attained anyway. These are some of the reasons why we chose not to use any formal symbolic notation in the present essay, although this direction can certainly be researched further and we may be able to get more insight for some particular cases, as has been done in many fields of mathematics and physics.

Historically, and especially in the last few centuries, the research in many fields has been too centered on humans, which is probably no surprise given that all such research was performed by humans. As we saw in the essay, human intelligence is just one particular case of intelligence in general, and it may be wrong to focus exclusively on human intelligence when trying to understand intelligence in general. Even the question of whether ‘the present’ human level of intelligence can be exceeded by some machine may not be as important as it seems at first, and further research can instead focus on other interesting related questions in the field of intelligence. Still, it is possible to do further research on the intelligence of groups of people, how intelligence seems to evolve and what role the surrounding conditions can play.

Equally, it seems to be wrong to consider machine intelligence separately from human intelligence or other kinds of possible intelligence. In particular, we discussed that intelligence is defined by the set of languages that are understood by it and not by an individual agent or by whether this agent has a silicon, biological or some other nature.

Yet another possible research direction involves studying the relationship between different languages, what translation between languages can really mean, what metrics we can introduce for the way information is encoded by a particular language, and what classes of languages there are. An important thing is that in any such research languages should be understood in the broader sense that we used here.