Sunday, June 19, 2011

A Short Introduction to the Philosophy of Artificial Intelligence

This is a rough draft: I participated in an interdisciplinary class, and I'm thinking of submitting it, perhaps to "Teaching Philosophy."

I. The historical background

AI is not only a rich source of new technology produced by interdisciplinary syntheses. It is also, in its theoretical component, an extension and elaboration of some of the central, canonical debates about “intelligence,” “mind” and “rationality” that have defined philosophy and psychology for hundreds of years. Specifically, we find ourselves participating in the conversation that dates back to the “Early Modern” period of philosophy, roughly the 17th and 18th centuries, between the so-called “Rationalists” (Descartes, Spinoza, Leibniz) and the so-called “Empiricists” (Locke, Berkeley, Hume). The Rationalists, impressed by humans’ apparently unique ability to formalize mathematics and logic, held that the human mind was endowed with innate abilities and knowledge, and that these abilities could not be understood using the methods of natural science (views anticipated by Plato). The Empiricists of the Enlightenment, eager to develop a naturalistic account that integrated humans into nature, proposed a simplified psychology that essentially saw the mind as a learning machine, and concentrated on perceptual psychology and learning theory. (Nowadays historians of philosophy tend to see the Rationalist/Empiricist distinction as somewhat overstated, since in retrospect both camps were discussing the same set of issues from many of the same premises.)

An important product of this Early Modern discussion, introduced by Descartes in the first half of the 1600s (Descartes 1637) but crystallized by Kant at the end of the 1700s (Kant 1781), was the representational theory of mind. According to this view the mind works by constructing a representation of the world; Kant developed the idea of a “conceptual framework,” such that our “picture” of the world was as much a product of our own innate mental structure as it was of our perceptual experiences. Thus the issue of mental representation is an essential issue in the elaboration of the nativist/learning-theory divide as it plays out across the 19th and 20th centuries. For example, the behaviorists of the early 20th century are nothing more nor less than Humean empiricists: they applied “operationalist” ideas from the philosophy of science to try to develop a psychology that was cleansed of any reference to unobservable, “internal” mental “states,” including representations (mental content). On the other side, the phenomenologists of the same period advanced the thoroughly Kantian argument that the study of the structure of experience would always necessarily stand apart from physical science. (Here we can stop and notice an even deeper root: the medieval question of the duality of the body and the soul.) In the middle of the 20th century the “nature/nurture” debate, as this same set of issues was then called, was of central importance to the social sciences in general and became a battleground of the “culture wars” of the 1960s and 1970s. The nativist/learning-theory divide also shaped the 20th century ethological literature about the mental lives of non-human animals.

II. Computation and representation

The issue of representation is central to contemporary debates about models of computation. In fact the theory of computation is yet another version of the same argument that constitutes the theory of the social sciences and the theory of ethology. Alan Turing in 1936 introduced his “Turing machine,” a thought-experiment that showed that a simple machine could instantiate any algorithm of mathematics and logic. This was a seminal moment not only in the development of computers but also in the course of artificial intelligence research. For the next fifty years many in the cognitive science community and the public at large saw “artificial intelligence” as just synonymous with computer science. Two crucial points here: first, to understand what is happening in artificial intelligence research today it is necessary to understand the computationalist era, because what we are currently living through is a departure from that era. Second, computationalism, as conceived by Turing and others, required representation: classical computation is rule-governed symbol-manipulation.
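To make Turing's idea concrete, here is a minimal sketch in Python; the simulator will execute any table of (state, symbol) -> (write, move, next state) rules, and the particular rule table shown is an invented example (it simply flips every bit on the tape), not Turing's own formulation:

    # Minimal Turing machine sketch: run a rule table over a tape until "halt".
    def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
        tape = list(tape)
        head = 0
        for _ in range(max_steps):
            if state == "halt":
                return "".join(tape)
            if head < 0:               # grow the tape to the left if needed
                tape.insert(0, blank)
                head = 0
            if head >= len(tape):      # grow the tape to the right if needed
                tape.append(blank)
            write, move, state = rules[(state, tape[head])]
            tape[head] = write
            head += 1 if move == "R" else -1
        raise RuntimeError("machine did not halt within max_steps")

    # Invented rule table: scan right, flipping 0s and 1s, and halt at the blank.
    flip_bits = {
        ("start", "0"): ("1", "R", "start"),
        ("start", "1"): ("0", "R", "start"),
        ("start", "_"): ("_", "R", "halt"),
    }

    print(run_turing_machine(flip_bits, "10110"))   # prints 01001_

The point of the sketch is the one Turing made: the machine itself is utterly simple, and all of the "intelligence" resides in the rule table it is given.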

At this point we can consider some basic premises of linguistics. The classical computationalist view reached its apotheosis in 1975 with the publication of Jerry Fodor’s The Language of Thought. Noam Chomsky had launched what seemed for a time a devastating attack on behaviorism with his 1959 critique of B. F. Skinner’s 1957 book Verbal Behavior and with his subsequent Aspects of the Theory of Syntax (1965). Chomsky argued that a syntactical structure (a grammar, or set of rules for constructing sentences and statements) was generative (it could generate novel linguistic representations and therefore novel thoughts), and was thus necessary for higher-order thought (this argument led to the sign-language research with chimpanzees of the 1960s-80s). This was, as Chomsky himself stressed, Cartesianism in a new bottle.

Fodor applied these ideas to cognitive science in general. Any representational theory of mind requires a symbolic architecture, that is, a material instantiation of the symbols: the pixels on the computer screen, the ink marks on the page, the sound-compression waves produced by vibrating vocal cords, the chalk marks on the board. If the nervous system is a symbol-manipulating system then there must be a material instantiation of the symbols as part of the physical structure of the system. Fodor proposed that syntactical structure (the program, if you will, of the brain) could account for the causal role of seemingly semantic mental content. This arch-computationalist view took it as axiomatic that the mind/brain necessarily involved representations.

III. Computers and the brain

Computers are our own creations, so their workings are not mysterious to us. The same thing cannot be said of the brain. Each age draws on the current technology as a metaphor/theory about how the brain works: the 17th century physicalist Thomas Hobbes, for example, drew heavily on hydraulics in his discussion of the mind. He speculated that memory might be a kind of vibration, as in a spring, that lost coherence as other vibrations passed through. In our time it is commonplace to speculate that the brain is a kind of computer and that a computer is a kind of a brain. However there are two very different approaches to developing this idea.

Classical computation is based on codes (programming languages) that contain explicit instructions for the transformation of states of the machine. The actual “machine language” is binary code (this is the meaning of “digital”). The symbolic architecture in a traditional computer is located in the “chip”: a series of gates, each of which either allows an electrical impulse to pass through or blocks it. Thus the “1s” and “0s” of digital codes stand for actual physical states of the machine. If the human brain is also a system that functions by instantiating representations, then the goal of cognitive science is to uncover the machine language of the brain: to make the connection between the psychological description of the subject and the actual physical state of the nervous system.
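As a toy illustration of rule-governed symbol manipulation at the level of gates, here is a sketch, assuming nothing about how any particular chip is actually wired, of a one-bit "half adder" built from two textbook logic gates; the 1s and 0s stand for the physical states (current passes / current is blocked) described above:

    # Gate functions as textbook abstractions of "pass" (1) and "block" (0).
    def AND(a, b): return a & b
    def XOR(a, b): return a ^ b

    def half_adder(a, b):
        # Add two one-bit numbers; returns (sum_bit, carry_bit).
        return XOR(a, b), AND(a, b)

    for a in (0, 1):
        for b in (0, 1):
            s, c = half_adder(a, b)
            print(f"{a} + {b} -> sum={s}, carry={c}")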

The brain does, in fact, possess physical features that lend themselves to a theory of symbolic architecture similar to that found in digital computers. The brain is a massive assemblage of individual neurons that interact with each other through the flow of electrical impulses (“cascades”). The impulses do not pass arbitrarily, of course; the brain shows immense organizational complexity. But essentially one neuron or group of neurons will, upon being “lit up” by a cascade of electricity, either send the event onward to the downstream neurons or fail to do so, and this can be seen as the analog of “1/0.” What’s more, between neurons there is a space, the synaptic cleft, which contains a soup of neurotransmitters that buffer the electrical connection (they can be more or less conductive). So instead of an “on/off” potential, like a light switch, there is a gradient potential, like a volume control. This vastly increases the number of physical states of which the brain is capable. All of this constitutes a non-arbitrary reason for thinking that the brain may indeed function like a traditional computer: the synaptic pattern could be the symbolic architecture of the brain just as the disposition of the gates in the chips is the symbolic architecture of the computer.

However a new generation of computer models now challenges classical computation and its axiom that representation is necessary for computation. In this new generation of research, computers are actually modeled on brains while at the same time the new computers are contributing to new insights into how brains themselves work. This movement is sometimes referred to as “parallel distributed processing” and as the “neural net model,” but it has come to be popularly known as “connectionism.”

Classical computation has some limiting and apparently intractable problems. As anyone who has worked with computers knows, they are insufferably single-minded. This is natural, as they can only do what they are told to do by their programmers: “garbage in, garbage out.” One of the central problems for traditional computers is the “frame problem”: the problem of bringing context to bear. Consider any homonym, for example “bank.” An ordinary human has no trouble during conversation distinguishing between the two senses in sentences like “I was lying on the bank of the river” versus “I made a withdrawal from my bank.” Traditional computers are strictly limited in their ability to contextualize. This is because computers don’t actually know anything. They are devices for manipulating symbols and nothing more.
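A deliberately naive sketch makes the point. The little "lexicon" below is invented for illustration: the program can store both senses of "bank" as symbols, but nothing in the program gives it any purchase on which sense the situation calls for.

    # Invented toy lexicon: the word "bank" maps to two unrelated stored senses.
    lexicon = {
        "bank": ["sloping land beside a river", "institution that holds deposits"],
    }

    def look_up(word):
        # The program has the symbols but no knowledge of the conversation,
        # so all it can do is report every stored sense.
        return lexicon.get(word, [])

    print(look_up("bank"))
    # ['sloping land beside a river', 'institution that holds deposits']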

What’s more, traditional computers can’t learn anything new. They know what they are told. Now, remember the Rationalist/Empiricist debate. The Rationalists thought that there was an innate conceptual structure, incarnate in language, of essentially mathematical and logical principles, and this structure (the mind, or soul) was the source and basis of rational behavior. The Empiricists argued that a naturalistic psychology required that there be nothing more than an ability to learn from experience on the basis of trial and error, and were skeptical of non-physical states and entities. Connectionist computer models are empiricist approaches to computing in the same way that behaviorism is an empiricist approach to psychology. Connectionist machines do indeed show some primitive ability to learn on their own; they function (ideally) with no recourse to internal codes or representations; and they are solidly based on basic principles of evolutionary biology.

Connectionist machines function, as brains do, by forming patterns of activation. An input layer of nodes is electrically stimulated; this layer in turn stimulates some number of “hidden,” internal layers, which ultimately stimulate the output layer. Activation can be weighted in various ways, but the basic mechanism is that the summed input a node receives from its connections must cross a threshold to trigger downstream activation:

(Insert figure of simple connectionism: input layer, hidden layer, output layer)
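For readers who want the mechanism spelled out, here is a minimal sketch of the feed-forward pass the figure depicts, written in Python with invented placeholder weights (a real network would arrive at its weights through training, as described below); each node "fires" only when its summed, weighted input crosses a threshold:

    import numpy as np

    rng = np.random.default_rng(0)

    def step(x, threshold=0.0):
        # Threshold activation: a node outputs 1 only if its summed, weighted
        # input exceeds the threshold; otherwise it stays silent (0).
        return (x > threshold).astype(float)

    def forward(inputs, w_hidden, w_output):
        hidden = step(inputs @ w_hidden)    # input layer -> hidden layer
        output = step(hidden @ w_output)    # hidden layer -> output layer
        return output

    w_hidden = rng.normal(size=(3, 4))   # 3 input nodes -> 4 hidden nodes
    w_output = rng.normal(size=(4, 2))   # 4 hidden nodes -> 2 output nodes

    print(forward(np.array([1.0, 0.0, 1.0]), w_hidden, w_output))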


This technology underlies the handwriting-, voice- and facial-recognition functions that are now commonplace (an original application was sonar-based submarine recognition and missile recognition). This is achieved through trial and error: a trainer adjusts the connection weights to reinforce correct outputs and to extinguish incorrect ones. This process does not require any internal symbolic content.
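Here is a correspondingly minimal sketch of that trial-and-error training, using a classic perceptron-style update rule on an invented toy task (learning logical OR). No symbolic rule for OR is stored anywhere; the behavior emerges from repeated correction of the connection weights:

    import numpy as np

    # Invented toy task: learn to output 1 for the logical OR of two inputs.
    inputs  = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    targets = np.array([0, 1, 1, 1], dtype=float)

    weights = np.zeros(2)
    bias = 0.0
    learning_rate = 0.1

    for epoch in range(20):
        for x, t in zip(inputs, targets):
            output = 1.0 if x @ weights + bias > 0 else 0.0
            error = t - output                     # compare output with the target
            weights += learning_rate * error * x   # strengthen or weaken each connection
            bias += learning_rate * error

    for x in inputs:
        print(x, "->", 1.0 if x @ weights + bias > 0 else 0.0)

Nothing in the finished system "represents" OR; there is only a pattern of weights that reliably produces the right outputs.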

Here it is useful to note that Darwin’s model of evolution as outcomes-based selection over random variation is very much a product of empiricism. In fact Darwin was reading the Scottish Enlightenment economist Adam Smith’s 1776 Wealth of Nations, with its account of larger economic structures formed from the bottom up through iterated economic exchanges between autonomous, self-interested individuals, when he was developing his account of natural selection (Darwin 1859). An important distinction between the Rationalist program and the Empiricist one is that Rationalists tend to see complex systems as organized from the top down, whereas Empiricists see complexity as emerging from the bottom up. The distinction between classical computation and connectionist computing mirrors this distinction.

However the field of AI is moving in even more radical directions. Although modern cognitive scientists will obviously disavow Cartesian dualism about the mind and the body, in a sense the Cartesian model has often been simply transposed into a brain-body distinction. On a common view it is the brain that is (now) the “cognitive theater,” the seat of representations, the CPU where thinking takes place: the same role Descartes assigned to the res cogitans (Hacker). This view underlies the assumption that AI research is simply an extension of computer science. That collective assumption is now collapsing.

IV. Robotics

On a representational model, “beliefs” and other mental states are instantiated in the form of mental content: language, images and so forth “in the head.” As I said, this is recognizably a continuation of a kind of Cartesian dualism. Indeed representational models are essentially dualistic if representations are taken to have semantic properties that are not analyzable as physical properties (this is one of a number of philosophical issues that I went into in some depth in the class). An alternative view is that psychological predicates are predicated not of brains but of whole persons.

Stomachs don’t eat lunch. People eat lunch. True enough that one needs a stomach to eat one’s lunch, but it doesn’t explain how a person eats lunch to say, “Their stomach eats lunch for them.” Brains don’t think. They don’t imagine, dream, solve problems or recognize patterns. People do those things, just as people believe, desire, hope, fear, etc. In fact, committing this mereological fallacy – the fallacy of confusing the part with the whole – obstructed our ability to learn what it is that brains actually do. We were sidetracked by the misconception that brains are little people in our heads.

“Embodied cognition” is the name given to a recent movement in cognitive science that rejects representational models of thought. The idea is that “thinking” is an activity that is distributed over the whole body. This movement has been in a particularly fertile dialectical relationship with robotics. (Not surprisingly this community has developed some excellent internet resources where students can see footage of robots in action.) It is clear enough that the future of AI lies as much with the field of robotics as with the field of computer science. What is important in an interdisciplinary context is to see the underlying, and quite old, philosophical considerations that make that clear. This also presents an opportunity to discuss the history and philosophy of science.