Artificial Intelligence and the Preservation of Mind

by Ben Best

CONTENTS: LINKS TO SECTIONS

  1. INTRODUCTORY REMARKS
  2. ARTIFICIAL INTELLIGENCE -- OUTLINE & HISTORY
  3. NEURAL NETWORKS IN ARTIFICIAL INTELLIGENCE AND IN BRAINS
  4. ARTIFICIAL INTELLIGENCE AND HUMAN MINDS
  5. FURTHER INFORMATION FROM THE WEB

(return to contents)

I. INTRODUCTORY REMARKS

This essay was originally part of the series "The Anatomical Basis of Mind", and is now available on this website as The Anatomical Basis of Mind (Neurophysiology & Consciousness). The purpose of that series was to learn & explain what is known about how neurophysiology results in the phenomena of mind&self. Further, to understand which brain/neuronal structures must be preserved to preserve mind&self. And still further, to be able to use that knowledge to suggest & evaluate specific preservation methods.

With such a highly technical objective, readers can understand my desire to avoid the kind of fuzzy, open-ended speculation about consciousness that is unlikely to yield any tangible results. For this reason, most previous installments have dealt with neurological facts. Yet I am confronted with difficulties that undermine my resolve. Foremost is the fact that "mind&self" do not have simple technical definitions. As a cryonicist/preservationist I am forced to acknowledge that if I am to understand how neurological structures produce "mind&self", I must understand these terms in a way that I can map to the phenomena that result from neurophysiological processes. Moreover, I acknowledge that speculation is an essential ingredient in scientific research & evaluation.

Therefore, in the interest of understanding how mind&self might result from neurophysiology, this installment is devoted to investigations about how mind&self might result from computer processing. An earlier part of this series dealt with Neural Networks. I could justify this enquiry on the basis of the similarity of Neural Networks to brain operation. But the thrust of this installment will be a review of Symbolic Artificial Intelligence (AI) and the relevance of computer processing to human thought & identity.

Cryonicists are often confronted with people who claim that they have no need of cryonics because they intend to immortalize themselves by transferring their minds to computers. Such people are called either "Downloaders" or (if they believe their minds can be transferred to a platform superior to the human brain) "Uploaders". Aside from the question of whether Uploading is possible, there is the question of how soon it could be done. The foremost proponent of Uploading is Hans Moravec. Ten years ago Hans was 36 years old, and even at that time Marvin Minsky thought that Hans had no chance of living to the time when Uploading is possible. Just as many arrange for cryopreservation in anticipation of the time when biological aging has been eliminated, others arrange for cryopreservation in anticipation of the time when Uploading is possible.

(return to contents)

II. ARTIFICIAL INTELLIGENCE -- OUTLINE & HISTORY

Artificial Intelligence (AI) is the name that has been given to the field of computer science devoted to making machines do things that would require intelligence if done by humans (to paraphrase Minsky). Some are even ambitious enough to say that the goal of AI is to create minds in machines. But it has proven exceedingly difficult to even define words like "intelligence" and "mind", much less achieve computer implementation. The results of the 40-year history of AI have dampened the spirits of legions of would-be enthusiasts. The most "glamorous" contemporary computer applications are GUI or Java programs, not AI.

Artificial Intelligence has been divided into two camps: one concerned with the processing of symbolic information and the other concerned with Neural Networks. But the Symbolic AI camp has made every effort to exclude Neural Networks from the definition of AI, and the field of "Connectionism" is increasingly regarded as a somewhat independent discipline. Marvin Minsky has said that Neural Networks are "too stupid" to be considered Artificial Intelligence. Symbolic AI itself is divided into two subgroups, one primarily concerned with logic, and the other concerned with heuristics ("rule-of-thumb" methods).

Artificial Intelligence as a field dates from the summer of 1956 when a conference on the subject was organized by John McCarthy at Dartmouth College in New Hampshire. In attendance were the four "founding fathers" of AI: John McCarthy, Marvin Minsky, Herbert Simon and Allen Newell.

Herbert Simon was a political scientist who was interested in the way decisions are made in bureaucracies. He found that "management manuals" are often used by organizations to provide solutions to problems -- and he became interested in the compilation & application of the problem-solving procedures of these manuals -- a "heuristic" approach. Simon teamed-up with mathematician Allen Newell to create the world's first AI program: "Logic Theorist". Combining logic with "search tree" heuristics, Logic Theorist proved 38 of the first 52 theorems of Chapter 2 of Russell & Whitehead's PRINCIPIA MATHEMATICA. The proof for one of these theorems was even more elegant than the one given in PRINCIPIA. At the Dartmouth conference, Simon & Newell distinguished themselves by being the only participants who already had a working AI program.

John McCarthy induced the conference participants to support the term "Artificial Intelligence" to describe their discipline. Discussion centered on the idea that intelligence is based on internal representations of information, corresponding symbols, and the processing of information&symbols. In this view, intelligence transcends any specific hardware (or "wetware", as brains are often called). McCarthy believed that the key to AI was being "able to write out the rules that would let a computer think the way we do". He was one of the foremost proponents of the "logic" school, and he attempted to implement machine intelligence through predicate calculus.

McCarthy is also the creator of the LISP (LISt Processing) language, the most widely-used computer language in AI research in North America. LISP is intended to reproduce what are regarded as the associative features of thought. LISP is based on Alonzo Church's "Lambda Calculus", which can apply functions to functions as readily as it can apply functions to numbers or characters. LISP programming is notoriously recursive (ie, using functions that call themselves).
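
For readers unfamiliar with these ideas, here is a small sketch in Python (standing in for LISP) of a function that takes another function as its argument, and of a recursive function that calls itself. The particular functions are invented purely for illustration, not taken from any LISP text.

    # Functions applied to functions (the lambda-calculus flavor of LISP):
    def twice(f, x):
        """Apply the function f to x, then apply f again to the result."""
        return f(f(x))

    print(twice(lambda n: n + 3, 10))   # -> 16

    # Recursion (a function that calls itself), ubiquitous in LISP programming:
    def length(lst):
        """Length of a list, computed LISP-style: empty list, or head plus the rest."""
        return 0 if not lst else 1 + length(lst[1:])

    print(length(["a", "b", "c"]))      # -> 3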

Marvin Minsky is probably the most famous of the four "founding fathers" of AI. He is particularly well-known among cryonicists because he sponsored Eric Drexler's PhD thesis and wrote the foreword to ENGINES OF CREATION. Minsky cofounded the MIT AI Lab with McCarthy in 1958. McCarthy left in 1962 to start AI research at Stanford, leaving Minsky to "rule the roost" in AI at MIT for many, many years.

In the early days of computing science there was, on one side, a small group of enthusiasts with unbounded visions of what computers could do. To many, the automating of all mental activities -- and ultimately the automation of mind -- seemed only a few years away. On the other side was the general public, who (in pre-PC times) had a gross ignorance and no small amount of fear about the capabilities of computers. Managers, professionals and others feared for their jobs. IBM researchers were developing programs which could play chess, play checkers and prove geometry theorems, but the effect of this work on marketing was not seen to be favorable. IBM dropped its game-playing & theorem-proving research and adopted the posture that computers cannot steal jobs because they are only moronic devices for number-crunching, data-manipulation and word-processing. IBM's pragmatic marketing position has proven prophetic in light of widespread PC use and what many people believe to be forty years of failed attempts to achieve Artificial Intelligence.

Several attempts have been made to create program "agents" in "Micro Worlds", where simple language could be applied to a limited field of discourse. One of the most ambitious of such projects was a system by Terry Winograd that represented a robot named SHRDLU (short for ETAOIN SHRDLU -- the 12 most common letters in English, in order of frequency). The robot would manipulate colored blocks and pyramids in response to English-like commands, and could give simple explanations when quizzed about its actions (like how to build a steeple). The program consisted of many fragments that acted as independent agents. This approach is more "heuristic" than logical, and it typifies Minsky's approach to AI.

In theory, a "Micro World" could be expanded continuously until the robot's "universe of discourse" overlapped the human universe of discourse -- and the robot was a genuinely sentient being. In practice, scaling-up a "Micro World" has proven exceedingly difficult. Minsky directed such a project at MIT, which grew until so many programs had been written by so many people -- and the system had become so large -- that no one understood it. It was abandoned in 1971. Later, Minsky wrote THE SOCIETY OF MIND in which he attempted to decompose the human mind into a "society" of independent "agents", none of which could be called intelligent. Minsky's thesis is that this is the way that real minds work -- and it is the way that machine minds will be built.

In the early 1970s the language PROLOG (PROgramming in LOGic) became the favored AI language outside the United States (LISP was too firmly established in the US). PROLOG is a "higher-level" language that allows the programmer to specify what is to be done, rather than how, due to PROLOG's mechanisms for manipulating logical statements.

The late 1970s and early 1980s witnessed a huge surge of corporate interest & capital into the area of AI known as "expert systems". These systems consist of a knowledge base of "if-then" rules accumulated from human experts. A car diagnosis system, for example, might contain the rule "If the engine won't turn over, then check the battery strength". And "If the engine won't turn over and the battery strength is high, then check the spark plugs". Expert systems incorporated the value of knowledge as well as logic in problem-solving, and had the promise of being as useful as human experts.
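
As a rough illustration (a sketch in Python, not any actual commercial expert-system shell), a tiny rule base of this kind can be represented as a list of if-then rules applied to a set of facts. The facts and rules below simply echo the hypothetical car-diagnosis example above.

    # A minimal rule engine: apply every rule whose if-part matches the facts.
    # The facts and rules are hypothetical, echoing the car-diagnosis example.
    facts = {"engine_turns_over": False, "battery_strength_high": True}

    # Each rule is a pair: (condition on the facts, conclusion to report)
    rules = [
        (lambda f: not f["engine_turns_over"] and not f["battery_strength_high"],
         "check the battery"),
        (lambda f: not f["engine_turns_over"] and f["battery_strength_high"],
         "check the spark plugs"),
    ]

    def diagnose(facts):
        """Return the conclusions of all rules whose if-parts are satisfied."""
        return [advice for condition, advice in rules if condition(facts)]

    print(diagnose(facts))   # -> ['check the spark plugs']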

The first expert system was DENDRAL, a system for narrowing-down the possible chemical structure of a compound based on formula, spectral information and the encoded wisdom of chemists. DENDRAL proved its success by deducing the structure of Di-n-decyl C20H22 from 11 million possible combinations. But as the system accumulated more knowledge it became too difficult to expand and maintain. MYCIN was a better-designed expert system that more cleanly separated the rules (knowledge-base) from the logic used to apply the rules. MYCIN diagnosed infectious blood diseases on the basis of blood tests, bacterial cultures and other information. Other expert systems were developed which proved to be of benefit for certain specialized applications. But knowledge-engineering -- transferring human expert knowledge to expert computer systems -- proved to be far more difficult than anyone anticipated. The systems required continual revision to avoid obsolescence. And too often they made gross mistakes that no one with "common sense" would make. The expert system "boom" was followed by an expert system "bust".

Computer systems designed to translate from one human language to another have proven exceedingly difficult to build, for the same reason that computer systems built to understand human language have foundered. Word meanings are context-dependent. The phrase "That is a big drop" has a different meaning in an optometrist's office than it has on the edge of a cliff. A sentence like "Still waters run deep" is even more challenging, not simply because it is metaphorical, but because every word in the sentence can (1) have multiple meanings and (2) be used in more than one part-of-speech. Semantics governs syntax, which can make parsing sentences unfathomable for computer programs.

Based on the assumption that common-sense understanding necessarily must be built upon a large knowledge-base, Doug Lenat has begun a $25 million project to create a computer-system with an enormous amount of knowledge. Cyc (short for encyclopedia) collects knowledge in "frames" (collections of facts & rules) in a feeding process that is expected to take two person-centuries and include 100 million frames. Cyc is capable of "meditating" on its knowledge, searching for analogies. Lenat claims that Cyc will eventually be able to educate itself (by reading and having discussions with "tutors") rather than having to be "spoon fed" its knowledge. But McCarthy thinks Cyc has inadequate logic, and Minsky doesn't think Cyc has a wide-enough variety of procedures to do much with its information.

The Artificial Life community has expanded on the "Micro World" concept by creating Micro Worlds in which the artificial creatures experience "pleasure & pain", have needs and must adapt. In these "Micro Worlds" there is not only a universe of discourse, but a universe of values (motives, "feelings"). In S.W. Wilson's Animat system, the creatures adapt by learning what rules "work" and what rules don't. Genetic algorithms provide a basis for "learning".
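
Genetic algorithms themselves are easy to sketch. The toy Python example below (not Wilson's actual Animat code) evolves bit-string "rules" toward a target pattern by selection, crossover and mutation; the target, fitness measure and parameters are all made up for illustration.

    import random

    TARGET = [1, 0, 1, 1, 0, 1, 0, 1]          # a hypothetical "ideal rule"
    fitness = lambda rule: sum(a == b for a, b in zip(rule, TARGET))

    def evolve(pop_size=20, generations=50, mutation_rate=0.05):
        population = [[random.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
        for _ in range(generations):
            # Selection: keep the fitter half of the population
            population.sort(key=fitness, reverse=True)
            parents = population[:pop_size // 2]
            # Crossover and mutation refill the population
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, len(TARGET))
                child = a[:cut] + b[cut:]
                child = [bit ^ (random.random() < mutation_rate) for bit in child]
                children.append(child)
            population = parents + children
        return max(population, key=fitness)

    print(evolve())   # usually converges to the target pattern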

AI "founding father" Allen Newell has designed the SOAR system, a collection of psychological mechanisms based on his book UNIFIED THEORIES OF COGNITION (1990). SOAR uses representations and means-ends analysis for problem-solving. A "problem" is, by definition, a discrepancy between an existing condition and a goal condition. As with the Artificial Life systems, SOAR presumes that an intelligent system must be "motivated" by goal-states -- and the sophisticated heuristic search strategies of means-ends analysis are used to achieve those states. "Learning" occurs in the process.
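
A bare-bones illustration of means-ends analysis (a Python sketch, not the SOAR architecture itself): the "problem" is the gap between a current state and a goal state, and on each step the operator that most reduces that gap is applied. The numeric states and operators are invented purely for illustration.

    # Means-ends analysis in miniature: repeatedly apply whichever operator
    # most reduces the difference between the current state and the goal.
    operators = {"add 10": lambda x: x + 10,
                 "add 1": lambda x: x + 1,
                 "subtract 1": lambda x: x - 1}

    def solve(state, goal):
        plan = []
        while state != goal:
            # Pick the operator whose result is closest to the goal
            name, op = min(operators.items(),
                           key=lambda item: abs(item[1](state) - goal))
            state = op(state)
            plan.append(name)
        return plan

    print(solve(3, 25))   # -> ['add 10', 'add 10', 'add 1', 'add 1']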

(return to contents)

III. NEURAL NETWORKS IN ARTIFICIAL INTELLIGENCE AND IN BRAINS

There has commonly been a disdainful attitude among the Symbolic AI community towards neural networks. Consequently, neural networks have either been referred to as "subsymbolic AI" or dismissed as not being AI at all. Marvin Minsky in personal conversation told me that neural networks end-up in local minima or, if they are effective, operate on too low a level to be of interest to AI. I think Marvin may be in a local minimum in his perception of neural networks.

I believe that to understand how neurophysiology produces mind&self means understanding the operation of neurons, networks of neurons, and how specific areas of the brain contribute to overall function. I believe that for complete understanding there must be no "missing links". And I believe that the study of neural networks on computers ("Connectionism") can contribute to an understanding of how networks of neurons operate in the brain. So I want to expand on the discussion of neural networks that appeared in Part 5 of this series by explaining in greater detail what Connectionism implies about brain operation and ("therefore") mental function.

A large portion of those studying Symbolic AI have backgrounds in "computer science", and this has become increasingly true as the field has matured. By contrast, a large portion of Connectionists have backgrounds in mathematics, statistics and physics. Many Connectionist models are so abstracted from neurophysiology as to seem irrelevant, but I want to explain why this is not always true.

[SYNAPSE CHEMICAL EVENTS]

The key features of neurons, as modeled by neural networks, are the cell bodies (called "neurons", and serving as input/output devices in the models), axons (lines of connection) and synapses (the "weights" on the lines leading to a neuron). When a neuron "fires" it generates an action potential, which travels down the axon and causes a calcium-mediated release of a neurotransmitter chemical from vesicles at the pre-synaptic membrane. These chemicals cause a change in the potential of the post-synaptic membrane which, if great enough, results in the depolarization ("firing") of the post-synaptic neuron.

This model makes it sound as though neuron interaction is strictly "all-or-none". If this were the case, then inputs could only be represented in a boolean manner (as 0 or 1) depending on whether an action potential is received at a particular synapse or not. Similarly, outputs seem boolean insofar as a post-synaptic neuron either "fires" or does not. But, in fact, a presynaptic neuron may send a volley of action potentials down an axon resulting in a much greater release of neurotransmitter at the synapse. And the postsynaptic neuron may fire many times in sequence as a result. So the presynaptic input might not be represented as simply 0 or 1, but also as 2, 3, 4 or some number in between. And the same can be said of the output of the postsynaptic neuron. The strength of the synapse can also vary, depending on the size of the synapse and the number of neurotransmitter vesicles it contains. This strength, and the input itself, can be construed as positive or negative because the neurotransmitter could be stimulatory or inhibitory -- as can be the neurotransmitter released by the output neuron. Thus, inputs, weights and outputs for real neurons can be interpreted as having a range of positive and negative values, not simply boolean values.

[McCULLOCH-PITTS NEURON]

The first neuron model for neural networks was the boolean neuron of McCulloch and Pitts. As depicted, this neuron, with a threshold (Ø) of 1, can act as an inclusive-or logic gate. With weights assumed to be 1, inputs from either or both axon a and axon b will cause the neuron to fire and produce an output of 1. Only if both axons are quiet (0) will the neuron not fire.
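
In code, the depicted neuron is just a threshold applied to a weighted sum. A minimal Python sketch, assuming (as in the figure) weights of 1 and a threshold of 1:

    def mcculloch_pitts(inputs, weights, threshold):
        """Fire (output 1) if the weighted sum of boolean inputs reaches the threshold."""
        return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

    # Inclusive-or: weights of 1 and a threshold of 1, as in the figure
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, mcculloch_pitts((a, b), (1, 1), threshold=1))
    # Fires (1) for every combination except a=0, b=0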

It is often said that we only use 5-10% of our brain's capacity, but this homily is probably unfounded. The nervous system makes up 1-2% of the body's weight, and yet consumes 25% of the body's energy. A fair amount of energy is required to maintain neuron membrane potentials, even when neurons are inactive. Therefore, it is unlikely that much metabolic energy is wasted supporting a fictitious 90-95% of unused brain potential. Neurons must earn their keep and, in fact, massive cell death of unused neurons is a well-known feature of the development of the nervous system. Even "forgetting" is important, because it would hamper brain function to retain and have to sift-through vast amounts of useless information.

In light of these considerations, the synaptic learning mechanism postulated by psychologist Donald Hebb makes some sense. Hebb claimed that a synapse is strengthened when there is simultaneous activity of a presynaptic neuron and the post-synaptic neuron to which it connects (and is releasing neurotransmitter). Such a synapse is called "Hebbian", and the mechanism is called Hebbian learning. This makes intuitive sense, insofar as a synapse that causes a postsynaptic neuron to fire is an effective, and therefore useful synapse. Conversely, a synapse where action potentials repeatedly arrive without any result on the postsynaptic cell is a waste of energy, and would best be "forgotten".
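
In its simplest form, Hebb's rule says that a synaptic weight grows in proportion to the product of presynaptic and postsynaptic activity. A one-synapse sketch in Python (the learning rate and activity values are arbitrary illustrative choices):

    def hebbian_update(weight, pre_activity, post_activity, learning_rate=0.5):
        """Strengthen the synapse in proportion to correlated pre- and postsynaptic activity."""
        return weight + learning_rate * pre_activity * post_activity

    w = 0.0
    w = hebbian_update(w, pre_activity=1.0, post_activity=1.0)   # both neurons active: w grows to 0.5
    w = hebbian_update(w, pre_activity=1.0, post_activity=0.0)   # no postsynaptic firing: w unchanged
    print(w)   # -> 0.5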

[TWO-LAYER NEURAL NETWORK]

A single-layer feedforward neural network is called a linear associative memory. The network associates an output pattern (vector) with an input pattern (vector). Each connection has a weight (synaptic strength). A network with 4 inputs and 4 outputs can represent the weights as a 4 X 4 matrix. For example:


             0.5  -0.5   0.5  -0.5
             0.5  -0.5   0.5  -0.5
            -0.5   0.5  -0.5   0.5
            -0.5   0.5  -0.5   0.5

where each row corresponds to the synaptic strengths of all the connections from all the input neurons to a single output neuron. To determine the output of a neuron when a pattern of input strengths appears, one must multiply each input strength by the weight (connection strength) of each connection leading to that output neuron -- and sum the results. For example, the net input to (and output from) the first neuron with input pattern (0.5,-0.5,0.5,-0.5) would be (0.5 X 0.5) + (-0.5 X -0.5) + (0.5 X 0.5) + (-0.5 X -0.5) = 0.25+0.25+0.25+0.25 = 1. Thus, the pattern on the output layer would be (1,1,-1,-1). Mathematically, each output is the inner product of a row of the weight matrix with the input pattern. But the weight matrix given is not arbitrary; it is the outer product of the output pattern and the input pattern. The formation of this weight matrix from the outer product of the output vector and the input vector is called an example of Hebbian learning, because each synaptic strength (weight) formed is the product of the activity of an input neuron and an output neuron. The magnitude of the output values can be doubled, tripled or quadrupled by adding the weight matrix to itself once, twice or three times.

The outer product of two vectors multiplies every element of one vector by every element of the other, producing a matrix rather than the single summed number produced by the inner product. For example, the outer product of the vectors (1,2,3) and (2,3,1) is:

                      2 3 1
                      4 6 2
                      6 9 3
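
Both the Hebbian weight matrix and the outer-product example above can be reproduced in a few lines of NumPy -- a sketch using the same numbers as in the text:

    import numpy as np

    input_pattern  = np.array([0.5, -0.5, 0.5, -0.5])
    output_pattern = np.array([1.0, 1.0, -1.0, -1.0])

    # Hebbian learning: the weight matrix is the outer product of output and input
    weights = np.outer(output_pattern, input_pattern)
    print(weights)                      # the 4 X 4 matrix shown earlier

    # Recall: each output is the inner product of a weight row with the input
    print(weights @ input_pattern)      # -> [ 1.  1. -1. -1.]

    # The outer-product example from the text
    print(np.outer([1, 2, 3], [2, 3, 1]))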

[GEOMETRIC VECTOR INNER PRODUCT]

The inner product operation of mathematics is the key calculation done by a neural network -- specifically, the inner product of a vector of weights with a vector of inputs. Imagine a neuron with two inputs represented by the vector (3,1) and a weight vector (2,3). The net input to (and output from) the neuron will be (3X2)+(1X3)=9. The geometric interpretation leads to the conclusion that, for vectors of unit length, the inner product will be largest for vectors that point most closely in the same direction in vector space.

In a biological neural network this makes intuitive sense: if the inputs to a neuron are strongest on those axons with the greatest synaptic strength, then the net input to that neuron will be the strongest. This would likely lead to a high neuron output (high firing frequency, or a firing frequency of long duration) compared to other neurons with weaker inputs or synaptic strengths. This is the key to learning in competitive ("winner-take-all") networks. The output neuron with a pattern of weights most resembling the input pattern will fire most strongly -- and this neuron is made to inhibit the firing of neurons around it ("lateral inhibition"). Synaptic strengths of the winning neuron are strengthened in a Hebbian manner -- resulting in the winning neuron becoming the "recognizer" of the specific input pattern. Many of those studying neural networks to gain insight into brain function claim that competitive networks are more biologically realistic than any other neural network model. In fact, competitive learning seems to be precisely what is occurring when the ability to make visual discriminations is learned during development (see CCN 23, page 23).
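
A minimal sketch of competitive ("winner-take-all") learning in NumPy -- not a model of any particular brain circuit; the initial weights and learning rate are arbitrary illustrative choices. The winning output neuron is the one whose weight vector best matches the input, and only its weights are nudged toward that input.

    import numpy as np

    # Initial weights chosen (arbitrarily) so that each output neuron starts
    # slightly closer to one of the two input patterns used below.
    weights = np.array([[0.6, 0.5, 0.2],
                        [0.2, 0.5, 0.6]])
    weights /= np.linalg.norm(weights, axis=1, keepdims=True)

    def train(pattern, learning_rate=0.2):
        """Winner-take-all: only the best-matching output neuron learns."""
        winner = np.argmax(weights @ pattern)                 # largest inner product wins
        weights[winner] += learning_rate * (pattern - weights[winner])
        weights[winner] /= np.linalg.norm(weights[winner])    # keep the weight vector at unit length
        return winner

    patterns = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])]
    for _ in range(20):
        for p in patterns:
            train(p)
    print(weights.round(2))   # each output neuron has become a "recognizer" of one pattern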

[THREE-LAYER NEURAL NETWORK]

By contrast, the backpropagation algorithm (used in 80% of commercial neural network applications) is regarded as being biologically unrealistic. A backpropagation neural network is a multi-layered feedforward network with at least one layer of "hidden" neurons between the input and the output -- which allows for the recognition of non-linear associations between input and output. A matrix of synaptic weights corresponds to the connections between each pair of adjacent layers. An input pattern propagates forward through the network producing an output pattern. The output pattern is compared with the "target pattern" and the difference between the output and target is calculated for each output neuron. This "error" is then propagated backwards, adjusting first the weights leading to the output layer and then the weights leading to the hidden layer. Because a target pattern is compared to the output pattern, the learning this network performs is called supervised learning -- in contrast to the unsupervised learning of competitive networks.
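
A minimal backpropagation sketch in NumPy, with one hidden layer of sigmoid units learning the XOR function (a classic nonlinear mapping). The task, layer sizes, learning rate and number of training epochs are arbitrary illustrative choices, not any production recipe.

    import numpy as np

    rng = np.random.default_rng(1)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    # Input patterns and target patterns (XOR). A constant 1 is appended to
    # each input pattern to serve as a bias ("threshold") input.
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(scale=0.5, size=(3, 4))   # input layer -> hidden layer
    W2 = rng.normal(scale=0.5, size=(5, 1))   # hidden layer (+ bias unit) -> output layer
    learning_rate = 0.5

    for epoch in range(10000):
        # Forward pass: the input pattern propagates through the network
        hidden = sigmoid(X @ W1)
        hidden_b = np.hstack([hidden, np.ones((4, 1))])     # bias unit for the hidden layer
        output = sigmoid(hidden_b @ W2)

        # The output error is propagated backwards, adjusting first W2, then W1
        output_delta = (T - output) * output * (1 - output)
        hidden_delta = (output_delta @ W2[:4].T) * hidden * (1 - hidden)
        W2 += learning_rate * hidden_b.T @ output_delta
        W1 += learning_rate * X.T @ hidden_delta

    print(output.round(2))   # usually close to the target pattern (0, 1, 1, 0)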

[BACKPROPAGATION DYNAMICS]

Backpropagation has been called biologically unrealistic because: (1) the neurons it uses have graded responses rather than "all-or-nothing" thresholds, (2) the backpropagation of the error is not seen to correspond to the operation of real neurons and (3) backpropagation uses an "external supervisor" to provide the target pattern. The ability of neurons to give graded responses by strength and duration of firing frequency refutes point (1). Point (2) could be explained either by a Hebb-like mechanism or by an interpretation whereby neuron groups, rather than individual neurons, are actually the functional units of biological neural networks. In fact, almost every group of fibers leading from one nucleus or area of the brain to another nucleus or area is associated with a group of fibers going in the opposite direction. As for point (3), target patterns could be provided by other networks -- as when we compare the results of our actions with our desired results. Ultimately, our standards of evaluation probably have a genetic basis. To quote the psychologist William James: "Dangerous things fill us with involuntary fear; poisonous things with distaste; indispensable things with appetite. Mind and world, in short, have been evolved together, and in consequence are something of a mutual fit."

[3-D MULTIVARIATE STATISTICS]

To mention briefly the interest neural networks have aroused among statisticians, imagine that an input vector of (10,30,61) corresponds to the age, height in inches, and weight in pounds of a child. Finding the relationship between these inputs and some set of outputs (incidence of disease, athletic performance, etc.) for a large number of children is of interest in multivariate statistical analysis. A covariance matrix between inputs and outputs (an outer product) is a statistical expression of Hebb's postulate. The principal components (eigenvectors) of the vector space defined by this matrix are of interest because they indicate the combinations of variables with the greatest association between input and output. Backpropagation can perform a nonlinear, nonparametric analogue of principal component analysis -- something beyond the capacity of conventional mathematics when a large number of variables are involved.
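
A sketch of the linear case in NumPy, using fabricated data for age, height and weight: build a covariance matrix of the measured variables and extract its principal components (eigenvectors). The data-generating formulas are invented purely for illustration.

    import numpy as np

    rng = np.random.default_rng(2)

    # Made-up data: age, height and weight for 200 hypothetical children,
    # with height and weight strongly tied to age.
    age    = rng.uniform(5, 15, 200)
    height = 30 + 2.3 * age + rng.normal(0, 2, 200)
    weight = 10 + 5.0 * age + rng.normal(0, 6, 200)
    data = np.column_stack([age, height, weight])

    cov = np.cov(data, rowvar=False)               # covariance matrix of the variables
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # The principal component (eigenvector with the largest eigenvalue) shows
    # which combination of variables accounts for most of the variation.
    print(eigenvectors[:, np.argmax(eigenvalues)])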

Unlike conventional computer programs, neural networks are "trained", not programmed to respond the way they do. Repeated presentations of the patterns to be recognized are made -- with weight (synapse) adjustments made each time -- until the network "settles" with weights that match the inputs with the outputs. The "hidden layer" of neurons between the input neurons and output neurons allows for very complex (nonlinear) mappings between inputs and outputs. For example, input patterns could consist of handwritten letters and output patterns could be ASCII codes. The network learns by "induction", rather than by being explicitly "told". In recognizing handwritten letters, a neural network demonstrates the ability to "generalize" by creating an internal "prototype" representation of each letter -- distributed through the weights. Understanding the relationship between the weights and the patterns, however, can be very difficult.

[3-D SURFACE]

The 3-dimensional figure shown represents two neurons in an input layer (the two horizontal axes) and a single output (the vertical axis) generated by a complex function of neurons in the hidden layer. More often, however, the vertical axis represents the "error" between the desired function and the functional result produced by the network. The network reduces the error by adjusting the weights -- and this process is analogous to a "ball" rolling into one of the points of minimal height. Backpropagation has been criticized because of the presumed difficulty of reaching the lowest possible point -- the risk of being stuck in a local minimum. In practice, however, the problem of local minima has not proven to be so great. With many neurons -- a weight space of many dimensions -- the "ball" may reach a local minimum in several dimensions, but still find a dimension in which it can continue to roll downward.
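
The "rolling ball" is just gradient descent on the error surface. A one-dimensional Python sketch, using an error function invented to have both a shallow local minimum and a deeper global one:

    # A made-up one-dimensional error surface with two valleys: a shallow
    # local minimum near w = +1 and a deeper (global) minimum near w = -1.
    error    = lambda w: (w**2 - 1)**2 + 0.3 * w
    gradient = lambda w: 4 * w**3 - 4 * w + 0.3      # slope of the error surface

    def descend(w, learning_rate=0.02, steps=1000):
        """Roll the 'ball' downhill by repeatedly stepping against the slope."""
        for _ in range(steps):
            w -= learning_rate * gradient(w)
        return w

    print(descend(0.5))    # starts in the shallow valley and settles near w = +1 (local minimum)
    print(descend(-0.5))   # settles in the deeper valley near w = -1 (global minimum)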

A neural network can learn to recognize handwritten letters very well, even if the final "synapse" weights are not the best possible. Any claim that human intelligence is capable of perfect knowledge ("global minima") and of avoiding erroneous generalizations is dubious. In fact, many of the errors seen in artificial neural networks resemble those commonly made by children. For example, a network learning to conjugate verbs may use the word "goed" rather than "went". Nature may assist in the process by genetically providing us with initial synaptic weights that are more likely to converge rapidly to an optimal solution, and avoid a high local minimum.

A neural network approach to AI has been called a "bottom-up" approach -- in contrast to the "top-down" approach of symbolic AI, in which observed mental functions are broken into parts and codified into executable computer programs. Neural networks rely on the "emergence" of "cognition" through the interaction of large numbers of simple components in unpredictable and complex ways. Often it is not possible to explain why a neural network produces the result it does. Symbolic AI proponents can be extremely critical of the supposition that intelligence can emerge from a Connectionist scheme. Minsky has said that such an approach goes "against the grain of analytic rigor". Poggio has used even stronger language, equating the idea of spontaneous emergence of intelligence from a neural network with reliance on "magic".

Nonetheless, neural networks seem to convincingly mimic both brain action and brain function. Neural networks, like brains, are very adept at pattern recognition -- something which has proven to be a very tough problem for Symbolic AI and conventional computational methods. Certain brain areas -- like the visual cortex -- go through a stage of high plasticity (like a training network) in the learning of visual discrimination, followed by a stage of lesser flexibility once basic patterns are learned. Some brain-networks are undoubtedly adaptable only within hard-wired constraints -- as with those associated with motor control. Too much adaptability can mean instability and even loss of information. Insofar as we imagine our identity to be stable, it is an intriguing question to ask how closely our identity is associated with the plastic, adaptable portions of our brains.

(return to contents)

IV. ARTIFICIAL INTELLIGENCE AND HUMAN MINDS

In 1950 Alan Turing wrote an article for the journal MIND in which he suggested a test wherein an interrogator would submit teletyped questions to both a machine and a person -- and receive teletyped answers from both. Turing claimed that if the interrogator could not correctly distinguish between the person and the machine, the machine deserved to be called intelligent. Turing believed that by the year 2000 an average interrogator, after five minutes of questioning, would have no more than a 70% chance of correctly identifying the human.

Despite the flaws in Turing's approach, the "Turing Test" became a focal point of discussions about machine intelligence. It avoids the necessity of defining intelligence, and relies on appearances to an "average" interrogator -- under a time limit. And it places a great emphasis on a machine being able to resemble a human.

In a conscious effort to expose the irrelevance of deceptive appearances, Joseph Weizenbaum wrote a program in the mid-1960s called ELIZA, which simulates a nondirective psychotherapist. If a user typed in "My mother is afraid of cats", the program might respond with the phrase "TELL ME MORE ABOUT YOUR FAMILY". To the phrase "He works with computers" it might respond "DO MACHINES FRIGHTEN YOU?". If the user entered a sentence that triggered no pre-defined answer, it might respond with "GO ON" or "EARLIER YOU TALKED OF YOUR MOTHER". Ironically, Weizenbaum made every effort to make ELIZA convincing, but was irritated to learn that there were people who seriously interacted with ELIZA by confiding their problems. Weizenbaum believed that human intuition & wisdom are not "machinable", and he even objected to the goals of AI on moral grounds.
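
The trick behind this kind of program is simple keyword matching and canned responses. A toy Python sketch in that spirit (the keywords and replies are invented for illustration, not Weizenbaum's actual script):

    import random

    # Hypothetical keyword -> response script, in the spirit of ELIZA
    script = {
        "mother":   "TELL ME MORE ABOUT YOUR FAMILY",
        "computer": "DO MACHINES FRIGHTEN YOU?",
        "afraid":   "WHAT ELSE ARE YOU AFRAID OF?",
    }
    fallbacks = ["GO ON", "WHY DO YOU SAY THAT?"]

    def respond(sentence):
        """Return a canned reply for the first matching keyword, else a generic prompt."""
        for keyword, reply in script.items():
            if keyword in sentence.lower():
                return reply
        return random.choice(fallbacks)

    print(respond("My mother is afraid of cats"))   # -> TELL ME MORE ABOUT YOUR FAMILY
    print(respond("It rained today"))               # -> a generic fallback reply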

In 1980 philosopher John Searle wrote a paper describing a thought experiment of a person passing the Turing Test in Chinese. In this scenario, an English-speaking person who is ignorant of Chinese would sit in a room in which he/she would receive messages written in Chinese. Detailed scripts would describe what responses to provide. Even though the person in the "Chinese Room" might convince the Chinese interrogator that someone in the room understood Chinese, all that occurred in reality was symbol manipulation. Searle claimed that this is all computers ever do or can do -- manipulate symbols without any real understanding of what those symbols mean.

Searle believes that computers will eventually be able to perform every intellectual feat of which humans are capable -- and pass every Turing Test with ease -- yet still be lacking in subjective consciousness. He believes that 2 different "systems", one conscious and the other unconscious, could produce identical behavior. Searle's position has been called weak AI, and it is contrasted with strong AI: the idea that intelligent machines will eventually possess consciousness, self-awareness and emotion. (Roger Penrose denies the possibility of both strong AI and weak AI.) Searle characterizes the "strong AI" position as believing that "the mind" is "just a computer program". Others have characterized the difference between "weak AI" and "strong AI" as the difference between a "third-person" approach to consciousness and a "first-person" approach.

Before computers, many people undoubtedly believed that arithmetic operations or the symbolic manipulations of integral calculus required intelligence. Some people still worry about the fact that computer chess-playing machines may soon be able to defeat all human opponents. But this capability rests on nothing more than the power of hardware to exhaustively search the consequences of millions of board positions in a short time. Is the ability of a computer to defeat any human opponent at chess any more a sign of intelligence than the ability to calculate the square root of 2 to hundreds of decimal places in less than a millisecond?

The Church-Turing Thesis essentially asserts that any algorithm can be implemented on a computer. If we can introspectively reduce any intellectual feat to a step-by-step procedure (an algorithm), then a computer can implement it. Yet computers have had the most difficulty with things that humans do without conscious thought, such as recognizing faces. Much problem-solving actually involves pattern-recognition -- intuitive associations of problems with similar problems. The difficulty experts have in reducing this ability to algorithms has been one of the obstacles to implementing expert systems. Yet neural networks may eventually be capable of handling some of these tasks.

Much of what passes for creativity is simply pseudo-random association. Douglas Hofstadter wrote a program that uses Recursive Transition Network grammar and words randomly selected from word-categories to produce surrealistic prose: "A male pencil who must laugh clumsily would quack". My program produces phrases like: "Your eye intrigues upon delight" (my word-lists are different). A careful analysis of so-called creative thought often reveals procedures like "find extreme examples" or "invert the situation". The presumption that computers "can only do what you tell them to do" seems contradicted by the fact that game-playing programs frequently can defeat their creators.
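
A sketch of this kind of pseudo-random "creativity" in Python -- the template and word lists below are invented, not Hofstadter's grammar or my own original program:

    import random

    # Invented word lists and a tiny phrase template
    words = {
        "ADJ":  ["male", "silent", "hollow"],
        "NOUN": ["pencil", "eye", "delight"],
        "VERB": ["quack", "intrigue", "laugh"],
        "ADV":  ["clumsily", "softly", "darkly"],
    }
    template = ["A", "ADJ", "NOUN", "must", "VERB", "ADV"]

    def babble():
        """Fill each slot of the template with a randomly chosen word from its category."""
        return " ".join(random.choice(words[slot]) if slot in words else slot
                        for slot in template)

    print(babble())   # e.g. "A hollow pencil must intrigue softly"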

Deciding whether a computer program possesses intelligence should require a clear definition of the term "intelligence" and a means to measure it. An intelligent system should be able to form internal representations of the external world, be able to learn and be able to reason ("think") about the world and itself. So-called "IQ tests" exist for human beings, but people are reluctant to apply the same standards to machines. Descriptions of what people would call intelligence might include "common sense" plus a general ability to analyze situations and to solve problems. Often humans are called intelligent even though their abilities exist in specialized areas -- and they are not seen as having "common sense". But the same allowances are not made for computers. Within humanity there is a vast range of intellectual abilities between infants&adults -- or between the genius and the mentally-retarded or brain-damaged. I suspect that even so-called "third-person" criteria for intelligence covertly demand signs of subjective consciousness.

What are the signs of consciousness? Searle argued that his Chinese Room example demonstrated that it is possible to effectively manipulate symbols without understanding their meaning. Do awareness, understanding and meaningfulness require subjective consciousness? Does a spider know how a web is created? Does a beaver consciously build a dam? A frog may be aware of its surroundings, but is it conscious or intelligent? Self-awareness in the presence of a mirror has been demonstrated by chimpanzees, but not by baboons. Yet other animals may possess some form of self-awareness insofar as carnivores do not attempt to eat themselves (although this may be just a response to pain). Using self-awareness as a criterion for intelligence may not be more defensible than "common sense". Minsky has defined self-awareness as the ability to monitor one's own processes. He believes that computers could easily monitor their own processes and explain their actions more effectively than human beings.

Pattern-recognition, purpose and emotion are said to be attributes of intelligence or consciousness which cannot be reduced to algorithms. Nonetheless, neural networks can achieve pattern recognition. And computers can have a purpose or goal -- to win a game of chess, for example. It might be objected that the goals of computers are only those that have been programmed -- but are not the goals of humans & animals those which have been programmed into their genes (pleasures, pains, fears, etc.)? Certain people with neurological deficits have been incapable of feeling pain. Must this reduce their consciousness or intelligence?

And what is emotion? Hans Moravec says that emotions are just drives for channeling behavior -- they focus energy around goals. He imagines robots that would get a "thrill" out of pleasing their owners -- and exhibit "love". The robots could become upset when their batteries run low and would plead with their owners (or express anger) for a recharge. Does emotion require biological organs? Are other forms of feeling possible with other hardware, perhaps even silicon?

The undue emphasis on "humanness" of the Turing Test is a severe limitation as a criterion for either intelligence or consciousness. Dolphins have a larger cerebral cortex than human beings, and appear to communicate with each other. If they do possess intelligence, or consciousness, it is alien to that of a human. Aside from the fact that they don't manipulate their environment with tools, their perceptive apparatus and natural environment are somewhat different: they place more emphasis on sound (echolocation) and less on vision. Bats rely on echolocation even more than dolphins, and a bat with an expanded brain would doubtless have a very different understanding of the world from a human. An even more extreme example would be an organism with an expanded brain that relies primarily on taste&smell to perceive the world. But the alienness of machine intelligence, consciousness & feeling could be even more extreme. Airplanes are not built to flap their wings, but they fly more effectively than birds. The intelligence, consciousness & feelings of machines might be too remote from our own experience for us to identify with them.

If it is true that a computer can only manipulate symbols without any real understanding, how do human beings achieve understanding? Books are full of knowledge, but are not conscious or intelligent. Computers don't simply manipulate information, they do so in a goal-directed, problem-solving manner. If knowledge is used to solve problems, isn't that an indication of knowledge being meaningful? Symbols can only be meaningfully manipulated if they are meaningfully representative of something real. Neural networks are said to recognize patterns because they give distinctive outputs corresponding to distinctive input patterns. A computer that does integral calculus gives a distinct solution for a distinct input equation. Can an operational definition be given to "understanding", or is it to be defined only relative to a subjective consciousness? What "magic" is performed by a biological brain that cannot -- in theory -- be performed by a computer? Why should it be necessary to build machines from organic materials to achieve consciousness?

Symbolic AI adherents have been harshly critical of the idea that intelligence can "emerge" from a neural network. But the Cyc project's attempt to accumulate a large enough knowledge base for a computer to acquire "common sense" seems itself like it is based on "emergence". A dictionary in itself is meaningless to a person because all of the words are defined in terms of other words. A human being associates words with objects in the environment that themselves acquire meaning through their ability to induce an emotional/physiological response (like pleasure or pain). The meaning of a kiss involves both the physical features of the kiss and emotional reactions associated with who is kissing and being kissed. Ultimately, meaning seems like another word for "qualia" -- but there is also the question of context. Words have meaning not only by virtue of their association with objects, but by association with other words -- and some words function only to link words.

Is "meaning" an emergent phenomenon -- the product of a "critical mass" of knowledge (or of something else)? If emergence occurs, there can be degrees of meaning, both for individual words and for the entire context of understanding -- just as the phenomenon of heat "emerges" from molecular motion, or a traffic jam emerges from an increasing number of vehicles.

Marvin Minsky believes that a mind can be created only by analyzing "mind" into its component parts ("agents"), none of which can be intelligent themselves. To quote Minsky: "Unless we can explain the mind in terms of things that have no thoughts or feelings of their own, we'll only have gone around in a circle." But Minsky's conception of the mind as a "society" of unintelligent agents still sounds suspiciously like an "emergent" phenomenon. I say this to emphasize the "magicalness", although I don't deny that emergent phenomena exist. Connectionists seem to assume that intelligence or consciousness will emerge from a machine that simulates brain processes well-enough (reverse engineering), while Symbolists seem to assume that a machine will become conscious or intelligent if it is given enough knowledge or problem-solving capacity. If a mind cannot be created from increasingly better simulations of mind or brain, why can't it?

In the case of subjective consciousness there seems to be the serious problem of objective proof. The only subjective reality we can know about directly is our own. We conclude that other people have feelings because we can identify with their behavior. A computer need only be a good actor to pass the Turing Test -- but then, perhaps, the same can be said of other people. Why should it necessarily be true that all humans have subjective consciousness? There is simply no way to objectively prove that a machine, an animal or even another human being has subjective consciousness. The question is unfalsifiable and therefore does not even qualify as a scientific question.

More seriously, how could an Uploader feel confident that a change from wetware to hardware would not involve a loss of subjective consciousness? If it is not possible to distinguish a system having consciousness from one that does not when the behavior of the two systems is identical, how could an Uploader be confident of not losing "self" in the process? A hardware environment may appear to provide a cozy new home for the self -- but after the Uploading, all that might be left of the person would be a simulation of the person. Even for those who do not choose to do a full upload, there could be increasing pressures to augment one's intelligence with computer-chip add-ons to our biological brains. Those who do not do this will be "left behind" by those who do. Would these add-ons augment the "self", or is there a danger of a loss of identity/subjective consciousness? Could the testimony of others be trusted?

Most seriously, how can cryonicists be sure that reconstruction by nanotechnology -- or even reconstruction from a perfectly vitrified state -- would not result in a simulation of ourselves (or even a very close clone) rather than a reestablishment of our subjective selves? All of our carefully-kept introspective diaries may be little more than additional information on how to produce a more convincing simulation.

But I will not end on such a cynical note. I do not believe in "magic". I believe that subjective consciousness is entirely the product of the material brain, and that to perfectly reconstruct that brain can only result in a perfect reconstruction of subjective consciousness. Even if there can only be subjective verification of subjective states, those states can only be the product of objective material and objective processes. We are, individually (though not "scientifically"), in a position to observe both objective phenomena and subjective phenomena -- and to correlate the two. Even now we can correlate PET scans with subjective processes. We may eventually learn to correlate our subjective experience with the anatomical basis of mind -- and this could be the key to our survival.

(return to contents)

V. FURTHER INFORMATION FROM THE WEB

For more on Artificial Intelligence see AI Topics, from the American Association for Artificial Intelligence (AAAI).


[GO TO BEN BEST'S HOME PAGE]