The Turing test is a test of a machine’s ability to exhibit intelligent behaviour. In the original illustrative example, a human judge engages in a natural language conversation with a human and a machine designed to generate performance indistinguishable from that of a human being. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test. The test does not check the ability to give the correct answer; it checks how closely the answer resembles typical human answers. The conversation is limited to a text-only channel such as a computer keyboard and screen so that the result is not dependent on the machine’s ability to render words into audio.
The test was introduced by Alan Turing in his 1950 paper “Computing Machinery and Intelligence,” which opens with the words: “I propose to consider the question, ‘Can machines think?’” Since “thinking” is difficult to define, Turing chooses to “replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.”
Turing’s new question is: “Are there imaginable digital computers which would do well in the imitation game?” This question, Turing believed, is one that can actually be answered. In the remainder of the paper, he argued against all the major objections to the proposition that “machines can think”.
In the years since 1950, the test has proven to be both highly influential and widely criticized, and it is an essential concept in the philosophy of artificial intelligence.
Saul Traiger argues that there are at least three primary versions of the Turing test, two of which are offered in “Computing Machinery and Intelligence” and one that he describes as the “Standard Interpretation.” While there is some debate regarding whether the “Standard Interpretation” is that described by Turing or, instead, based on a misreading of his paper, these three versions are not regarded as equivalent, and their strengths and weaknesses are distinct.
Huma Shah points out that Turing himself was concerned with whether a machine could think and was providing a simple method to examine this: through human-machine question-answer sessions. Shah argues that Turing described a single imitation game that can be put into practice in two different ways: (a) a one-to-one interrogator-machine test, and (b) a simultaneous comparison of a machine with a human, both questioned in parallel by an interrogator. Since the Turing test is a test of indistinguishability in performance capacity, the verbal version generalizes naturally to all of human performance capacity, verbal as well as nonverbal (robotic).
Turing’s original game was a simple party game involving three players. Player A is a man, player B is a woman and player C (who plays the role of the interrogator) is of either sex. In the Imitation Game, player C is unable to see either player A or player B, and can communicate with them only through written notes. By asking questions of player A and player B, player C tries to determine which of the two is the man and which is the woman. Player A’s role is to trick the interrogator into making the wrong decision, while player B attempts to assist the interrogator in making the right one.
Sterrett referred to this as the “Original Imitation Game Test”. Turing proposed that the role of player A be filled by a computer so that its task was to pretend to be a woman and attempt to trick the interrogator into making an incorrect evaluation. The success of the computer was determined by comparing the outcome of the game when player A is a computer against the outcome when player A is a man. Turing stated that if “the interrogator decide[s] wrongly as often when the game is played [with the computer] as he does when the game is played between a man and a woman”,[19] it may be argued that the computer is intelligent.
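The comparative criterion described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the record format (`RoundResult`), the function names, and the sample rounds are all hypothetical and do not come from Turing’s paper, which gives no procedure for tallying results. The code simply computes how often the interrogator decides wrongly when player A is a man and when player A is a computer, so that the two frequencies can be set side by side.

```python
from dataclasses import dataclass


@dataclass
class RoundResult:
    """One round of the imitation game (hypothetical record format)."""
    player_a_is_computer: bool  # True if a computer played A, False if a man did
    interrogator_wrong: bool    # True if the interrogator misidentified A and B


def wrong_rate(rounds, a_is_computer):
    """Fraction of rounds in one condition that the interrogator got wrong."""
    selected = [r for r in rounds if r.player_a_is_computer == a_is_computer]
    if not selected:
        raise ValueError("no rounds recorded for this condition")
    return sum(r.interrogator_wrong for r in selected) / len(selected)


def compare_conditions(rounds):
    """Return (wrong rate with a man as A, wrong rate with a computer as A)."""
    return wrong_rate(rounds, False), wrong_rate(rounds, True)


# Made-up rounds for illustration only; these are not experimental data.
rounds = [
    RoundResult(False, True), RoundResult(False, False),
    RoundResult(False, False), RoundResult(False, True),
    RoundResult(True, True), RoundResult(True, False),
    RoundResult(True, True), RoundResult(True, False),
]

man_rate, computer_rate = compare_conditions(rounds)
print(man_rate, computer_rate)  # 0.5 0.5
```

In this made-up sample the interrogator is wrong equally often in both conditions, which is the situation Turing’s wording describes. How close the two frequencies must be, and over how many rounds, is not specified in the passage and is left open here.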
The second version appeared later in Turing’s 1950 paper. Similar to the Original Imitation Game Test, the role of player A is performed by a computer. However, the role of player B is performed by a man rather than a woman.
“Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?”
In this version, both player A (the computer) and player B are trying to trick the interrogator into making an incorrect decision.
Common understanding has it that the purpose of the Turing Test is not specifically to determine whether a computer is able to fool an interrogator into believing that it is a human, but rather whether a computer could imitate a human. While there is some dispute whether this interpretation was intended by Turing (Sterrett believes that it was, and thus conflates the second version with this one, while others, such as Traiger, do not), this has nevertheless led to what can be viewed as the “standard interpretation.” In this version, player A is a computer and player B a person of either sex. The role of the interrogator is not to determine which is male and which is female, but which is a computer and which is a human. The fundamental issue with the standard interpretation is that the interrogator cannot differentiate which responder is human and which is a machine. There are open questions about how long the test should last, but the standard interpretation generally treats the duration as a limit that simply needs to be reasonable.
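The standard interpretation turns on whether the interrogator can reliably tell the machine from the human, but the text does not say what “reliably” means. One possible reading, offered here purely as an assumption, is to ask whether the interrogator’s identifications are better than coin-flipping. The sketch below computes an exact one-sided binomial probability; the function names and the 0.05 threshold are illustrative choices, not part of any formulation of the test.

```python
from math import comb


def p_value_at_least(correct, trials, chance=0.5):
    """Probability of at least `correct` right answers in `trials` pure guesses."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def reliably_distinguishes(correct, trials, alpha=0.05):
    """True if the interrogator's score is unlikely to arise from guessing.

    `alpha` is an assumed significance threshold, not something the test defines.
    """
    return p_value_at_least(correct, trials) < alpha


print(reliably_distinguishes(9, 10))  # True
print(reliably_distinguishes(6, 10))  # False
```

Under this reading, a machine would pass when the interrogator’s identifications are not distinguishable from guessing, as in the second call. Other thresholds, or a different statistical approach, would be equally defensible.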
Controversy has arisen over which of the alternative formulations of the test Turing intended. Sterrett argues that two distinct tests can be extracted from his 1950 paper and that, pace Turing’s remark, they are not equivalent. The test that employs the party game and compares frequencies of success is referred to as the “Original Imitation Game Test,” whereas the test consisting of a human judge conversing with a human and a machine is referred to as the “Standard Turing Test.” Note that Sterrett equates the latter with the “standard interpretation” rather than with the second version of the imitation game. Sterrett agrees that the Standard Turing Test (STT) has the problems that its critics cite but feels that, in contrast, the Original Imitation Game Test (OIG Test) so defined is immune to many of them, due to a crucial difference: unlike the STT, it does not make similarity to human performance the criterion, even though it employs human performance in setting a criterion for machine intelligence. A man can fail the OIG Test, but it is argued that it is a virtue of a test of intelligence that failure indicates a lack of resourcefulness: the OIG Test requires the resourcefulness associated with intelligence and not merely “simulation of human conversational behaviour.” The general structure of the OIG Test could even be used with non-verbal versions of imitation games.
Still other writers have interpreted Turing as proposing that the imitation game itself is the test. They do not specify how to take into account Turing’s statement that the test he proposed using the party version of the imitation game is based on a criterion of comparative frequency of success in that game, rather than on a capacity to succeed at one round of it.
Saygin has suggested that the original game may be a way of proposing a less biased experimental design, as it hides the participation of the computer. The imitation game also includes a “social hack” not found in the standard interpretation: in the game, both the computer and the male human are required to pretend to be someone they are not.