As stated in the previous section, Thorndike’s research on animal learning in the years before and after the turn of the twentieth century had an enormous influence on the direction taken by experimental psychology. Thorndike’s focus on publicly observable phenomena — external situations and the responses learned in those situations — influenced experimental psychology’s eventual transformation from the science of the conscious mind to the science of behavior.
In Section 4-6, you were introduced to some of the ideas of the “radical behaviorist,” B. F. Skinner. Skinner (1953, 1974) continued Thorndike’s work on instrumental learning but renamed it operant conditioning to emphasize that individuals learn new responses that “operate on” the environment — their learned responses manipulate the environment in ways that allow them to experience rewards and punishments (Schick, 1971; Skinner, 1937). In Thorndike’s puzzle-box experiments, for example, the cats’ learned instrumental responses “operated on the environment” in ways that allowed them to escape from the small enclosure and to experience the sight, smell, and taste of food.
Skinner enjoyed building mechanical devices to use in his research (Bjork, 1993), and he developed what now are informally referred to as Skinner Boxes (see Figure 1). Skinner Boxes are fully automatic conditioning devices: an animal (usually a rat or pigeon) is placed inside the box and learns a response — a rat typically presses a lever, a pigeon typically pecks a key — in order to receive stimuli such as food or water. The lever-press or key-peck leads to the consequence, however, only when preceded by a light, tone, or other sensory stimulus. This antecedent stimulus — the stimulus that comes before the response — indicates that the behavioral response is likely to be followed by a consequent stimulus — the stimulus that comes after the response. Presentations of the antecedent stimulus, the recording of responses, and presentations of the consequent stimulus are all mechanized and, therefore, Skinner and his associates did not need to be present. The general operant-conditioning procedure is illustrated in Figure 2.
In operant conditioning, the learned response is called the operant response. The pulling of a wire loop in Thorndike’s puzzle box (Thorndike, 1898, 1911) and the pressing of a lever in a Skinner Box are examples of operant responses: they are responses to the antecedent stimulus and either increase or decrease in frequency over time depending on the nature of the consequent stimulus. A consequent stimulus that strengthens the operant response it follows is called a reinforcement. The food that Thorndike’s cats ate after pulling the wire loop and the water that Skinner’s rats drank after pressing the lever are examples of reinforcements. A consequent stimulus that weakens the operant response it follows is called a punishment. Rats that previously learned a lever-press response, for example, might now receive an electric shock after pressing the lever, which would cause them to reduce their lever-pressing over time. The electric shock, in this example, would be a punishment.
The antecedent stimulus is called the discriminative stimulus, and is defined as a cue that signals the probable consequence of an operant response — that is, it signals whether the operant response will be reinforced or punished. In a Skinner Box, the discriminative stimulus might be a light that, when turned on, indicates that a lever press is likely to be followed by a reinforcement or punishment. Figure 3 uses these terms to illustrate the general operant-conditioning procedure (compare to Figure 2 above).
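The procedure described above can be caricatured in a few lines of code. The sketch below is purely illustrative — the response probability, the step size, and the always-on "light" are invented for the example, not drawn from any actual model — but it captures the core idea: a response emitted in the presence of the discriminative stimulus becomes more probable when followed by reinforcement and less probable when followed by punishment.

```python
import random

random.seed(1)  # fixed seed so the sketch behaves the same each run

def run_trials(n_trials, consequence, p_response=0.5, step=0.05):
    """Return the probability of the operant response after n_trials
    in which every emitted response is followed by the same
    consequence ('reinforcement' or 'punishment').

    All numbers here are hypothetical, chosen only to illustrate
    the direction of change.
    """
    for _ in range(n_trials):
        light_on = True  # discriminative stimulus present on every trial
        if light_on and random.random() < p_response:
            # The animal responded (e.g., pressed the lever); the
            # consequent stimulus then strengthens or weakens it.
            if consequence == "reinforcement":
                p_response = min(1.0, p_response + step)
            elif consequence == "punishment":
                p_response = max(0.0, p_response - step)
    return p_response

p_after_food = run_trials(50, "reinforcement")
p_after_shock = run_trials(50, "punishment")
print(f"after reinforcement: {p_after_food:.2f}")  # climbs toward 1.0
print(f"after punishment:    {p_after_shock:.2f}")  # falls toward 0.0
```

Note that the "learning" here is nothing more than a probability nudged up or down by its consequences — which is exactly how the strength of an operant response is measured: by its frequency over trials.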
Let’s look at some more examples of operant conditioning to help you learn how to apply these terms to actual learning situations. (Note: Skinner would not have identified some of my examples as reinforcements and punishments because they refer to mental events; e.g., “pleasurable feelings.”)
For many people, drinking alcohol often is followed by pleasurable feelings or by relief from anxiety. This is an example of operant conditioning: a voluntary behavior (an operant response) has consequences that lead either to an increase or decrease in the behavior. In this example, what are the discriminative stimulus, the operant response, and the consequence (reinforcement or punishment)? The answers are provided in Figure 4.
In bungee jumping, a person jumps off a tower (or some other high place) while connected to elastic cords. Again, this is an example of operant conditioning: a voluntary behavior has consequences that lead either to an increase or decrease in the behavior. In this example, what are the discriminative stimulus, the operant response, and the consequence (reinforcement or punishment)? Think about this example for a minute before looking at the answers in Figure 5.
Figure 5 shows that there are at least two answers. In each answer, the discriminative stimulus is the sight of the tower, and it includes any other stimuli that immediately precede (and trigger) the jump. The operant response is jumping off the tower. The consequence, however, depends on the person. The consequence for some people will be reinforcing, whereas for others, it will be punishing. The best way to tell whether it is reinforcing or punishing is to look at what happens to the operant response over time. If the person shows increased bungee-jumping in the future (regardless of what he or she tells us after the experience), we know that the operant response has been reinforced. If the person shows decreased bungee-jumping in the future (again, regardless of what he or she says), we know that the operant response has been punished. Individual differences in what is learned in a particular situation depend on whether the consequent stimuli are reinforcing or punishing (or neither) for individuals.
A variety of factors determine whether a person finds a stimulus to be reinforcing or punishing: physiological factors, past experiences, one’s current mood, etc. For example, eating food is reinforcing unless one has just finished a very large meal: in this case, eating more food probably will be punishing. A person who usually becomes ill when she drinks alcohol will find that doing so is punishing rather than reinforcing. To repeat, the only way to determine whether an individual is reinforced or punished by a consequence is to see whether the operant response increases or decreases in frequency over trials.
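That diagnostic rule — classify a consequence by what the response frequency does over trials, not by what the person says about the experience — can be written out as a tiny decision function. The function name and the jump counts below are invented for illustration.

```python
def classify_consequence(freq_before, freq_after):
    """Infer the function of a consequent stimulus from the change
    in operant-response frequency over time (the only evidence that
    counts, regardless of what the person reports)."""
    if freq_after > freq_before:
        return "reinforcement"  # the response was strengthened
    if freq_after < freq_before:
        return "punishment"     # the response was weakened
    return "neither"            # no operant conditioning evident

# Hypothetical bungee jumpers: jumps per year before vs. after a first jump.
print(classify_consequence(1, 4))  # thrill-seeker: "reinforcement"
print(classify_consequence(1, 0))  # terrified jumper: "punishment"
```

The same consequent stimulus (the fall, the bounce, the adrenaline) yields opposite classifications for the two people — the individual difference lives entirely in the behavioral data.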
Let’s look at one last example of individual differences in the reinforcing/punishing effects of consequent stimuli. We know that most people learn to stop performing behaviors that cause pain because the pain is punishing. For instance, if you see a pan sitting on top of a red-hot burner on an electric stove, you are unlikely to touch the inner surface of the pan, perhaps because, as a child, you were burned when you did so. The operant response of touching the pan was punished by the resulting pain and, hence, decreased in frequency, most likely very quickly. Figure 6 illustrates the operant conditioning that led to your not touching the inner surfaces of hot pans.
There is a rare medical condition in which people are born unable to feel pain (Brownlee, 2006). A person with this problem would be unable to learn not to touch hot pans because she would be unable to feel pain after doing so. Because there is neither punishment nor reinforcement for touching hot pans, no operant conditioning can occur. In fact, people with this disorder often suffer serious injuries, and even death, because they can’t be operantly conditioned not to perform behaviors that result in pain. For example, a group of researchers who performed a genetic study of this disorder (Cox et al., 2006) first learned of three northern Pakistani families with a high incidence of the disorder after hearing about a member of one of the families, a 10-year-old boy, who had come to an untimely end:
[He was] well known to the medical service after regularly performing ‘street theatre’. He placed knives through his arms and walked on burning coals, but experienced no pain. He died before being seen on his fourteenth birthday, after jumping off a house roof. (Cox et al., 2006, p. 894)
The researchers’ description of injuries suffered by those family members participating in the study also makes very clear the costs to people born with the disorder:
All six affected individuals had never felt any pain, at any time, in any part of their body. Even as babies they had shown no evidence of pain appreciation. None knew what pain felt like, although the older individuals realized what actions should elicit pain (including acting as if in pain after football tackles). All had injuries to their lips (some requiring later plastic surgery) and/or tongue (with loss of the distal third in two cases), caused by biting themselves in the first 4 yr of life. All had frequent bruises and cuts, and most had suffered fractures or osteomyelitis [inflammation of bones or bone marrow, usually caused by infections], which were only diagnosed in retrospect because of painless limping or lack of use of a limb. (Cox et al., 2006, p. 894)
Study Questions for Section 4-16
- How are instrumental learning and operant conditioning related?
- Why did B. F. Skinner call the type of learning he studied “operant conditioning”?
- What is a “Skinner Box” and what is it used for?
- How would you define discriminative stimulus in your own words?
- How would you define operant response in your own words?
- How would you define reinforcement in your own words?
- How would you define punishment in your own words?
- What is being associated in operant conditioning? (NOTE: In your answer, please make use of the terms discriminative stimulus, operant response, and/or reinforcement/punishment.)
- How do you know when an association has formed in operant conditioning? (NOTE: In your answer, please make use of the terms discriminative stimulus, operant response, and/or reinforcement/punishment.)
- Using your answers to the previous two questions, as well as the discussion in Section 5-13, how would you describe the differences between operant conditioning and classical conditioning? (NOTE: This was not discussed above.)
- What are two examples of operant conditioning that you’ve experienced recently? (Note: In your examples, please label the discriminative stimulus, the operant response, and the reinforcement/punishment.)
- How do individual differences arise in the learning (or not) of operant responses?
- In what ways might the ability to be operantly conditioned be adaptive?
Bjork, D. W. (1993). B. F. Skinner: A life. New York: BasicBooks.
Brownlee, C. (2006). Feel no pain, for real: Mutation appears to underlie rare sensation disorder in a Pakistani family. Science News, 170 (25), 389.
Cox, J. J., Reimann, F., Nicholas, A. K., et al. (2006). An SCN9A channelopathy causes congenital inability to experience pain. Nature, 444, 894-898. doi:10.1038/nature05413
Schick, K. (1971). Operants. Journal of the Experimental Analysis of Behavior, 15, 413-423. doi:10.1901/jeab.1971.15-413
Skinner, B. F. (1937). Two types of conditioned reflex: A reply to Konorski and Miller. Journal of General Psychology, 16, 272-279. Retrieved October 17, 2011, from http://www.psych.yorku.ca/classics/Skinner/ReplytoK/reply.htm
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan. (A pdf of this book is available from the B. F. Skinner Foundation at http://www.bfskinner.org/BFSkinner/PDFBooksSHB.html)
Skinner, B. F. (1974). About behaviorism. New York: Knopf.