resources for applied ethology

 


           

 

   

Learning Theory


 

 

Positive reinforcement

 

 

Reinforcement can be positive or negative. Both types of reinforcement make a response more likely in the future. Reinforcement schedules can be continuous or intermittent, and intermittent reinforcement can be delivered on either a fixed or a variable basis, after a number of responses (ratio) or after a period of time (interval).

 

Continuous reinforcement

 

A reward is delivered after each response. Animals learn fastest when they are reinforced after every correct response. This is known as a continuous reinforcement schedule.

 

Fixed Ratio (represented by the initials FR)

 

A reward is delivered after a fixed number of responses. A common code used in learning protocols is FR5, which means that a reward is delivered immediately after every fifth response. Associations can still be established and maintained when rewards are less frequent, that is, on higher schedules of reinforcement. This is hardly surprising, since wild animals may need to perform a given behaviour several times to acquire food. It is this persistence that trainers rely upon when, having established an association, they wean their animals off continuous reinforcement. On a fixed ratio schedule the animal often learns to predict the pattern of reward delivery. It therefore appears to lose interest, responding more slowly just after each reward and paying keen attention only as it nears the end of the fixed ratio (in this case, the fifth response). To avoid these peaks and troughs in responsiveness, variable ratio schedules can be used.
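The fixed ratio rule can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not a training protocol; the function name fixed_ratio_rewards and its parameters are hypothetical.

```python
def fixed_ratio_rewards(n_responses, ratio):
    """Return the 1-based response numbers that earn a reward under FR<ratio>.

    Hypothetical helper for illustration: responses are simply counted,
    and every ratio-th response is rewarded.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


# FR5 over twelve responses: the fifth and tenth responses are rewarded.
print(fixed_ratio_rewards(12, 5))  # [5, 10]
```

With a ratio of 1 every response is rewarded, so in this sketch FR1 behaves like a continuous reinforcement schedule.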

 

Variable Ratio (represented by the initials VR)

 

A reward is delivered after a variable, unpredictable number of responses. For example, VR5 means that, although rewards are sometimes delivered after ten responses, sometimes after twenty and sometimes after one, the average number of responses required for a reward is five. Once a response has been established (which is achieved most quickly on a continuous reinforcement schedule), many trainers adopt a variable ratio schedule, knowing that it is often very difficult to reward responses every time they occur, especially if they form part of public displays or competitions, or have to occur at some distance from the trainer. Behaviours trained on a variable ratio schedule are the most persistent and are slower to extinguish than those resulting from fixed and continuous schedules. This is because, during training on a variable ratio, many responses have had no consequences and persistence is more likely to have been rewarded. Dogs that sometimes get titbits for begging at tables take longer to give up, once their owners learn never to reward the behaviour, than dogs that have had continuous reinforcement.
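One simple way to approximate a variable ratio schedule in code is to reward each response with probability 1/mean, so that the number of responses between rewards varies (one, ten, twenty and so on) but averages out to the chosen value. Strictly, this is a random ratio approximation chosen here for brevity; the function name, parameters and use of a seed are assumptions made for the sketch.

```python
import random


def variable_ratio_rewards(n_responses, mean_ratio, seed=None):
    """Return the 1-based response numbers rewarded under an approximate VR<mean_ratio>.

    Assumption: each response is rewarded independently with probability
    1 / mean_ratio, so the gap between rewards averages mean_ratio responses.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)  # seeded generator so a session can be reproduced
    p = 1.0 / mean_ratio
    return [i for i in range(1, n_responses + 1) if rng.random() < p]


# VR5 over 50 responses: roughly ten rewards at unpredictable positions.
print(variable_ratio_rewards(50, 5, seed=1))
```

Because the animal cannot predict which response will be rewarded, there is no point in the sequence at which slowing down pays off, in contrast with the fixed ratio case.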

 

Fixed Interval (FI)

 

A reward is delivered for the first response that occurs after a fixed interval of time has passed since the last reward. For example, FI5 means that the reward is delivered for the first response made after five seconds have passed since the last reward.
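The fixed interval rule can be sketched as follows. The function name and the list of response times are hypothetical, and the sketch assumes that the clock starts at the beginning of the session, as if a reward had just been delivered at time zero.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the response times (in seconds) rewarded under FI<interval>.

    Assumption: timing starts at 0 s, as if a reward had just been given.
    """
    rewarded = []
    last_reward = 0.0
    for t in sorted(response_times):
        # Only the first response after the interval has elapsed is rewarded.
        if t - last_reward >= interval:
            rewarded.append(t)
            last_reward = t
    return rewarded


# FI5: responses made less than five seconds after the last reward earn nothing.
print(fixed_interval_rewards([1, 3, 6, 7, 10, 12, 18], 5))  # [6, 12, 18]
```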

 

Variable Interval (VI)

 

A reward is delivered for the first response made after an interval of time has passed since the last reward. The interval varies at random but averages out to a particular value. For example, VI5 means that these intervals average five seconds and would range between zero and ten seconds.
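A variable interval schedule can be sketched in the same way, drawing a new waiting time after each reward. The sketch assumes the waiting time is uniformly distributed between zero and twice the mean (zero to ten seconds for VI5, as in the example above) and that the clock starts at the beginning of the session; the function name and parameters are hypothetical.

```python
import random


def variable_interval_rewards(response_times, mean_interval, seed=None):
    """Return the response times (in seconds) rewarded under an approximate VI<mean_interval>.

    Assumptions: waiting times are uniform on [0, 2 * mean_interval] and
    timing starts at 0 s, as if a reward had just been given.
    """
    rng = random.Random(seed)  # seeded generator so a session can be reproduced
    rewarded = []
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t - last_reward >= wait:
            rewarded.append(t)
            last_reward = t
            # Draw a fresh, unpredictable waiting time for the next reward.
            wait = rng.uniform(0, 2 * mean_interval)
    return rewarded


# VI5 with one response per second for a minute.
print(variable_interval_rewards(range(1, 61), 5, seed=1))
```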

 

There is one other schedule worthy of mention: the differential reinforcement of other behaviours (DRO). In this schedule a trainer chooses one behaviour that will not be reinforced and instead reinforces a variety of other behaviours. Predictably, the non-reinforced behaviour drops out, so the approach is often used to change problem behaviours. Although this schedule withholds reinforcement of the problem behaviour, it still allows reinforcement to be delivered. Withdrawing reinforcement completely may not always be advisable, as there is a danger of removing all incentive to respond in any way. Just as it is important to avoid confusion and promote creativity when training a new behaviour, it is imperative that an animal being trained to stop performing a problem behaviour is simultaneously given the opportunity to perform a more acceptable behaviour with a similar motivation. A dog that chases joggers can most easily be trained to stop and look at the handler if it associates the sight of a jogger with an owner-centred ball game into which it can channel its motivation to chase.
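As a rough illustration of the rule just described, the sketch below marks every observed behaviour except the chosen problem behaviour as eligible for reinforcement. The function name and the behaviour labels are hypothetical, and real training decisions involve far more judgement than a single comparison.

```python
def dro_decisions(observed_behaviours, problem_behaviour):
    """Pair each observed behaviour with True (reinforce) or False (withhold).

    Hypothetical helper: only the chosen problem behaviour goes unreinforced.
    """
    return [(b, b != problem_behaviour) for b in observed_behaviours]


session = ["chase jogger", "look at handler", "sit", "chase jogger"]
for behaviour, reinforce in dro_decisions(session, "chase jogger"):
    print(behaviour, "->", "reinforce" if reinforce else "withhold")
```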

 

Partial reinforcement effect

 

The term partial reinforcement effect refers both to the increase in performance under partial reinforcement schedules and to the increased resistance to extinction of the responses they produce, compared with continuous reinforcement. Responses can be made highly resistant to extinction by training up to very high partial reinforcement schedules, in which rewards are infrequent. Conversely, trainers who want to extinguish a response do well to start by determining the schedule of reinforcement used to establish it.

 

Rat eating a chocolate reward (NB: chocolate is not the best choice of reward for rats because it can cause obesity)

© OLIVER Image Library: contributor P. McGreevy

 

 

 

 

 

 

 

           
 

 

© animalbehaviour.net
