Dog Forum
Status
Not open for further replies.
1 - 15 of 15 Posts

Registered · 3 Posts · Discussion Starter · #1
Hi,
I have been training with a clicker for a while; I understand the rules of the training and can see that it works.
I discussed it a while ago with a psychology professor, and he raised an interesting question I have been trying to find an answer to ever since. Maybe there's someone here who can help me.

Here it goes: the science behind the clicker is that we first classically condition the sound of a click with a treat. This means the neutral stimulus that the click was has been paired with the unconditioned stimulus, the meat, and has become a conditioned stimulus. Now the body's response to the conditioned stimulus (the sound of the click) is the same as its response to the unconditioned stimulus (the meat).

After we've conditioned the clicker, we can begin training new behavior with it. The first rule we learn is that we have to give a treat every time we click.

The question is: why should we do that if the click is already a conditioned stimulus, which by definition elicits the same response as the unconditioned stimulus? To keep the body responding the same way, there is no need to reinforce every click (once it has already been conditioned); in fact, the most persistent results come from applying a variable schedule of reinforcement (to the click).

So, applying this reasoning, clicker training could follow this scheme:

1. Get the new behavior
2. Click every time, as we are teaching a new behavior and the click has the same 'meaning' to the animal as the treat.
3. Give a treat from time to time, so that the click remains a conditioned reinforcer.


If anybody understands why this scheme would be wrong, please tell me!


I have asked this question around, and I get loads of answers like:
- the c/t rule is in every book, the writers and trainers know better
- the click is just a signal for the incoming treat
- we have a deal with the dog to treat after every click.

As much as I understand that these arguments make it easier for writers to explain how you should train with a clicker, and for most people to get some understanding of this training, they are not the scientific explanations I am looking for. Please don't waste your time writing these kinds of answers; I am looking for answers that lie in biology and in behavioral science research. Thanks!!!
 

Registered · 2,794 Posts
I know this isn't the answer you want, but my understanding is that the clicker isn't necessarily used in lieu of a treat to reinforce the behavior; rather, it is used to mark the behavior so the animal knows exactly what it is being reinforced for. There are times when you can't get a treat into an animal's mouth right away, like when you are working at a distance, so the clicker is used so that the timing isn't off when you do reinforce with a treat. Clicker training was popular with marine animals before it was with dogs, and you can't be right next to a dolphin to reinforce it the instant it jumps in the middle of a tank. So the animal is conditioned to a marker, so that it knows the fish you give it AFTER it jumps is for the moment it heard the click/whistle/marker during the jump, and not for swimming to the side of the tank when you hand it the fish.

I think markers are just as legitimate a psychological concept as Pavlovian conditioning. I'm not sure why you think otherwise? I believe they're similar, but not quite the same thing.

Interesting question, though; I see why you're asking. If you really want research, I can try to find some articles on conditioned markers vs. conditioned stimuli after work.
 

Registered · 3 Posts · Discussion Starter · #3
Hi, thank you for your input!

I think you might have missed my point a bit. My intention was not to leave the clicking out, but to keep clicking the right behavior while not giving a treat after every click.

You mention "marker" as a psychological concept. From my knowledge, there is no definition of "marker" in the foundations of behavioral psychology; it is a word that became useful in clicker training because it makes the concept understandable.
What "marker" really stands for is the concept of a "conditioned stimulus". If I am wrong, please direct me to any behavioral science book that refers to "marker".


The only article I found that directly addresses the subject is this one:
http://www.naturalencounters.com/documents/BlazingClickers.pdf
(specifically, the "misconception #1" section), but it goes into the subject of extinction, completely omitting the strength of variable reinforcement schedules. The explanation given does not feel convincing to me at all :(
 

Premium Member · 4,599 Posts
I'm sure this doesn't really answer your question with any sort of science-based knowledge, but "clicker training" is different from "training with a clicker". (Clicker Training vs Training with a Clicker) In clicker training, we reinforce every time with something (usually food, but my BC loves toys as well).

Once my dog has the behaviour fairly well down, I do some sessions without the clicker and with a variable treat schedule. Then, when proofing, I go back to the clicker so I can be more instantaneous.

I know it isn't scientific, but to me, the click is an agreement I've made with my dog: if I click, you get a treat. :p
 

Registered · 1,037 Posts
The clicker is a bridge, not a reward. It is a signal to the animal: "What you did was correct, and your reward is incoming." It bridges the time between the behavior and the reward, because it is almost always impractical, if not impossible, to give the reward at the instant the animal performs the behavior you want. It's just communication.

So it comes to predict the reward rather than become conditioned as a reward, kwim? And there's nothing magical about the clicker itself, many people use a marker word instead ("yes!" is commonly used).

So I wouldn't click and not treat. What I do (and this is common) is fade the click and transition to a marker word that means "what you did was correct, but you're not getting a reward right now." I use "good!"

For your dog not to become reliant on rewards, and to be able to build chains of behaviors, it's important to fade the treats once a behavior is learned. But if you click and don't reward your clicks, you run the risk of the clicker becoming meaningless and of losing a very important communication tool.
 

Registered · 717 Posts
@gosgos, you might want to try to get your hands on Burch and Bailey's "How Dogs Learn"; there is a whole chapter on reinforcement in that book. If you want the slightly less scientific version: http://www.successjustclicks.com/weaning-off-of-treats/ The idea is to train the behaviour by rewarding the right response every time; then, once the command has been learned, you change to an intermittent reinforcement schedule.
 

Registered · 3 Posts · Discussion Starter · #7
Ah, some nice thoughts for me to battle with :)

The clicker is a bridge, not a reward.
I think this is the core of the discussion, because my argument here is that, after conditioning, the click becomes the reward. Why do I say so? Because it makes the animal respond the same way it responds to an actual reward, which is basically the point of classical conditioning.

Forget the clicker for a second and think about the famous "Little Albert" experiment. The sight of a rabbit (or a rat, was it?) was paired with a terrible noise. Afterwards, the rabbit alone made little Albert experience the same emotions (great fear, to the point where he cried in spasms...) that he had when hearing the noise.

So, to put it another way, once you have conditioned a stimulus, it makes the body respond the same way, and experience the same emotions, as the unconditioned stimulus itself. For the click and treat, that would mean the click triggers some happy excitement and is, in that way, rewarding. Everybody likes feeling happy, right?

As long as you don't let it extinguish, and reinforce it once in a while, it should keep working this way...

And by the way, where do I find a definition of what a "bridge" is, if we try to stick to that argument?


it's important to fade the treats once a behavior is learned.
Let me try to explain one more time. I know that learning a new behavior requires constant reinforcement, and that once the behavior is learned, it is convenient to move to a variable reinforcement schedule. That's clear.

And if that is so, and we teach the animal that a click means a treat, then its response to the click is learned, and it does not need constant reinforcement for the click. This is the point.


I know what I write might look like it goes against "common sense", but I'm actually trying to set my common sense aside and get to the core of what happens when you clicker-train, out of sheer curiosity!
 

Registered · 316 Posts
This is a fairly common question, I think, although not one every training book has the time to get into. But as I understand it, here's the basic theory.

A click becomes a conditioned stimulus through many, many repetitions of conditioning trials. But it never becomes a primary reinforcer (AKA an unconditioned stimulus). Sure, a secondary reinforcer can become as powerful as (or even more powerful than) a primary reinforcer through a conditioning history...but there's no way to know precisely when that happens, or how long that conditioned response will last without continued association. It is a learned response, and animals are always learning.

Every time a conditioned stimulus occurs, there is a learning event. If the stimulus is followed by an unconditioned stimulus, that's a classical conditioning trial (pairing of neutral/conditioned stimulus and unconditioned stimulus). If the CS is not followed by an unconditioned reinforcer, that's a respondent extinction trial (un-pairing of CS & US). If you do enough respondent extinction trials, you can return a CS to a neutral stimulus (producing no response).

Repeated classical conditioning trials are how we build a conditioned association, turning that "click" into a secondary reinforcer or conditioned stimulus. Repeated respondent extinction trials are how we dissolve that association, weakening and then erasing the conditioned response. There is no set number of trials required in either case...it will always vary, depending on many factors. So if we are randomly pairing and un-pairing an association between two stimuli, there is no way to know precisely how much we are weakening the association...but we can assume that we are probably weakening it to some degree. The weaker the association between our click and the primary reinforcer that follows, the more our dogs (or other animals) will look for other stimuli to try to locate the behavior-consequence contingency. This means we are communicating less effectively with our dogs...they eventually learn to ignore the clicker, and look to more salient clues as to whether the primary reinforcer is arriving (like reaching toward your treat pocket: an act which can become a secondary reinforcer but is not quite as handy a training tool as a clicker).

Instead of weakening the association between a click and a primary reinforcer, most behavioral scientists suggest putting clicks on a variable reinforcement schedule that suits your particular training goals (depending on the schedule you choose, you can shape things like persistence, or precision, or whatever). That way, a click remains an effective and clear way to communicate with your dog, without limiting the way you use intermittent reinforcement to continue to shape behavior.
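To make the trial-by-trial logic above concrete, here is a toy sketch in Python (my own illustration, not from this thread) using the classic Rescorla–Wagner update rule: the click's associative strength moves toward 1 on click→treat pairings and toward 0 on click-without-treat extinction trials. The learning rate, trial counts, and probabilities are all arbitrary assumptions chosen just to show the shape of the effect.

```python
# Toy Rescorla–Wagner model: associative strength V of the click rises on
# reinforced trials (target 1) and decays on extinction trials (target 0).
import random

def update(v, reinforced, alpha=0.2):
    """One learning trial: nudge V toward 1 if the click was followed
    by a treat, toward 0 if it was not (alpha is the learning rate)."""
    target = 1.0 if reinforced else 0.0
    return v + alpha * (target - v)

def run(trials, p_treat, v=0.0, seed=0):
    """Run `trials` clicks, each followed by a treat with probability p_treat."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = update(v, rng.random() < p_treat)
    return v

acquired = run(50, p_treat=1.0)                   # treat after every click
extinguished = run(50, p_treat=0.0, v=acquired)   # never treat: extinction
partial = run(50, p_treat=0.5, v=acquired)        # variable schedule

print(f"after acquisition:     {acquired:.2f}")
print(f"after pure extinction: {extinguished:.2f}")
print(f"after 50/50 treats:    {partial:.2f}")
```

Under this model, consistent pairing drives the association near its maximum, unbroken extinction trials drive it back toward zero, and a variable schedule keeps it at an intermediate level rather than erasing it, which matches the intuition in the explanation above.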

Hope that helps!
 

Registered · 1,037 Posts
A click becomes a conditioned stimulus through many, many repetitions of conditioning trials. But it never becomes a primary reinforcer (AKA an unconditioned stimulus). Sure, a secondary reinforcer can become as powerful as (or even more powerful than) a primary reinforcer through a conditioning history...but there's no way to know precisely when that happens, or how long that conditioned response will last without continued association. It is a learned response, and animals are always learning.
And honestly I don't see a benefit to transitioning the clicker to a primary reinforcer.

When I am training, I want to communicate to my dog with as much clarity as possible. I want my communication to be communication and my reward to be a reward.
 

Registered · 1,037 Posts
As far as information about bridges, really any book or discussion about using markers in training at all will discuss it.

As far as I'm aware, there aren't any scientific studies specifically comparing the outcomes of keeping a clicker as a bridge only vs. conditioning it to be a secondary reinforcer. As with many training techniques, it probably varies with the trainer's skill and experience, the dog's learning style, etc.

But for me personally, as I said I don't see the value in using it as communication AND secondary reinforcer although I think you're correct that it would be possible to do so.

I'm using it as communication and IME lumping rather than separating tools tends to muddy communication.
 

Premium Member · 11,876 Posts
So my understanding is that variable reinforcement schedules are to be applied to behaviors, specifically those on cue and under stimulus control. The click is a conditioned marker/bridge, not a behavior. If you want to transition to a variable reinforcement schedule, for example for sits, you would do so by click/treating some but not all of the sits, not by clicking all sits and treating only some...

Also, as far as the click being a conditioned marker/secondary reinforcer... I agree it is. But just as we can classically condition the click with a pleasurable stimulus (food), we can also condition markers with unpleasant stimuli... For some animals, the lack of a treat (especially once the marker is conditioned) could potentially be an unpleasant stimulus (anecdotal, but I have seen dogs become very frustrated by clicks without a treat). So by not feeding, you could potentially be changing the association made with the click... I'm not really sure... just my rambling thoughts on the matter, based on what I know and have observed...
 

Registered · 316 Posts
As far as the marker/bridge versus secondary reinforcer conversation goes: there's no way to effectively use a clicker as an event marker (or "bridge") without also turning it into a conditioned reinforcer ("secondary reinforcer" just means conditioned reinforcer).

There are things that provoke an inherent, automatic emotional response in a dog. When the inherent emotional response is pleasurable, we call those things "primary reinforcers." There are things that provoke little or no automatic response, but that a dog learns are predictive of something pleasurable. We call those things "secondary reinforcers." It's not a value judgment; it's just a description of whether the dog's emotional response to a stimulus is inherent or learned.

A click typically produces very little response on its own. So we cannot describe it as primary reinforcement: it does not inherently cause a joyful reaction in a dog. However, repeated learning trials can teach a dog that a "click" predicts a treat. As a dog learns this, she begins to have a conditioned emotional response to the click sound: the click begins to cause a joyful reaction in the dog, because of what she has learned to expect will happen next. Thus, a click is a form of secondary reinforcement.

A clicker is also an event marker: a signal used to mark behavior the instant it occurs. Marking the behavior is only useful if the dog has an emotional response to the marker (if the dog is indifferent to your marker, why would she care that you are marking something?). So an event marker like a clicker is also secondary reinforcement, regardless of whether we think of it that way.

People call markers "bridges" because event markers are bridging stimuli: they mark the behavior the instant it occurs, and signal to the animal that primary reinforcement will be arriving shortly, so that the gap between behavior and reinforcement is "bridged." This is useful because it helps clarify the behavioral contingency for the dog, making it more clear precisely what behavior caused the consequence. But there's no meaningful way to say that we're only using a clicker as an event marker, or only using a clicker as a bridging stimulus, or only using the clicker as secondary reinforcement...learning is continuous, and all these things are happening at once.
 

Registered · 1,037 Posts
Well I can say I am using the clicker as an event marker even if there are other things going on.

If I intentionally stop rewarding every click, then I am now using the clicker as a secondary reinforcer instead of an event marker. I personally don't care to do that.
 

Registered · 717 Posts
If you want to go into the psychology of clicker training, try reading Karen Pryor's "Reaching the Animal Mind". It explains, or at least tries to explain, the science behind it.
 