I have received much criticism of my atypical approach to using the clicker from beginners to experts alike. Many have noticed that I don't always give treats after using the clicker and that I make clicks while training two parrots simultaneously. I'd like to take a little time to explain how and why I am doing this and the impact it has on parrot training.
First of all, let's go over the typical approach to using a clicker as a bridge. At the moment the parrot does the right thing, the trainer issues a click using a clicker. Then, at the trainer's earliest convenience, a treat is given to the parrot. In other words, the click is a promise to give a treat as a reward for the behavior being performed at the moment of the click. This is a highly effective technique for capturing and shaping behaviors in training. Using the clicker can consistently and precisely mark the desired behavior so that the parrot can catch on and repeat it more readily.
I have used and do recommend the standard method of clicker training described above. For the vast majority of parrot owners, trainers, and performers, this may be the optimal approach. However, I have taken the clicker a step further and would like to present my method for those parrot owners and trainers that want to achieve even greater success with clicker training. The fundamental prerequisite is 6-18 months of consistent and successful clicker training using the standard method. The parrot should have already learned a bunch of different tricks and be reliable at demonstrating them. Attempting my special approach with an inadequately trained parrot will surely ruin the clicker and confuse the bird so I do not recommend this approach for most people. Only put this into effect if you have had extensive success training your parrot and want to take it one step further.
My clicker approach is made up of two parts. The first is transforming the clicker from a bridge into a secondary reinforcer, and the second is using it in this way with multiple parrots simultaneously. Both of these parts require extensive successful clicker training of one bird at a time. Thereafter, either one or both of these can be applied, although I would put off training two parrots simultaneously until last. If you don't plan to move away from one click means one treat, you can skip to clicker training two parrots together.
The main reason I moved away from one click means one treat was that I wanted to train Kili to perform many different tricks but couldn't give her treats for everything or she would get too full. Thus I employed a variable ratio reinforcement schedule when it comes to treats. What this means is that the parrot has to complete the right behavior every time it is asked but only receives a treat some of the time, on a randomly chosen trial. However, one problem with doing this is that if the parrot botches one trick in the process, giving or not giving treats does not provide reliable performance feedback. With classic clicker training, not receiving a treat and likewise not receiving a click mark failure with regard to the bird's behavior. Since treats are necessary for continued motivation but providing them randomly provides poor feedback, I decided to use the clicker every time the right behavior is offered but provide food on a variable ratio. Thus the clicker is used as a continuous secondary reinforcer while the treats are provided on a variable ratio reinforcement schedule. This works out as a perfect blend of feedback and motivation with minimal satiation and maximum success/improvement.
In this way I can have my parrot run through 10 tricks in a row, click for the 9 correct times, not click for the 1 wrong time, and provide just a single treat at a random point (but only following a correct attempt). The parrot is still told that the 9 attempts were correct and could have earned a treat, 1 attempt was wrong and should not be done that way, and motivation was maintained the entire time. Furthermore, 10 treats could be used to elicit as many as 100 iterations (and thus 100 practices of performing the right tricks the right way at the right time) instead of just 10. This is how my special clicker approach is successful and goes well beyond the classic one click one treat approach. By having 110 trick attempts, 100 correct/successful ones, and 10 incorrect unclicked ones, the parrot has 10 opportunities to learn what not to do and 100 chances to learn what to do for the same number of treats that would have only provided 10 opportunities for learning. This allows my parrots to practice more behaviors, exercise more flight, and be overall more reliable than with the standard clicker approach.
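The arithmetic above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names are made up for this example, and a treat is modeled as following each click with probability 1/10, which is a simple stand-in for a variable ratio schedule rather than a description of any exact procedure.

```python
import random

def run_session(outcomes, mean_ratio, rng):
    """Simulate one session.

    outcomes: list of bools, True meaning the trick was performed correctly.
    Every correct attempt gets a click; an incorrect attempt gets nothing.
    After a click, a treat follows with probability 1/mean_ratio
    (an assumed random-ratio model of a variable ratio schedule).
    """
    log = []
    for correct in outcomes:
        click = correct
        treat = click and rng.random() < 1.0 / mean_ratio
        log.append((correct, click, treat))
    return log

def summarize(log):
    """Count attempts, clicks, and treats in a session log."""
    return {
        "attempts": len(log),
        "clicks": sum(1 for _, click, _ in log if click),
        "treats": sum(1 for _, _, treat in log if treat),
    }

rng = random.Random(0)                    # fixed seed so the run is repeatable
outcomes = [True] * 100 + [False] * 10    # 100 correct and 10 incorrect attempts
rng.shuffle(outcomes)
log = run_session(outcomes, mean_ratio=10, rng=rng)
summary = summarize(log)
print(summary)
```

Whatever the random draw, the feedback stays complete: all 100 correct attempts are clicked, none of the 10 incorrect ones are, and treats only ever follow a click. On average the treat count comes out near 10.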
Since the clicker has been so closely associated with food from the beginning, doing things to hear clicks can become desirable and thus a conditioned reinforcer of its own. Since good things tend to happen around clicks but don't have to, the parrots are still more inclined to demonstrate clicker-worthy behavior. This is also a great way to retain motivation through very high ratio variable reinforcement. For example, if I am going to make Kili fly 20 recalls to earn a single treat, as long as she keeps getting clicks, she knows it is worthwhile to keep trying and not give up. She knows from past training that as long as she keeps getting clicks, there will be a treat offered at some point. Since there is no other way to get that treat except to keep trying, that's the course she has to take to earn it.
Keep in mind that I only use this approach while I am sustaining tricks through practice. I do revert to the more effective continuous reinforcement strategy of one click one treat when teaching a fresh new trick. Once the parrot is well accustomed, I add that trick to my list of tricks to practice using variable reinforcement.
There are times when I chain behaviors either out of convenience or because it is a trick that requires multiple components. This is another great time to employ my click for correct behavior rather than treat for every correct behavior approach. Many times when I am training tricks to my parrots, I continue having them fly recalls to me from across the room for exercise. I used to feel bad when I would divert treats away from flight recall (which is valuable exercise) and use them for trick training instead. Lately, I've come up with a much better approach where I make my parrots first fly a long recall (or several) to me just to get the opportunity to practice a new trick to earn a treat.
After years of training, both of my parrots understand very well that new tricks earn treats every time while old behaviors only some of the time (although they are easier so they love to perform them). For this reason, they are very eager to give me some flight recalls for the chance to get a guaranteed treat for learning a new trick. Plus it's simply more fun that way.
Now when it comes to chaining tricks to form a long sequence, the clicker can apply in the same way. Let's take Kili's famous stroller trick (which was performed on the Late Show with David Letterman) as an example. Clearly the complete sequence is composed of several independent tricks that she must perform in order. First she must pick up her baby, then she must patiently hold it for demonstration, then she must take it over to her stroller (and not the bed) and place it in, then she must walk around the stroller and start pushing it, then she must stop pushing and walk around, then transfer her baby from the stroller to the crib, rock the crib, and then finally wave goodnight to baby. How do you teach such a long chain to a parrot without stopping every couple of seconds to wait for it to eat a treat? This is where the click for every correct behavior but only a treat at a random time approach proves such a success! Obviously I taught Kili the separate tricks that combine into the sequence separately, but when I was finally teaching the complete sequence, I used this exact clicker approach. A problem that I was running into was her eagerness to skip steps to jump to the end and get the one final treat for finishing the sequence. For this reason I went back to the click every correct behavior and offer a random treat to ensure that all steps in the sequence are equally rewarding. After she got really good at the trick, I returned to clicking along the way (to remind her that she is doing things right by not skipping to the end) and only giving one treat at the end. Since she won't get a treat at the end if she misses a click along the way, she learned to patiently go through the entire routine.
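The final stage of the stroller routine (a click for each correct step, one treat at the end only if no click was missed) can be written as a short sketch. The step names paraphrase the description above, and `run_chain` is a hypothetical helper invented for this illustration.

```python
# Steps of the stroller routine, paraphrased from the description above.
STROLLER_CHAIN = [
    "pick up the baby",
    "hold the baby for demonstration",
    "place the baby in the stroller",
    "push the stroller",
    "stop pushing and walk around",
    "move the baby to the crib",
    "rock the crib",
    "wave goodnight",
]

def run_chain(chain, performed):
    """Return (clicks, treat) for one run of the chain.

    chain: ordered list of step names.
    performed: set of step names the bird actually carried out.
    Each performed step earns a click; the single treat at the end
    is given only if every step earned its click.
    """
    clicks = [step in performed for step in chain]
    treat = all(clicks)
    return clicks, treat

# A full run earns every click and the final treat.
print(run_chain(STROLLER_CHAIN, set(STROLLER_CHAIN))[1])

# Skipping a step loses that click and therefore the treat.
skipped = set(STROLLER_CHAIN) - {"rock the crib"}
print(run_chain(STROLLER_CHAIN, skipped)[1])
```

The sketch only captures the reward rule, not the order in which the bird performs the steps.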
The final non-standard complex use of the clicker I employ is teaching two parrots simultaneously while using just one clicker. A sneaky (but too annoying) approach could be to have two different sound makers where one is for each parrot and they know their sound. I differentiate who is earning clicks through attention and eye contact. Even though I say I train the parrots together, it's not actually in the exact same moment. Normally I'll have one bird stay on its training perch while I have the other fly over to me to learn something. The parrot near me knows it is earning the clicks and not the one far away. If I have the two birds on perches next to each other, they know when I am clicking for them because I am looking at them at the time of the click. Sometimes I have them perform the same tricks at the same time. In this case I am looking in a blank way toward both of them. They are exceptionally intelligent and catch onto all of these subtleties. The important thing is that I am consistent in these methods so the specifics they learned apply each time.
Although it might seem that mixing the clicker in the ways I do would be confusing or dilute its effectiveness, this couldn't be further from the truth. Parrots are highly intelligent and catch on to things very quickly. They learn the multiple meanings of the clicker based on the context they observe. It's like how we can hear the sound “toooo” and still understand whether we are talking about “to”, “two”, or “too”. Since my mixed clicker strategy has not diminished clicker effectiveness (and in fact has improved it), I am certain that parrots too can learn to understand things in context.
So that is my special mixed method of parrot clicker training. Although I would not recommend anything but the one-click-one-treat approach to most people, I think this article should help clarify what I do and why. Also, for the select few who have taught many tricks and wish to take their training to a new level, I share my approach. Whatever clicker approach you use, as long as it is effective, the parrot is learning, and you are both having fun in the process, it is already a major success.
Thank you for writing about how you use the clicker as a secondary reinforcer. Before I found your videos and your blog I had read two books on clicker training birds and they both said you should fade the clicker after the trick has been learned. At some point early last year I asked you why you were still using it on established tricks and you said it was because you had made it a secondary reinforcer. That inspired me to want to use it that way with my birds.
After months of using a one click to one treat ratio and after they each knew several tricks I started chaining behaviors together and rewarding on a variable ratio. I would click for correct behavior but would only deliver a treat every 3 - 6 clicks.
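A treat "every 3 - 6 clicks" can be modeled by drawing the number of clicks required before each treat at random from that range. The sketch below assumes a uniform draw, which is just one way to vary the ratio, and `treat_schedule` is a name invented for this example.

```python
import random

def treat_schedule(n_clicks, low=3, high=6, seed=None):
    """Return the click numbers (1-based) on which a treat is delivered.

    Before each treat, the number of clicks required is drawn uniformly
    from low..high (an assumption for this sketch), so the bird cannot
    predict which click will pay off.
    """
    rng = random.Random(seed)
    treats = []
    required = rng.randint(low, high)   # clicks needed for the next treat
    since_last = 0
    for click in range(1, n_clicks + 1):
        since_last += 1
        if since_last == required:
            treats.append(click)
            since_last = 0
            required = rng.randint(low, high)
    return treats

treats = treat_schedule(60, seed=1)
print(treats)
print(len(treats), "treats for 60 clicks")
```

With a range of 3 to 6 the average works out to one treat per 4.5 clicks, so 60 clicked behaviors cost somewhere between 10 and 20 treats.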
I don't think I've "ruined" the clicker with my birds at all using this method. The reason is that when I use it to teach new tricks (rewarding each time while they are learning) my birds still catch on very quickly, so the clicker obviously still has meaning. Like you, I believe the clicker, when used your way, helps to motivate the bird when you are using a variable ratio of reinforcement and lets the bird know that at some point it will get a treat because it is doing the right things.
I have observed a behavior in my Poi that helps support the idea that the effectiveness of the clicker has not been ruined but rather has actually been improved. When I cue a trick, if he gets a click, he will wait to see if I'm either going to cue another trick or deliver a treat. If he does not get a click (because, for example, he did a sub-par wing stretch when I cued "wings") he will often immediately repeat the cued behavior before I even have a chance to cue it again. If I don't cue a trick he doesn't get a reward; however, his eagerness to offer the trick again so quickly if he doesn't hear a click as opposed to just waiting really makes me believe the clicker is working very well as a secondary reinforcer.
I also train two birds at the same time with no problem. It has not created confusion with them or ruined the clicker in any way.
I'm thrilled that others have been able to make use of my special technique. I didn't exactly sit there and come up with it. It just sort of happened along the way as I was training Kili. No one else does it this way and it gets criticized a lot, but it sure as heck works. I'm glad it works equally effectively for others as well. My biggest use for it is for extensive VR exercise recall flights.
Some people complain that using a clicker is too much work or fuss and they want to phase it out. But for me, it's just the easiest thing. If the bird will perform 9 extra tricks or fly a few extra times just for a click (instead of filling up on treats or me having to do anything drastic), that's some really effective and easy parrot training right there! I do try to give extra praise and attention when no treats are offered... but I'm sure on the 50th flight recall or trick practice that attention isn't worth much either. Which is why I rely so much on the clicker.
I didn't mention it in the article, but it's good to practice stuff with your parrot without a clicker on occasion as well. Typically this happens inadvertently when you forget to use it, or you have company over, or you're outside. But if it doesn't, doing some stuff a few times a month without a clicker is a great additional bit of training. This way when you do want to show off for friends or don't have one handy, your bird will still respond. However, for the monotonous day to day training where you just need the bird to practice their stuff and exercise many times, my clicker strategy works best!
Thank you for another well-considered article, Michael!
To me, the approach of secondary reinforcer and variable rewards seems quite intuitive. I've used this approach with my budgie at times and it works a treat. It's certainly not changed the effectiveness of the clicker at all. I use it for repetitive iterations of a trick - flight recalls and targeting (variable), putting multiple items in a container etc.
Karen Pryor's FAQ on clicker training says:
Do clickers and treats need to be used for every behavior, forever?
No. Once a behavior is learned and on cue, there’s usually no need to click, as the animal understands the behavior. Clicker trainers can maintain the behavior by replacing specially good treats with occasional and less intensive rewards including a pat or praise. Learned cues and behaviors are also maintained by real-life rewards: for example sitting quietly at the door is rewarded by opening the door so that the dog can have a walk. Clicker trainers then save clicks and treats for the next new thing they want to train.
Even when I don't explicitly reward, the click and the attention that goes with it must let Wiki feel rewarded enough to continue, and that's the purpose of the reward. I think our training buddies enjoy the interaction and attention and challenge, and that's partly their "real-life reward". We practice without a clicker on occasion, too - it's how I know whether he really enjoys doing something or not!
I think we shouldn't underestimate the capacity for understanding in our birds, and their ability to give a more subtle meaning to the clicker than just a 1:1 relationship of "click = food". There are many "old school" trainers who work with multiple performing birds and do not use clickers or even rewards when performing (yes, I even get given a hard time by them for "needing" a clicker in the first place), so I don't think we should be too precious about what is in essence a communication tool.
that's interesting, because as a beginner in training after reading many many books, as well as Steve Martin's and Susan Friedman's theories on their respective websites, I did get convinced that one-click / one-treat ratio is the best way to train.... according to their teachings...
I am a bit confused right now
There's nothing to be confused about. If you're a beginner or haven't had success with extensive clicker training, stick to the one click one treat method. This article is about taking things a step further for people who've made a lot of progress and want to take things even further.
Um, Michael, I believe you have that wrong. Steve Martin and Dr. Friedman don't advocate this for a reason. It's not that the clicker doesn't become a reinforcer, and they certainly encourage using VR when a behavior is well established. This is because we know that if Continuous Reinforcement (CR) is always used behavior breaks down!
So my question to you is "Are you really using VR reinforcement when clicking every time and only providing treats some of the time?"
I want you to really think about what you are doing when the behavior occurs and tell me why you think this is VR.
[quote="KaratParrot":xrqanz23] This is because we know that if Continuous Reinforcement (CR) is always used behavior breaks down![/quote:xrqanz23]
Are you sure? Because I think the most experienced animal trainers in the world, the Baileys, would [url=http://www.clickersolutions.com/articles/2001/ratios.htm:xrqanz23]disagree with you.[/url:xrqanz23]
Hey rebcart! I'm glad you brought that up!
The Baileys are great trainers, that's for sure, but they may be forgetting a little something. In that link they discuss seeing the behavior of their animals break down when they switch from CRF (my mistake labeling it CR! Not to be confused with Conditioned Response!) to VR, which is one reason given for not using VR. But this is well explained by Russell A. Powell, Diane G. Symbaluk, Suzanne E. MacDonald, and P. Lynne Honey in Introduction to Learning and Behavior, p. 261. If the trainer expects the animal to make too many responses, without moving at the animal's pace, the result is known as ratio strain. And this could well be what has happened.
Behaviors break down on CRF if the Establishing Operations (EO) are not changed to make the reinforcement (in this case food) more valuable. For the classic slot machine example, what would happen if a small amount of money always came out? Not as many people would play; it's not fun! And when they do play, it stops when satiation hits. But what is interesting to note is what would likely happen due to different EOs. People with less money play longer and people with more money play shorter. The same goes for birds. Hungrier birds will still train more using CRF than satiated birds. But now the bird does tricks because he has to, not necessarily because it's fun.
Yes I am sure behavior breaks down on a CRF. People forget to consider the EOs when saying CRF works to maintain behavior.
I took a look at the article and see where it is coming from about VR not being necessary for the basic dog owner teaching tricks to their dog. There is a substantial difference when it comes to parrots that makes me disagree that VR isn't necessary for most pet owners. Parrots are naturally wild animals so except for the positive reinforcement taming and training that we give them, they are difficult to handle. In order to get our parrots to step up reliably, come out of the cage, flight recall to us, not bite us, etc we cannot keep giving treats every single time or the parrot will be stuffed with treats in the first 5 minutes it spends with us. The more good behavior that we can move to a diluted variable ratio reinforcement schedule, the more domesticated the parrot's behavior appears to be.
As for the clicker, yes I use the clicker on a continuous reinforcement schedule. I also give praise and attention on a pretty continuous reinforcement schedule. These are mild reinforcers and mostly secondary. The treats - which are the predominant thing the parrot is working for - are on a variable ratio reinforcement schedule. Thus I am successfully playing the best of both worlds where I can differentiate good from bad demonstration of behavior in a long chain or sequence yet get the parrot to complete all of that behavior for infrequent food rewards.
For example I can execute the following sequence with the following outcomes:
turn around/incomplete/no click
wings/good/click & treat
In this manner the parrot learns success from failure throughout the sequence of behaviors (note they are not a chain because they are performed in random order and not repeated this way again) and continues trying for the sake of earning the more desirable food rewards on that variable ratio schedule. I have found this approach to be the most effective way of getting the benefits of continuous and variable ratio reinforcement schedules.
A lot of "domestication" training for our parrots ends up inadvertently being reinforced on a variable ratio. For example stepping up is occasionally rewarded with a chance to do tricks and earn treats. Other times it does nothing and other times it gets a meal and other times it gets a pesky grooming. Since it is generally rewarding but on a dilute VR, it ends up being very successful in the long run.
My approach is one I happened upon myself and not something I ever saw recommended anywhere. I am sharing it based on my own tremendous success with it. I have found the ability to train more behavior, more effectively, and maintain longer motivation using this technique than CRF or VR alone. As a side effect it has also taught my parrot to try harder when training new behavior. Sometimes the behavior I am soliciting in training something new just isn't click-worthy for a while. Since my parrot is used to clicks confirming all correct behaviors, it makes her realize she is doing things wrong and makes her keep trying to get it right. Better yet, since she is so accustomed to training in this manner, she even varies the behaviors she offers until it's right enough to earn a click. It may be a more difficult concept to achieve off the bat but it's been working phenomenally in the long run.
[quote="Michael":thw1bb7]I took a look at the article and see where it is coming from about VR not being necessary for the basic dog owner teaching tricks to their dog. There is a substantial difference when it comes to parrots that makes me disagree that VR isn't necessary for most pet owners. Parrots are naturally wild animals so except for the positive reinforcement taming and training that we give them, they are difficult to handle. In order to get our parrots to step up reliably, come out of the cage, flight recall to us, not bite us, etc we cannot keep giving treats every single time or the parrot will be stuffed with treats in the first 5 minutes it spends with us. The more good behavior that we can move to a diluted variable ratio reinforcement schedule, the more domesticated the parrot's behavior appears to be.[/quote:thw1bb7]
This is a good point: training a wild animal (such as a parrot) is very different from training a domesticated animal such as a dog. A dog is more likely to want to please its owner, whereas a parrot is not motivated to "people please" and, unless it is hungry enough to be motivated by treats, it loses interest.
You definitely get a lot more out of a parrot when rewarding on a VR. For example, if I am doing a session with my GCC and he gets full, he just flies away. I try to anticipate this and end the session before he loses interest but sometimes I miscalculate and he just flies off. You're not really going to have that problem with a dog since it doesn't have wings, but good luck chasing down a flighted bird and trying to get it to train once it's full.
[quote="Michael":thw1bb7]As a side effect it has also taught my parrot to try harder when training new behavior. Sometimes the behavior I am soliciting in training something new just isn't click-worthy for a while. Since my parrot is used to clicks confirming all correct behaviors, it makes her realize she is doing things wrong and makes her keep trying to get it right. Better yet, since she is so accustomed to training in this manner, she even varies the behaviors she offers until it's right enough to earn a click.[/quote:thw1bb7]
I've experienced something like this when training my Poi. When I was training "wings" I used capturing so at first it was just him lifting his wings at the shoulder, folded. I wanted him to extend his wings but after a month of working on it I wasn't seeing any progress whatsoever. I decided I'd withhold the click until I got even a tiny amount more. When he didn't get a click after a few tries he varied the behavior and really surprised me with a [url=http://www.youtube.com/watch?v=G9lzmUM3uyo:thw1bb7]complete lateral wing stretch[/url:thw1bb7]. I was only looking for a tiny stretch so that was far more than I was expecting! I clicked and made a big deal out of it and offered him a jackpot reward. When I cued "eagle" again he initially offered the folded shoulder lift but when that didn't get a click after he tried it twice, he fully stretched his wings out laterally again on the third try. The next time I cued eagle he immediately did the lateral stretch.
I know you could get similar results in shaping with a bird that wasn't used to the clicker as CRF and food as VR, but I don't know if you could get them as quickly, nor do I think the bird would vary its behavior so readily.
[quote="Michael":5j08a2ps]I took a look at the article and see where it is coming from about VR not being necessary for the basic dog owner teaching tricks to their dog. There is a substantial difference when it comes to parrots that makes me disagree that VR isn't necessary for most pet owners. Parrots are naturally wild animals so except for the positive reinforcement taming and training that we give them, they are difficult to handle.[/quote:5j08a2ps]
I think that's where some confusion comes in, Michael. The Baileys are trying to say new dog trainers don't need to go as far as VR. This makes sense because experienced trainers and dogs certainly do need VR for complex behaviors when being a service animal, show dog or police dog. New and hobbyist parrot owners are the same way. It's about the trainer and their skill level, not how "wild" an animal species is. New owners of parrots tend to still use CRF, just like dog owners.
[quote="Michael":5j08a2ps]As for the clicker, yes I use the clicker on a continuous reinforcement schedule.[/quote:5j08a2ps]
Yes! When a click is given for every response it is considered CRF (continuous reinforcement). Yet a CRF schedule is only used when we want to teach a new behavior, as stated by every Behavior Modification teacher and researcher out there (just one example is Raymond G. Miltenberger). Can you think why that is?
Just for the animal training world I can think of a few good reasons for practical applications. Clicking for every behavior is time-consuming and can be seen as annoying when presented onstage. It certainly isn't a necessary part of performances, even in shows with parrots.
[quote="Michael":5j08a2ps]I also give praise and attention on a pretty continuous reinforcement schedule. These are mild reinforcers and mostly secondary.[/quote:5j08a2ps]
Many trainers get confused about primary and secondary reinforcement. They often state that primary reinforcers are stronger than secondary ones. This of course is untrue. A primary reinforcer is anything a species needs to stay alive: food, shelter, air, water, sex, even companionship depending on the species. Steve Martin and Dr. Friedman go so far as to say that the ability to manipulate one's environment, also known as choice, is a primary reinforcer. Secondary reinforcers are anything that an individual of a species finds reinforcing because of repeated pairing with a primary reinforcer. This distinction hasn't often been made clear to the audience of the parrot forum and should be made more often.
[quote="Michael":5j08a2ps]The treats - which are the predominant thing the parrot is working for - are on a variable ratio reinforcement schedule. Thus I am successfully playing the best of both worlds where I can differentiate good from bad demonstration of behavior in a long chain or sequence yet get the parrot to complete all of that behavior for infrequent food rewards.[/quote:5j08a2ps]
This is the important part. Variable Reinforcement and Continuous Reinforcement cannot be mixed. This is because the entire schedule of reinforcement becomes CRF! It's just like you said, sometimes they get a click (as a secondary reinforcer) and sometimes they get food (as a primary reinforcer). This is still CRF, only the type of reinforcement is changing, which adds variety of course! It's always good to add excitement by adding variety.
So why do the teachers and researchers of Behavior Modification recommend using CRF only for teaching new behaviors?
Doesn't the bird get confused by using primary and secondary reinforcements in variable sequences and ratios?
From Michael's example:
[quote="Michael":3vao1617]For example I can execute the following sequence with the following outcomes:
turn around/incomplete/no click
wings/good/click & treat
In this manner the parrot learns success from failure throughout the sequence of behaviors (note they are not a chain because they are performed in random order and not repeated this way again) and continues trying for the sake of earning the more desirable food rewards on that variable ratio schedule. [/quote:3vao1617]
it appears to me that the bird might get confused about the tricks that are incomplete and not successfully executed... wouldn't it gradually perform THAT single trick worse and worse, because it would become unclear to him in the sequence (as he gets his food reward at the end anyway, and as he gets clicks for the other successful tricks anyway) which trick he missed and why?
I do agree with the fact that they are intelligent animals.
However, how precisely do you think they can discern from a multitude of tricks the one they misperformed, and to what extent they screwed up, and to what level to perform better NEXT time?
Also, wouldn't you think that this way, gradually with time, the bird will start misperforming more and more tricks out of the sequence?
Unless in such a case, you would go back and re-train / practice that one trick only, with a one click / one reward technique, until the bird gets it systematically right every time.
[quote="Pralina":xqeyudme]Doesn't the bird get confused by using primary and secondary reinforcements in variable sequences and ratios?[/quote:xqeyudme]
[quote="Pralina":xqeyudme]it appears to me that the bird might get confused about the tricks that are incomplete and not successfully executed... wouldn't it gradually perform THAT single trick worse and worse, because it would become unclear to him in the sequence (as he gets his food reward at the end anyway, and as he gets clicks for the other successful tricks anyway) which trick he missed and why?[/quote:xqeyudme]
No. The parrot realizes that the poor instance of the trick didn't even earn a click but the better one did. It still strives for repeating the better variants because the variants that always receive clicks, occasionally receive treats. The variants that never receive clicks, never receive treats. Therefore it is always worthwhile to demonstrate click-worthy behavior. I'm not making this up. Take a look at the videos demonstrating variable ratio reinforcement behavior. In reality it works even better than in the videos. Those are condensed and simplified just to make it easier to share. I could probably get Kili to run through her entire 20 tricks routine now for just a single treat. Since she gets clicks throughout the process, she both realizes she is doing things right and is encouraged to keep trying. If you try to get the animal to perform 10-20 behaviors without offering any feedback, there is a greater tendency to just figure it is doing it wrong and give up. Whereas if the clicks keep coming and the parrot is certain that some clicks must eventually earn treats, then it is worthwhile to continue and keep trying.
[quote="Pralina":xqeyudme]However, how precisely do you think they can discern from a multitude of tricks the one they misperformed, and to what extent they screwed up, and to what level to perform better NEXT time?[/quote:xqeyudme]
Very precisely. However, not in the scope of "next time." This works extremely well in the long term. Obviously the trick must be carefully and precisely taught first to a satisfactory standard. However, some improvement could always be made. It's always better to achieve longer, further, higher, stronger, better. These finishing touches often take months or years to achieve and are not in the scope of the initial training. By using my approach, I continue to stimulate long term improvement while soliciting the greatest number of trials for practice. The behavior becomes extremely robust and resistant to extinction. Because my parrots get to practice the tricks thousands of times, they can't forget them. I can pull tricks we haven't done in a year and they know immediately what to do on the first shot and do it well. This is because of the endless practice we have undergone. But how can you practice a routine of over 30 tricks (plus a lot of flight recall for exercise) on a continuous reinforcement schedule!? That would require 30 treats just to practice each trick once!!!! That's ridiculous! Yet when I get 10 tricks for the price of 1 treat, 10 treats and 100 clicks can reward the practice (and exercise, yes exercise: lifting feet, rolling over, etc. takes exercise to be able to do!) and give the parrot more chances to learn whether it is doing the trick right or not.
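The treat arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: it treats "10 tricks for the price of 1 treat" as a fixed average of 10 clicks per treat, and the helper name `treats_needed` is invented for the example.

```python
import math

def treats_needed(repetitions, clicks_per_treat):
    """Treats used if, on average, one click in `clicks_per_treat` earns a treat."""
    return math.ceil(repetitions / clicks_per_treat)

# One pass through a 30-trick routine on continuous reinforcement (1 click = 1 treat).
print(treats_needed(30, 1))    # 30

# 100 clicked repetitions at about 10 clicks per treat.
print(treats_needed(100, 10))  # 10
```

So the same 10 treats that would cover only 10 repetitions on continuous reinforcement cover roughly 100 clicked repetitions at this ratio.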
Actually, come to think of it, NOT using the clicker every time (when applying VR) would lead to diminishing performance. THEN the parrot tries to get away with a sloppy job, and since treats come on a VR anyway, it still gets occasionally rewarded for sloppy behavior. My approach ensures that the parrot knows which good or improving cases it is being rewarded for and which ones it isn't.
[quote="Pralina":xqeyudme]Also, wouldn't you think that this way, gradually with time, the bird will start misperforming more and more tricks out of the sequence?[/quote:xqeyudme]
No. That makes no sense. They get better over time. They get to practice the behavior many more times. Also they are used to working for a thinner reinforcement margin. In other words, it takes less motivation to get a parrot to perform when it is used to getting a single treat for 10 tricks than it does for a parrot that gets one every time. This allows for some great differential reinforcement and makes continuous reinforcement even more effective!
[quote="Pralina":xqeyudme]Unless in such a case, you would go back and re-train / practice that one trick only, with a one click / one reward technique, until the bird gets it systematically right every time.[/quote:xqeyudme]
Well there's no doubt the bird has to be doing it right in the first place. But I really can't think of many cases where I've ever had to go back to continuous reinforcement for a trick that was put on VR.
Look, it's one thing to talk about it and postulate that it won't work and another to actually see it happening. It's not a method I recommend for most people, but when used properly it is by far the most effective of everything I've come across. If you only think about VR in theory, you'd come to the same logical conclusion, that fewer treats means less performance, but when you actually apply it, you realize this is not the case.
Thanks for your reply. I really do understand the point you make.
However, don't get me wrong, but the reason why I'm asking so many questions is that I am reading a lot myself, including other people's theories, and I came across this article right before you created this post and your blog entry. So I find it interesting how... contradictory both approaches are.
The article I am talking about is Blazing Clickers by Steve Martin and Susan Friedman that was brought to my attention by another parrot trainer.
http://www.naturalencounters.com/docume ... sFINAL.pdf
[quote:t0j6ymeg]When persistence is required, the best approach is to first teach the new behavior with continuous reinforcement (click-treat) for the clearest communication of the behavior-consequence contingency. Next, gradually thin the reinforcers over time (known as stretching the reinforcement ratio) to the desired variable schedule changing the amount of behavior unpredictably while increasing the amount of behavior required for reinforcement overall. For example, if a trainer wants a lion to make several trips to a public viewing window each day, a variable ratio schedule of reinforcement (i.e., the [b:t0j6ymeg]click-treat together, no solo clicks![/b:t0j6ymeg]) would be the right tool.[/quote:t0j6ymeg]
Therefore, and I'm just asking here, doesn't it sound like it makes more sense to define the variable ratio by BOTH the schedule of the click and the treat, instead of ONLY the schedule of the treat?
Because in the end wouldn't you want the parrot to actually DO a behavior when you ask it (cue it) and therefore understand the cue without necessarily having to receive a click or treat for it every time?
What I mean is, you want to be able to say "Kili" and she flies to you no matter what, right? Not only in a training session, but just about all the time, without having to earn a click or a treat. So wouldn't it make more sense to withhold BOTH if using variable ratio training?
[quote="Michael":252huwqc]For example I can execute the following sequence with the following outcomes:
turn around/incomplete/no click
wings/good/click & treat[/quote:252huwqc]
You know, come to think of it, seems to me that you're using differential continuous reinforcement the whole time. After all, every single correct response that meets your criteria gets a click, yes? And the reward from the click is either a treat, or the opportunity to do another trick (which, considering how consistently and often you train, should probably also be considered a secondary reinforcer).
If you were REALLY using VR, it would look something like this:
Where even some correct performances no longer get a click.
Pralina - You bring up some excellent points! You pointed out that birds will not always know when they did a behavior "poorly" when on a Variable Ratio (VR) reinforcement schedule! Because we know it takes a lot longer to teach a new behavior by using VR, like teaching a bird how to wave, we stick to a Continuous Reinforcement (CRF) schedule until the trick is exactly how we like it. So as you said, this means that if a behavior was performed poorly we go off VR and back onto CRF until the "poor" behavior is eliminated by extinction (this is known as Differential Reinforcement of Alternative behavior, or DRA). Right on the money!
[b:3hnxuer5]Michael doesn't realize it but this is what he is doing when he does not click for the "poor" response. The only hard and fast rule is that once the good response does occur you always want to reinforce it so that the "poor" behavior goes on extinction via a DRA procedure.[/b:3hnxuer5]
You said that a good idea would be to re-teach the behavior on CRF in a separate training session. And indeed that can be done if the behavior is REALLY BAD! But say that an Amazon says the wrong word during a show, what do you do? You wait until he says the right word and then give the treat! It cannot be emphasized enough to [i:3hnxuer5]always, always, always[/i:3hnxuer5] give a reinforcement when the bird says the right word, when before he didn't. Again, this is called Differential Reinforcement of an Alternative Behavior.
[quote="Michael":3hnxuer5]I could probably get Kili to run through her entire 20-trick routine now for just a single treat. [b:3hnxuer5]Since she gets clicks throughout the process[/b:3hnxuer5], she both realizes she is doing things right and is encouraged to keep trying. If you try to get the animal to perform 10-20 behaviors without offering any feedback, there is a greater tendency to just figure it is doing it wrong and give up.[/quote:3hnxuer5]
You keep touting that this idea is unique and has never been done before. It has and continues to be used by cetacean trainers. Remember those whistles that they blow every few seconds? That is the equivalent of your click. Yeah, the whales don't get a fish when they are breaching and swimming either, just like your birds don't get a treat every time.
The thing is cetacean trainers are criticized for doing this. Why?
1) They claim it to be Variable Reinforcement.
2) They run into problems down the road. (slower less enthusiastic behaviors, "refusal" to do a behavior)
3) Those whistles are annoyingly unnecessary. Maybe it makes them feel important blowing a whistle every five seconds?
As already explained further up this thread, variable and continuous reinforcement cannot be combined; it has to be one or the other. If a trainer tries to combine them, the schedule of reinforcement always comes out to be a CRF! There is no avoiding it.
According to researchers John M. Pearce, Edward S. Redhead, and Aydan Aydin from the University of Wales, in a paper titled "Partial Reinforcement in Appetitive Pavlovian Conditioning with Rats", the theory of Extinction applies a little differently when used with partial vs. continuous reinforcement.
What they found (and what had been shown even before this paper) is that if extinction is applied to a behavior on CRF, that behavior breaks down really quickly. But if extinction is applied to behavior on partial reinforcement, that behavior breaks down really slowly.
So what does this mean for you? Think about how strong Kili's behavior is. You relate that to the clicker being on a CRF. But is it really? With the above knowledge I can come up with a test for Kili. Test to see what happens when the click stops occurring and you only give a food reward on a VR schedule as usual. If the behavior breaks down quickly then the click is a CRF. If not then something else is going on here.
My bet is that the bird is currently "tuning out" the click as extraneous information. The bird's cue has likely evolved into your hand moving towards Kili for the delivery of the treat.
But by stating that "By using my approach....The behavior becomes extremely robust and resistant to extinction." you have already outed the answer to my test. Your clicker is NOT on Continuous Reinforcement even though it looks like it is. And this can mean only one thing, it's not acting as a reinforcer anymore. Your bird is tuning it out.
It's not magic, it's science. We know that when a clicker (a secondary reinforcer) stops being paired with treats (a primary reinforcer) the click no longer stays a secondary reinforcer.
Those crazy researchers! What will they think of next!?