Preview
In this lesson we cover another method of learning, called learning by correcting mistakes.
- An agent reaches a decision.
- The decision turns out to be incorrect, or suboptimal.
- Why did the agent make that mistake?
- Can the agent correct its own knowledge and reasoning so that it never makes that same mistake again?
Example 1 (change lane) I’m driving and I decide to change lanes. As I change lanes, I hear cars honking at me. Clearly I made a mistake.
- What knowledge and what reasoning led to that mistake?
- Can I correct it, so that I don’t make the mistake again?
Learning to correct mistakes is our first lesson in meta-reasoning. We’ll start today by revisiting explanation-based learning. Then we’ll use explanations for isolating mistakes. This will be very similar to diagnosis, except that here we’ll be using explanations to isolate mistakes in the agent’s own knowledge. This will make it clear why explanation is so central to knowledge-based AI. Then we’ll talk about how we can use explanations for correcting mistakes, which will set up the foundation for the subsequent discussion on meta-reasoning.
Exercise Identifying a Cup
To illustrate learning by correcting mistakes, let’s go back to an earlier example.
We encountered this example when we were discussing explanation-based learning.
So imagine again that you have bought a robot from the Acme hardware store, and in the morning, you told your robot, go get me a cup of coffee. Now the robot already is bootstrapped with knowledge about the definition of the cup.
A cup is an object that is stable and enables drinking. The robot goes into your kitchen and can’t find a single clean cup. So it looks around.
This is a creative robot and it finds in the kitchen a number of other objects.
One particular object has this description. The object is light and it’s made of porcelain. It has decorations and it has concavity and a handle. And the bottom of this object is flat. Now the robot decides to use this object as a cup, because it can prove to itself that this object is an instance of a cup.
It does so by constructing an explanation. The explanation is based on the fact that the bottom is flat, that it has a handle, that the object is concave, and that it is light. Let us do an exercise together that will illustrate the need for learning by correcting mistakes. Shown here are six objects, and there are two questions. The first question is: which of these objects do you think is a cup? Mark the button on the top left if you think that a particular object is a cup.
The second question deals with the definition of a cup that we had in the previous screen. So mark the button on the right as solid, if you think that that particular object meets the definition of the cup in the previous screen.
Exercise Identifying a Cup
Let us build on David’s answers. Let us suppose that the robot goes to the kitchen and finds this pail. It looks at the pail and decides that it meets the definition of a cup, denoted here by the solid circle. The robot brings water to you in this pail. You look at the pail, and you say to the robot: no, robot, this is not a cup. At this point you would expect the robot to learn from its failure.
Cognitive agents do a lot of learning from failures. Failures are opportunities for learning. We would expect robots, and intelligent agents more generally, to learn from their failures as well. How then may a robot learn from the failure of considering this pail to be a cup?
Note that the problem is not limited to this particular pail. We can take a different example involving this particular cup. Imagine that the definition of a cup included a statement that it must have a handle; in that case, the robot may not recognize that this is a cup.
Later on you may teach the robot that this in fact is a good example of a cup, because it is liftable. In that case, the robot will want to learn from that failure. It will want to understand why it did not consider it to be a cup when it should have.
So the problem is not just about successes that turned out to be failures, but also about failures that should have been successes.
Questions for Correcting Mistakes
This problem of identifying the error in one’s knowledge that led to a failure is called Credit Assignment. Blame assignment might be a better term.
A failure has occurred; what fault or gap in one’s knowledge was responsible for the failure? This is blame assignment. In this lesson we’ll be focusing on gaps or errors in one’s knowledge. In general, the error could also be in one’s reasoning or in one’s architecture, and credit assignment applies to all of those different kinds of errors. Several theorists, Marvin Minsky for example, consider credit assignment to be the central problem in learning.
This is because AI agents live in dynamic worlds. Therefore we will never be able to create an AI agent which is perfect. Even if we were to create an AI agent with complete knowledge and perfect reasoning relative to some world, the world around it would change over time. As it changes, the agent will start failing. Once it starts failing, it must have the ability to correct itself: to correct its own knowledge, its own reasoning, its own architecture.
You can see again how this relates to metacognition. The agent is not diagnosing some electrical circuit, or a car, or a software program outside itself. Instead, it is self-diagnosing and self-repairing.
Visualizing Error Detection
As we mentioned previously, in general, the errors may lie in the knowledge, in the reasoning, or in the architecture of an agent, and therefore learning by correcting errors might be applicable to any one of those.
However, in this lesson we will be focusing only on errors in knowledge; in particular, we’ll be focusing on errors in classification knowledge. Classification, of course, is a topic that we have considered in this class repeatedly. Let us consider an AI agent that has two experiences of executing an action in the world.
In the first experience, the agent used this object as a cup and got the feedback that this indeed was a cup. So this was a positive experience.
This, on the other hand, is a negative example. Here, the agent viewed this as a cup and got the feedback that this should not have been viewed as a cup.
We can visualize the problem of identifying which knowledge element led the agent to incorrectly classify this as a cup as follows. The circle on the left consists of all the features that describe the positive example. The circle on the right consists of all the features that describe the negative example. Features in the left circle might be things like: there is a handle, there is a question mark on the exterior, there is a blue interior, and so on. The circle on the right consists of features that characterize the negative example: it has a movable handle, it has a red interior, it has red and white markings on the outside. There are some features that characterize only the positive experience and not the negative experience. There are those that characterize only the negative experience and not the positive experience.
There are also many features that characterize both the positive and the negative example. For example, they are both concave, they both have handles, and so on. In this example, it is the features that characterize only the negative experience that are especially important.
We’ll call them false-suspicious features. We call them false-suspicious features because, first, they characterize only the negative experience; and second, one or more of these features may be responsible for the fact that the agent classified this as a positive example when in fact it was a negative example.
As an example, suppose that this feature corresponds to a movable handle. This is a false-suspicious feature. It is false because this experience was a false positive.
It is suspicious because it does not characterize the positive experience.
And thus it may be one of the features responsible for the fact that this was a negative example. But now there is an additional problem: there are a number of false-suspicious features here. So how will the agent decide which false-suspicious feature to focus on? We encountered this problem earlier, when we were talking about incremental concept learning. At that point we said that we wanted to give examples in an order such that each succeeding example differed from the current concept definition in exactly one feature,
so that the agent knows exactly which feature to focus on. The same kind of problem occurs here again: how might the agent know which feature to focus on? One possible idea is that it could try one feature at a time and see if it works. That is, it could select this feature, repeat the process, get more feedback, and either accept or eliminate the feature. An alternative is for the agent to receive not just two experiences but many such experiences. Suppose there were other positive experiences that covered this part of the circle.
That would leave only this as a false-suspicious feature, and then the agent could focus its attention on it. As an example, just as this dot may correspond to a movable handle, this one may correspond to a red interior, because a red interior is one of the features that characterizes the negative example and not the positive example. But later on, another positive example of a cup may come along that has a red interior, in which case the agent can exclude this particular feature. The reverse of this situation is also possible.
Let us suppose that the agent decides that this is not a cup, perhaps because its definition says that something with a blue interior is not a cup. Therefore, it does not bring water to you in this cup, and it tells you that there is no cup available in the kitchen. You go to the kitchen.
You see it and you say: well, this is a cup. Now the agent must learn why it decided that this was not a cup. In this case, the relevant features are these three: the three features that characterize this cup but do not characterize the other experiences. This dot may correspond to a blue interior, and this dot may correspond to a question mark on the exterior. We’ll call these features true-suspicious, just as we called the others false-suspicious.
These are the features that prevented the agent from deciding that this was a positive example of a cup. One or more of these features may be responsible for the agent’s failure to recognize that this was a cup.
Error Detection Algorithm
As you can see, we’re taking unions and intersections of the features characterizing different examples. The number of examples, both positive and negative, needed for this algorithm to work well depends upon the complexity of the concept. In general, the more features there are in the description of the object, the more examples you will need to identify the features that were responsible for a failure.
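To make the set operations concrete, here is a minimal sketch of the error-detection step, assuming each experience is represented simply as a set of feature strings. The feature names and the exact formulation of the operations are illustrative assumptions, not the course’s notation.

```python
# A minimal sketch of error detection over feature sets.
# Assumes each experience is just a set of observed features (illustrative names).

def false_suspicious(positives, negatives):
    """Features present in every negative experience but in no positive one."""
    return set.intersection(*negatives) - set.union(*positives)

def true_suspicious(positives, negatives):
    """Features present in every positive experience but in no negative one."""
    return set.intersection(*positives) - set.union(*negatives)

# The cup (positive) and the pail (negative) from this lesson:
cup = {"has handle", "handle is fixed", "concave", "light", "flat bottom"}
pail = {"has handle", "handle is movable", "concave", "light", "flat bottom",
        "red interior"}

print(false_suspicious([cup], [pail]))  # {'handle is movable', 'red interior'} (order may vary)
print(true_suspicious([cup], [pail]))   # {'handle is fixed'}

# A later positive example of a cup with a red interior narrows the suspects:
red_cup = {"has handle", "handle is fixed", "concave", "light", "flat bottom",
           "red interior"}
print(false_suspicious([cup, red_cup], [pail]))  # {'handle is movable'}
```

As the lesson notes, the more features each description contains, the more positive and negative experiences are needed before the suspicious sets shrink to the features actually responsible for the failure.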
Explanation-Free Repair
So let us look at the result of the kind of learning technique we are discussing here. Here might be the old concept definition of a cup. In this particular case, the concept definition has been put in the form of a production rule. Here is the new concept definition for a cup. It is almost identical to the previous definition, except that now the object not only has a handle, but the handle is also fixed. This is similar in many ways to incremental concept learning.
You may recall that in incremental concept learning, at any particular stage of processing there was a current concept definition. As new examples came in, the concept definition changed depending on the new example and the current concept definition. Note that in this manner of concept revision, the number of features in the if-clause may become very large very quickly. Here we have ‘object has a handle’ and ‘handle is fixed’.
We could keep adding additional features, like ‘the interior is blue’, to cover all the positive experiences. The difficulty is that, at present, there is no understanding of why the fact that the handle is fixed is an important part of the cup definition. That requires an explanation.
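Here is a minimal sketch of this explanation-free repair, representing the production rule as a simple set of conditions; the data structure is an assumption, and the rule text follows this section.

```python
# A minimal sketch of explanation-free repair of the cup rule.
# A rule is represented as a set of if-conditions plus a conclusion (an assumption).

old_rule = {
    "if": {"object has a handle", "object is concave",
           "object is light", "object's bottom is flat"},
    "then": "object is a cup",
}

# The repair simply conjoins the distinguishing feature onto the if-clause,
# with no account of why that feature matters.
new_rule = {
    "if": old_rule["if"] | {"handle is fixed"},
    "then": old_rule["then"],
}

def matches(rule, features):
    """An object satisfies the rule if it exhibits every condition."""
    return rule["if"] <= features
```

Notice that nothing in `new_rule` records why ‘handle is fixed’ matters; each new failure would just add another unexplained condition to the if-clause, which is exactly the difficulty the lesson points out.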
Why is it that the handle being fixed is an important part of the definition of a cup? This is one of the key differences between knowledge-based AI and other schools of AI. Classification is ubiquitous in many schools of
AI as we have discussed earlier. Explanation however is a key characteristic of knowledge-based AI. Explanation leads to deeper learning. It not only says here are the features that result in a concept definition, it also says and here is why these features are important for a concept definition.
This brings us to the second question on learning from failures. Now we want the agent to explain why a particular fault in its knowledge led to its failure.
Explaining the Mistake
You may recall this explanation from our lesson on explanation-based learning.
There, the agent constructed an explanation like this to show that a specific object was an example of a cup. In the case of the pail, the agent may have constructed a similar explanation with the object replaced by the pail everywhere. Now, however, the agent knows that the pail is not an example of a cup.
Something is not quite right with this explanation. We have also just seen how the agent can identify the false-suspicious relationship in this explanation.
It has identified that the handle must be fixed, because that is the feature that separates the positive experiences from the negative experiences. The question then becomes: how can this explanation be repaired by incorporating ‘handle is fixed’? Where should ‘handle is fixed’ go?
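To make the question concrete, here is a minimal sketch of the cup explanation as a proof tree, with each assertion mapped to the assertions that support it. The structure follows the explanation-based-learning lesson, but the exact wording and the data structure are assumptions.

```python
# A minimal sketch of the explanation as a proof tree (assertion -> supporting assertions).
# Wording of the assertions is illustrative.

explanation = {
    "object is a cup": ["object is stable", "object enables drinking"],
    "object is stable": ["object's bottom is flat"],
    "object enables drinking": ["object carries liquids", "object is liftable"],
    "object is liftable": ["object is light", "object has a handle"],
}

# The repair question: to which assertion should the false-suspicious relation
# 'handle is fixed' be attached as an extra support? The candidates lie on the
# path from 'object has a handle' up to the root.
candidate_sites = ["object is liftable", "object enables drinking", "object is a cup"]
```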
Discussion Correcting the Mistake
What do you think, is this a good way to fix the agents error?
Discussion Correcting the Mistake
David’s answer was good, but not necessarily optimal. I think he put ‘handle is fixed’ at this particular place in the explanation because of the first reason here:
he wanted to capture the notion that only fixed-handle cups enable drinking,
and this spot does that. However, this is not the best way of fixing this particular explanation, because it leads to additional errors. It suggests that only those things which have a fixed handle are liftable; but of course, from the pail we know that the pail does not have a fixed handle and yet it is liftable. This suggests that ‘handle is fixed’ should go somewhere else: above ‘object is liftable’, but still below ‘object enables drinking’. The agent, too, can figure out that this is not the best place for ‘handle is fixed’, because, first, it violates its notion of a pail: a pail is liftable, but has a movable handle.
It also violates its notion of a briefcase. This part of the explanation came from the example of a briefcase: a briefcase has a movable handle, and it too is liftable. This is how the agent knows that ‘handle is fixed’ should go somewhere else in this explanation: above ‘object is liftable’, but beneath ‘object enables drinking’.
Correcting the Mistake
So the agent figured out that ‘handle is fixed’ should go beneath ‘object enables drinking’, but not beneath ‘object is liftable’. So the agent will put ‘handle is fixed’ here, in this particular explanation.
This is correct. If the agent has background knowledge that tells it that the reason ‘handle is fixed’ is important is that it makes the object manipulable, which in turn enables drinking,
then the agent can insert this additional assertion here in the explanation.
Even more, if the agent has additional background knowledge that tells it that the fact that the object has a handle and that the handle is fixed together make the object orientable, which is what makes the object manipulable, then the agent may come up with a richer explanation. The important point here is that, as powerful and important as classification is, it alone is not sufficient.
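As a concrete illustration, here is a minimal sketch of the repaired and enriched explanation, extending the proof-tree dictionary sketched earlier. The intermediate assertions ‘object is manipulable’ and ‘object is orientable’ come from the assumed background knowledge described above, and their exact arrangement is illustrative.

```python
# A minimal sketch of the repaired explanation as a proof tree.
# The placement of the new assertions is an illustrative assumption.

repaired_explanation = {
    "object is a cup": ["object is stable", "object enables drinking"],
    "object is stable": ["object's bottom is flat"],
    # 'handle is fixed' now supports drinking via manipulability,
    # not via liftability (which the pail and the briefcase contradict).
    "object enables drinking": ["object carries liquids",
                                "object is liftable",
                                "object is manipulable"],
    "object is manipulable": ["object is orientable"],
    "object is orientable": ["object has a handle", "handle is fixed"],
    "object is liftable": ["object is light", "object has a handle"],
}
```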
There are many situations in which explanation, too, is very important. Explanation leads to richer learning, deeper learning.
Connection to Incremental Concept Learning
There is one more important point to be noted here. This again is the illustration from incremental concept learning. When we were talking about incremental concept learning, we talked about a technique for learning; we did not talk about how the learned concepts were going to be used.
Here, in correcting mistakes, we are talking about how the agent actually uses the knowledge it learns. This point, too, is central, for two reasons. The first reason is that knowledge-based AI looks at reasoning and at action, and at how knowledge is going to be used,
and then determines what knowledge is to be learned; that is, it sets the target for learning. Secondly, you may recall this particular figure of the agent architecture that we drew earlier. You may see that reasoning, learning, and memory are closely connected, and all of that occurs in the service of action selection. This figure suggests that we not only learn so that we can do action selection, but additionally, as we do action selection and get feedback from the world, that feedback informs the learning.
Once again, failures are great opportunities for learning. One additional point to be made here: learning by correcting mistakes treats learning as a problem-solving activity.
An agent meets a failure and needs to learn from it. It converts this learning task into a problem-solving task: first, identify what knowledge led to the failure; then build an explanation for it; then repair it. This learning is closely intertwined with memory, reasoning, action, and feedback from the world. Notice also that there is reasoning, learning, and memory here in the deliberation module, closely connected with the metacognition module. In the deliberation module, the reasoning, memory, and learning may be about action selection in the world,
but the metacognition module may have its own reasoning, learning, and memory capacities, and some of the learning in metacognition is about fixing the errors in the deliberative reasoning. So metacognition is thinking about thinking. The agent used its knowledge to think about action selection and carried out those actions in the world; metacognition is thinking about what went wrong in that original thinking. What was the knowledge error?
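A minimal sketch of this deliberation/metacognition split, assuming a toy agent whose deliberation classifies cups from feature sets and whose metacognition assigns blame and repairs the deliberative knowledge; the class and method names are illustrative, not the course’s terms.

```python
# A toy sketch: deliberation acts in the world, metacognition repairs its knowledge.
# All names and the feature-set representation are illustrative assumptions.

class Deliberation:
    """Reasons over domain knowledge to select actions (here: classify cups)."""
    def __init__(self, cup_conditions):
        self.cup_conditions = set(cup_conditions)

    def is_cup(self, features):
        return self.cup_conditions <= set(features)

class Metacognition:
    """Thinks about the deliberation module's thinking: assigns blame and repairs."""
    def __init__(self, deliberation):
        self.deliberation = deliberation

    def learn_from_failure(self, positive_features, negative_features):
        # Blame assignment: which features separate the true cup from the false one?
        distinguishing = set(positive_features) - set(negative_features)
        # Repair: fold those features back into the deliberative knowledge.
        self.deliberation.cup_conditions |= distinguishing

agent = Deliberation({"has handle", "concave", "light"})
meta = Metacognition(agent)
# The pail was wrongly classified as a cup; learn from the failure.
meta.learn_from_failure(
    positive_features={"has handle", "handle is fixed", "concave", "light"},
    negative_features={"has handle", "handle is movable", "concave", "light"},
)
print(agent.cup_conditions)  # now also requires 'handle is fixed'
```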
We’ll return to metacognition in the lesson on metareasoning.
Assignment Correcting Mistakes
So how would you use learning by correcting mistakes to design an agent that can answer Raven’s Progressive Matrices? On one level, this might seem easy. Your agent is able to check whether its answers are correct, so it is aware of when it makes a mistake. But knowing that it has made a mistake merely triggers the process of correcting the mistake.
It doesn’t correct it itself. So how will your agent isolate its mistake?
What exactly is it isolating here? Once it has isolated the mistake, how will it explain the mistake? And how will that explanation then be used to correct the mistake so it doesn’t make the same mistake in the future? In this process we can ask ourselves: will your agent correct the mistake itself, or will you use the output to correct the mistake in your agent’s reasoning?
Will you look at what it did and say, here’s the mistake it made. So next time it shouldn’t make that mistake.
If you’re the one using your agent’s reasoning to correct your agent, then, as we’ve asked before, who’s the intelligent one? You? Or your agent?
Wrap Up
So today, we’ve been talking about learning by correcting mistakes.
We started off by revisiting explanation-based learning and incremental concept learning. In those lessons we were dealing with a small number of examples coming in one by one. We dealt with the same thing here, but here we had the additional feedback about whether or not our initial conclusion was right or wrong. We then talked about isolating mistakes.
This was similar to our earlier work on diagnosis: given a mistake, how do we narrow down our reasoning to find where that mistake occurred?
Then we talked about explaining the mistake.
This is key both to building human-like agents and to enabling the agent to correct its mistake. Only by explaining the mistake can the agent correct it efficiently, and after those phases correction becomes a much more straightforward task. Next time we’ll talk about meta-reasoning in more detail.
For example, here we’ve discussed mistakes in knowledge, but what about mistakes in reasoning? What about mistakes in architecture? What about gaps instead of mistakes, where we don’t know something instead of knowing something incorrectly?
We’ll talk about all that next time.
The Cognitive Connection
Learning by correcting mistakes is a fundamental process of human learning.
In fact, it may closely resemble the way you and I learn and practice. In our lives, we are rarely passive learners. Most of the time we are active participants in the learning process. Even in a didactic setting like this, you’re not just listening to what I’m saying.
Instead, you’re using your knowledge and reasoning, to make sense of what I’m saying. You generate expectations. Sometimes those expectations may be violated.
When they’re violated, you generate explanations for them. You try to figure out what was in error in your knowledge and reasoning. This is learning by correcting mistakes. Notice that you are thinking about your own thinking, a step towards meta-reasoning, which is our next lesson.
Deep, RL and Bayesian Learning
A second area where this manifests is when we see that transformers and other attention-based models perform poorly due to a failure to attend to the necessary knowledge that is available. The underlying issue can be due to vanishing gradients, which create an inverse law of retrieval capability with distance from other query items.
Correcting mistakes in Reasoning
Reasoning gaps often manifest as hallucinations in LLMs. While there may be many reasons for this, one of my hypotheses is that this could be a reasoning gap. Once there is a fault in the reasoning state, the AI seems unable to recover. You can point it out and ask it to work around it, but it seems that once the state contains a contradiction, anything can be proved and the LLM cannot recover.
This suggests that we should research how we can use methods like learning by correcting mistakes in deep models. This might include:

- the ability to detect various fallacies at the state level,
- the ability to backtrack inference to a point before the failure,
- the ability to add a constraint or an oracle preventing the contradiction from recurring.

For transformers we might add some attention heads dedicated to these tasks:

- a bias-correction head,
- a circular-reasoning head.

However, the fact that LLMs approximate a probabilistic distribution is also an issue, unless the underlying distribution they approximate is more than just a language model.

- For example, we may want to model an error-correcting language model.
- Perhaps another possibility is to have a logos backbone that would be tasked with:
  - keeping track of a large number of logical atoms,
  - keeping track of relations,
  - keeping track of their validity context (this is a language-logic interface),
  - keeping track of different modes of reasoning,
  - detecting fallacies, contradictions, wishful thinking, etc.
However, incremental concept learning and learning by correcting mistakes build reasoning trees with atomic branches. The atomicity facilitates isolating the part of the tree that is at fault. Once this diagnosis pinpoints the problem in the formal model, it can be revised.
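As a rough sketch of that idea, here is a toy reasoning trace kept as a list of atomic steps with explicit supports, so that a detected contradiction can be blamed on a single step and everything depending on it can be retracted. The step format and the contradiction checker are assumptions for illustration, not an existing library.

```python
# A toy sketch: atomic reasoning steps with explicit supports, so a fault can be
# isolated and retracted. The trace format and checker are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Step:
    claim: str
    supports: List[int] = field(default_factory=list)  # indices of earlier steps

def first_faulty_step(trace: List[Step],
                      contradicts: Callable[[str, str], bool]) -> Optional[int]:
    """Blame assignment: the earliest step that contradicts an earlier one."""
    for i, step in enumerate(trace):
        if any(contradicts(step.claim, earlier.claim) for earlier in trace[:i]):
            return i
    return None

def retract(trace: List[Step],
            contradicts: Callable[[str, str], bool]) -> List[Step]:
    """Repair: drop the faulty step and every step that depends on it."""
    i = first_faulty_step(trace, contradicts)
    if i is None:
        return trace
    bad = {i}
    for j, step in enumerate(trace):
        if any(s in bad for s in step.supports):
            bad.add(j)
    return [s for j, s in enumerate(trace) if j not in bad]
```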
Citation
@online{bochman2016,
author = {Bochman, Oren},
title = {Lesson 23 {Learning} by {Correcting} {Mistakes}},
date = {2016-02-17},
url = {https://orenbochman.github.io/notes/cognitivie-ai-cs7637/23-learning-by-correcting-mistakes/23-learning-by-correcting-mistakes.html},
langid = {en}
}