Learning is a capability that an agent must have if it is ever to carry out a task more effectively than in previous attempts (accidental successes aside).

An agent incapable of learning will always react the same way to the same environment given the same task. A learning agent, on the other hand, may be able to take advantage of experience to improve its reaction to the situation.

Agents not capable of learning have other disadvantages, as noted by Lashkari et al. (1994, p.1). Such agents must be provided with the information they need by some other method, which means laborious pre-programming and subsequent maintenance of the information as circumstances change. In some cases it may not be possible to pre-load the required information into an agent, because the information may not be available - as in the case of any agent that must adapt itself to an as-yet-unknown human user.

We must make a distinction here between learning and mere data storage. Learned information can be used by the agent. A news-gathering agent, for example, might locate a news article about some world event and store the article for later transmission to the user, but the information about the event is not available to the agent. The fact of location (and many other items of information about the news article, such as its size [1]) is available to the agent and may well be used by it in directing its actions, but the content of the article is not.

To be properly classifiable as learned, an item of information must not merely be obtained, or even acted upon; it must also be usable later (at least in principle) in the execution of another task. Mere storage ability does not constitute learning.

One kind of learning, adaptiveness (Etzioni, 1995), is the ability to be moulded to fit a particular user or group of users (human or otherwise). Foner (1993, p.35) regards this attribute (which he calls "personalizability") as essential. He notes that not all users do the same tasks, and that different users may do the same task in different ways. To properly represent a user, therefore, an agent must be able to be shaped to fit the way that the specific user would carry out the task which he or she is delegating to the agent.

Assuming an agent to be capable of accumulating information, where does the information it accumulates come from? (To avoid endless circumlocution, terminology such as "know" and "fact" will be used for the remainder of this chapter.)

There are really only four ways for information to arrive:

  1. It may be built into the structure of the agent. This is information provided by the creator of the agent, or in some way delivered directly into whatever storage for such things the agent may have, prior to or immediately upon execution.

  2. It may be perceived directly from the environment. Such a fact might be "at such-and-such a time the temperature was 23 degrees Celsius".

  3. It may be obtained from another entity (such as another agent).

  4. It may be derived from other information already known. For example, if several consecutive temperature measurements are taken and each reveals a lower temperature than the last, the agent may derive the fact that the temperature is falling.
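The difference between the second and fourth methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is_falling, the dictionary known_facts and the sample readings are hypothetical, and "falling" is taken to mean that every reading is strictly lower than the one before it.

```python
from typing import Sequence


def is_falling(readings: Sequence[float]) -> bool:
    """Return True if each reading is strictly lower than the one before it."""
    # At least two readings are needed to say anything about a trend.
    if len(readings) < 2:
        return False
    return all(later < earlier for earlier, later in zip(readings, readings[1:]))


# Perceived facts (method 2): hypothetical consecutive measurements in degrees Celsius.
perceived = [23.0, 22.4, 21.9, 21.1]

# Derived fact (method 4): obtained from facts already known, not from a new observation.
known_facts = {"temperature_falling": is_falling(perceived)}
```

The derived fact is stored alongside the perceived ones, so it is available for later tasks in the same way as any other learned information.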

The first of these methods is not really "learning" as such, and I will not treat it further beyond noting that the distinction between the first and third methods is blurred in circumstances where agents are sharing information directly in the ways described by Fulbright and Stephens (1994). In such a case, the participating agents could be seen to be learning as a group.

The last three methods have something in common: the accumulation of knowledge will take time, for it can arrive only through experience. We could say that derived information does not require experience, but in practice this is not so. If the agent were to attempt to derive all derivable facts from whatever a priori knowledge it had when it began execution, it would suffer a combinatorial explosion in all but the most trivial of cases. If the agent were able to derive meta-facts (facts about the facts), even a trivial agent would have a potentially infinite number of facts at its disposal! Experience - and thus the passage of time - is required for the agent to be able to know what derivable facts are relevant ("relevant" meaning related to and useful in the execution of its assigned tasks).
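A rough feel for the combinatorial explosion can be had from a small calculation. The sketch below rests on a purely illustrative assumption that is not part of the argument above: that any combination of two or more known facts might serve as the premises of one derived fact. The function name candidate_derivations is hypothetical.

```python
def candidate_derivations(n_facts: int) -> int:
    """Count the subsets of two or more facts drawn from n_facts known facts."""
    # All subsets (2**n) minus the empty set (1) and the single-fact subsets (n).
    return 2 ** n_facts - n_facts - 1


# Print the number of candidate premise sets for a few sizes of fact base.
for n in (5, 10, 20, 40):
    print(n, candidate_derivations(n))
```

Even under this simple assumption the count roughly doubles with each additional fact, and it ignores meta-facts and facts derived from other derived facts altogether.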

So what learned information is relevant? While the basic premise of learning is the accumulation of information for use later, there are several domains in which an agent might accumulate relevant information - about its user, about its environment and about itself.

It would be easy to say, as Goodwin (1993, p.4) does, that the client is simply part of the environment and thus collapse the first two into one. However, the information to be gathered from the client is qualitatively different to information to be gathered from the environment. The former is information about what the task is, while the latter is part of actually doing the task. Making this distinction requires us to also distinguish between information about the client and information provided by the client - information about the client may well be part of the environment.

Since experience is necessary for a learning agent to accumulate relevant information, it is clear that time will be needed in which to gain that experience. The more learned information that is needed, the longer that experience will take to gain. In cases where the task relates largely to learned information - such as an information filter - a serious flaw is exposed: until the required experience has been gained, the agent will not perform very well. Unfortunately, it must continue to be used; otherwise it will not gain the required experience. The people who have laboriously trained Apple Newtons [2] to recognise their handwriting will attest to the seriousness of this flaw, other aspects of which will be discussed further in a later section.

To the extent that learning agents rely upon learned information to function effectively, they are also by definition restricted to dealing effectively with situations that they have experienced before, or at least situations that are similar to previously experienced situations.

Lashkari et al. (1994, p.1) sum these problems up very neatly and suggest collaboration between agents as one way to overcome them. While tightly coupled agents such as Fulbright and Stephens' Type 2 through Type 7 models could collaborate by directly sharing information, collaboration in the sense intended by Lashkari et al. certainly requires communication between agents at a much higher level.

In a collaborative environment, agents request and obtain information from other agents; thus a brand new agent may have access to relevant information without having to obtain it through personal (so to speak) experience.
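A minimal sketch of this idea follows. It is not the mechanism described by Lashkari et al.; the Agent class, its methods, and the example fact are all hypothetical, and the "communication" is reduced to a direct method call for brevity.

```python
from typing import Dict, List, Optional


class Agent:
    """A hypothetical agent that keeps learned facts and can consult its peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.facts: Dict[str, str] = {}
        self.peers: List["Agent"] = []

    def learn(self, key: str, value: str) -> None:
        # Record a fact gained through the agent's own experience.
        self.facts[key] = value

    def answer(self, key: str) -> Optional[str]:
        # Reply to a request from another agent; None means "I do not know".
        return self.facts.get(key)

    def lookup(self, key: str) -> Optional[str]:
        # Prefer the agent's own knowledge.
        if key in self.facts:
            return self.facts[key]
        # Otherwise ask each peer in turn and remember the first answer received.
        for peer in self.peers:
            value = peer.answer(key)
            if value is not None:
                self.facts[key] = value
                return value
        return None


veteran = Agent("veteran")
veteran.learn("preferred_summary_time", "morning")

novice = Agent("novice")  # brand new: no experience of its own
novice.peers.append(veteran)
result = novice.lookup("preferred_summary_time")
```

The new agent ends up holding the fact itself, so later requests no longer depend on the peer being reachable.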

In Goodwin's terms, learning as it has been considered is the process of changing the state (or perhaps it would be better expressed as the contents) of the internal world model of a deliberative agent. What of the situation where the agent changes the model itself in response to the environment or other acquired information? This is a quite different situation, and I am not sure that it can be called "learning". If we consider learning about learning, what about learning about learning about learning...?

The field of genetic algorithms is concerned with finding appropriate solutions to problems using an evolutionary approach, which is based on changing the structure of agents to better carry out their tasks. It is not necessarily learning as we have defined it above, but it is certainly a form of adaptiveness and a mechanism by which the experience of an agent (or rather the experience of a type of agent) produces changes in the effectiveness of the agent, so it is worth a mention in this context. An excellent introduction to the basic ideas behind genetic programming can be found in Cantú-Paz (1995, pp.1-2). A step-by-step introduction to some of the techniques used is found in Dewdney (1988).

Basically, the approach consists of placing several software entities in an environment, providing some mechanism for them to change, then allowing them to interact. Periodically those which are least successful are removed. Properly constructed, such a system creates increasingly successful entities, mimicking the way biological evolution creates ever more successful living creatures.
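The loop just described can be sketched very simply. The following is a toy under stated assumptions, not a technique taken from the cited works: each "entity" is a single integer, success is closeness to a hypothetical TARGET value, change is a random step of one, and interaction between entities is omitted entirely. All names are illustrative.

```python
import random
from typing import List

TARGET = 42  # hypothetical environment: entities closer to this value are more successful


def fitness(entity: int) -> int:
    # Higher is better; zero means the entity matches the target exactly.
    return -abs(entity - TARGET)


def mutate(entity: int, rng: random.Random) -> int:
    # The mechanism for change: a random step up or down.
    return entity + rng.choice([-1, 1])


def evolve(population: List[int], generations: int, rng: random.Random) -> List[int]:
    for _ in range(generations):
        # Rank entities from most to least successful.
        population = sorted(population, key=fitness, reverse=True)
        # Remove the least successful half...
        survivors = population[: len(population) // 2]
        # ...and refill the population with changed copies of the survivors.
        offspring = [mutate(entity, rng) for entity in survivors]
        population = survivors + offspring
    return population


rng = random.Random(0)  # seeded so that runs are repeatable
start = [rng.randint(0, 100) for _ in range(8)]
final = evolve(start, generations=200, rng=rng)
```

Because the survivors are carried over unchanged, the most successful entity in each generation is never worse than in the one before. The sketch assumes an even population size; with an odd size the population would shrink by one in the first generation.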


Though not necessary for an agent to be effective, learning is necessary for an agent to become more effective. In many cases, learning is the only practical way for an agent to become effective at all.

[1] For example, the physical size of the article might be such that the agent must cease searching and report back to the user to "unload". Some internal content may be available such as the date and time of the article, which might be used to decide whether or not to gather the article at all. In general however, the content of the article will not be relevant, at least not to agents that are currently realisable!

[2] The Newton is a small handheld computer produced by Apple Computer, Inc. It has a touch-sensitive layer laid over a small LCD panel. An inkless stylus is used to "write" on the touch-sensitive layer. The Newton displays pixels on the LCD panel corresponding to those places touched by the stylus - this provides visual feedback on the path traced by the stylus. The Newton then attempts to recognise the traces as alphanumeric characters - i.e., to recognise the handwritten text as letters and numbers. While in general successful, the Newton takes quite a bit of "training" before it can reliably translate its owner's handwriting.

Last modified 23 December 1995, 13:30
© Copyright 1995 Karl Auer