Module 5
The anatomy of lists and their processing

 

 
What does a list look like?
 

Many programming languages include lists as one of their data types. Lists are an invaluable part of artificial intelligence programming languages and the name of the major AI language, LISP, is a contraction of "LISt Processing". Different programming languages use different notations to express lists. Prolog lists look like:
     [m42, m40, m25, a21]
Briefly, it consists of items (called "members" or "elements") inside square brackets separated by commas. In this example, each element is an atom, but Prolog doesn't restrict elements to the atom data type. For instance, lists can be lists of lists:
     [[camel, emu, deer], [tiger, bear, bear]]
or lists of structured objects:
     [road(m42, [j1, j2, j3, j3a]), road(m40, [j16, j15, j14, j13, j12, j11])]
or a mixture of data types:
     [road(m42, [j1, j2, j3, j3a, j4, j5]), road(a42, [twycross]), zoo, 
                [marmoset, tamarin, lemur]]
List can even be empty, in which case they look like:
     []

 
Why use lists?
 

Lists are an easy way of working with unpredictable situations. They are invaluable in situations where it is difficult, impossible or not worthwhile to predict data to be stored and manipulated. In many artificial intelligence programs, the number of partial solutions to a problem changes dynamically during computation. For instance, near the beginning of a session with a patient, a medical diagnosis system might consider a small number of possible illnesses, which expands to a greater number when more symptoms are discovered, and which finally settles down to one or two possible illnesses. A list would be an ideal data structure for such an application.
 
 
Unification and lists

All the examples of unification seen so far have included only predictable, fixed data structures, such as atomics and structured objects. We can unify one list with another:
     | ?- [a, b, c] = [Elem1, Elem2, Elem3].

     Elem1 = a,
     Elem2 = b,
     Elem3 = c ?
and because an uninstantiated variable will unify with anything, we can unify a list with a variable:
     | ?- Var = [a, b, c].

     Var = [a,b,c] ?
As these two examples show, we haven't achieved much. In the first example we had to know exactly how many elements there are in the lists: in the second example we just instantiated Var to the complete list, rather than accessing parts of it.

In order to be able to use unification with lists of unknown length, we must have an extension to the list notation and so to our way of thinking about lists.

Rather than thinking of a list as consisting of a certain number of elements, we can think of it as having a head and a tail. At the simplest, the head is the first element of the list and the tail is the remainder. In Prolog, we use the "list constructor" (written as "|") to denote the head and tail.

     | ?- [a, b, c] = [Head|Tail].

     Head = a,
     Tail = [b,c] ?
The important thing to notice in this example is that the tail of the list is itself a list. In this example, there was one element before the list constructor, but it is possible to have more than one:
     | ?- [a, b, c] = [Head1, Head2|Tail].

     Tail = [c],
     Head1 = a,
     Head2 = b ?
Again, note that the tail of the list is still a list in its own right, even though it only has one element. What happens if we extend this example to include three elements before the list constructor?
     | ?- [a, b, c] = [Hd1, Hd2, Hd3|Tail].

     Hd1 = a,
     Hd2 = b,
     Hd3 = c,
     Tail = [] ?
All three elements are unified with variables before the list constructor, so there is nothing with which Tail can be unified. This is shown by Tail being instantiated to the empty list '[]'.
 
Take time to work through the Self-Test 1
Take time to work through the Self-Test 2
 
How are lists processed?
 

We've already stated that the main reason for using lists is their capability of expressing unpredictable and dynamic situations. However, list processing algorithms must be predictable. This may seem paradoxical, but the situation is retrieved by a neat programming trick.

The most common pattern of list processing divides a list into a head and a tail. Some operation is done on the head, and then the tail is itself divided into a head and a tail and the operation repeated anew. This process can go on until a terminating condition is reached. The most obvious terminating condition is when the list is empty. So, given a list [a, b, c], it would be reduced in four steps:

  1. Head = a, Tail = [b, c]
  2. Head = b, Tail = [c]
  3. Head = c, Tail = []
  4. List empty ("[]").
The processing of this list can be drawn as:

The shape of this tree suggests that a procedure for working through the list would have the same pattern as a procedure for exploring the network in Modules 3 and 4. That procedure had two rules, of which the first was the terminating condition and the second was the recursive condition.

The procedure for traversing the network was:

     % 1
     route(Start, End) :-
          path(Start, End).
     % 2
     route(Start, End) :-
          path(Start, Via),
          route(Via, End).
The first thing we need to do is to write down the form of the head of the rules. The first rule will have the form:
     % 1 terminating
     dissect_list([]) :-
The second rule will have the form:
     % 2 recursive
     dissect_list([Head|Tail]) :-
Next we have to write the body of the rules. The body of the first rule will be:
     write([]), nl.
The body of the second rule will be:
     write(Head), 
     nl,
     dissect_list(Tail).
Putting this together, we have the procedure:
     % 1 terminating
     dissect_list([]) :-
          write([]), nl.
     % 2 recursive
     dissect_list([Head|Tail]) :-
          write(Head), 
          nl,
          dissect_list(Tail).
Take time to work through the Self-Test 3