To compute the shortest path between two points in a graph, we first
define a HiLog predicate short_path(Source,Target) that, given two
nodes, returns short paths from the first to the second. There may be
several short paths between two nodes, but we can be sure that one of
them is the shortest path:
% There's a short path if there's an edge,
short_path(X,Y)(D) :- edge(X,Y,D).
% or if there is a short path to a predecessor and then an edge.
short_path(X,Y)(D) :-
    bagMin(short_path(X,Z),D1),
    edge(Z,Y,D2),
    D is D1 + D2.

The first clause says that there is a short path from X to Y of length
D if there is an edge from X to Y with weight D. The second clause says
there is a short path from X to Y of length D if we take the minimum of
the short paths from X to a predecessor (Z) of Y and get D by adding
the distance along the edge from the predecessor to Y.
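To make the two clauses concrete, consider a small hypothetical graph
(the edge/3 facts below are illustrative and not part of the original
program):

edge(a,b,3).    % a --3--> b
edge(b,c,1).    % b --1--> c
edge(a,c,5).    % a --5--> c

Here the first clause derives short_path(a,c)(5) from the direct edge,
while the second clause derives short_path(a,c)(4): the minimum short
path from a to the predecessor b is 3, and the edge from b to c adds 1.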
Now to get the shortest path, we simply take the shortest of the short paths:
% The shortest path is the minimum of the short paths
shortest_path(X,Y,D) :- bagMin(short_path(X,Y),D).
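On the hypothetical graph above, the shortest path from a to c is then
obtained as follows (the answer shown is what the program computes on
those illustrative facts):

?- shortest_path(a,c,D).
% D = 4, i.e. the path a -> b -> c of length 3 + 1,
% rather than the direct edge of length 5.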
This program in fact works for cyclic graphs, as long as all loops
have nonnegative distance. To see why it works, we must look at it
more closely. Normally we think of computing an aggregation by
creating all the elements in the bag, and then performing the
aggregation on the entire bag. However, doing that here, with a
cyclic graph, would result in a bag with infinitely many elements,
since there are infinitely many different paths through a cyclic
graph. It is clear that we can't construct and test every element in
a bag of infinitely many elements. So bagMin must return an answer
before it has seen all the elements. Notice that if a graph has a
self-loop, say from node a to node a, then a D such that
short_path(X,a)(D) holds is defined in terms of the minimum of a bag
that contains D itself. This turns out to
be well-defined, because the minimum operator is monotonic. It works
computationally because in the case of recursive definitions, the
bagMin
may return an answer before it has seen all the answers.
At any point it returns the best one it has seen so far: if another
one comes along that is better, it returns that one; if another comes
along that is no better, it just ignores it, and fails back to find
another.
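To make the self-loop case concrete, consider a small hypothetical
graph in which node a carries a loop of nonnegative weight (again,
illustrative facts only):

edge(s,a,2).    % s --2--> a
edge(a,a,1).    % self-loop at a

The first clause contributes short_path(s,a)(2). The second clause
contributes short_path(s,a)(3), obtained by taking the minimum of the
very bag it is adding to (currently 2) and adding the loop weight 1.
Because the loop has nonnegative weight, these recursive contributions
can never undercut the current minimum, so bagMin settles on 2 and
shortest_path(s,a,2) holds.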
So the order in which answers are generated can affect how much
computation is done. If poor answers are returned first and many paths
are computed using those poor answers, then all that work is wasted and
must be redone with the improved answers. If, on the other hand, the
best answer is returned first, much less total computation is needed.
So the complexity of this routine depends on the scheduling strategy of
the underlying engine. We will look at these issues in more detail
later.