Declarative Logic Programming. Michael Kifer
Читать онлайн книгу. The first goal here can be matched to the fact cited(dsw, dm)
, yielding the goal
Matching this subgoal to the first rule, and then matching the result to the fact cited(tt, dsw)
gives the additional query answer
There are some choices in the top-down approach, in terms of the order that goals and rules are considered. The standard for logic-programming languages is to proceed depth first, always working on the first subgoal, and trying rules in order. However, there are other strategies, such as choosing the subgoal to work on based on the number of bound or unbound variables, proceeding breadth first.
1.5.2.1 Memoing (Tabling) Approaches
One big problem with top-down computation is that it can lead to an infinite recursion because a subgoal may be defined in such a way that it depends on itself. Consider the subgoal influenced(dsw, X)
above. If we match it against the second (recursive) rule for influenced
and match the first subgoal of that rule with the fact cited(tjg, dm)
, we get the subgoal
When we use the second rule on this new goal, we can match its first subgoal in the body of that rule with cited(dsw, tjg)
, we get the already seen subgoal
Thus, a naïve top-down strategy would lead to an infinite loop.
Resolving a subgoal repeatedly as it arises in different contexts can obviously lead to inefficiency in evaluation, not to mention infinite recursion. Unlike Prolog, where it is possible to have an unbounded sequence of distinct subgoals, Datalog queries with finite EDBs can lead only a finite number of different subgoals, up to renaming of variables. We can avoid redundant work and endless loops if we keep track of prior subgoals and their answers [Warren 1992]. We can then avoid a recursive call to a previously seen subgoal (or a more specific version of a previous subgoal). Furthermore, if we record previous answers to a given subgoal, then, if we encounter the subgoal again, we can look up the answers already found rather than computing them again. This strategy is an example of memoization, or tabling, which can be viewed as a variant of dynamic programming [Warren 1992].
Under tabled evaluation, every encountered subgoal is remembered in a table of subgoals, and every answer computed for a subgoal is remembered in the answerstable associated with this subgoal. No answer is added to the table of a subgoal more than once. Whenever a subgoal is encountered during evaluation, it is checked against the table to see if it has been previously encountered and is thus already evaluated or is in the process of being evaluated. If it is in the table, then it is not re-evaluated, but instead its answers from the table are returned as answers to the current subgoal invocation. It may happen that not all the answers for a subgoal (or even none of the answers) are in the table when it is subsequently encountered, so the computation must be able to suspend and resume later, when answers show up in the table for this subgoal.
When a goal is looked up in the table to see if it has been previously called, there are two options: one can look for a goal that is the same as the current goal up to variable renaming; this option is called variant tabling. Alternatively, one can look for a more general goal that subsumes the current goal.18 In this case, all the answers for the subsumed goal will be answers to the subsuming goal, so a subsumed goal can get its answers from the table of the subsuming goal. Using subsuming calls is known as subsumptive tabling. Our examples below assume variant tabling.
Consider again the definition of the influenced
relation, but with the instance of cited
shown below:
We want to pose a query to find all the colleagues that dm
influences. In Prolog, this program goes into an infinite loop, repeatedly generating dsw
and dm
, because Prolog’s evaluation strategy tries to enumerate all the paths in the graph and there are infinitely many of them.19
Under tabled evaluation, subgoal calls and their corresponding answers are remembered and re-used. Answers are returned only once and calls are made only once, with subsequent calls using the answers returned by the initial call. Consider the following trace of tabled evaluation of the query ?– influenced(dm, COLL).
1 influenced(dm, COLL)
The non-indented lines indicate additions to the table: either a new goal or a new answer to an existing goal. The posed goals are simply added to the table as is, and returned answers are prefixed by “ans:
”. In lines 1–5, influenced(dm, COLL)
is called, which in turn calls cited(COLL, dm)
and returns the answer dsw
; the new goal (on line 2) and its new answer (on line 5) are then added to the table. On line 10, evaluation of the second rule for influenced(dm, COLL)
begins, with computation continuing as in Prolog (except for the table-update side effects) to line 19. At line 19, another call to influenced(dm, COLL)
is posed. A variant of this goal (in this case, identical to this very goal) is found in the table and so influenced(dm, COLL)
is not actually called; instead the answers in the table for that goal are returned. Specifically, we see on line 20 that the answer of dsw
is returned from the table. This step leads to a new answer from the second clause for the goal influenced(dsw, COLL)
, with COLL = dsw
. This fact is added to the table on line 21. Then, backtracking to the first goal of that second clause returns tjg
as the other answer to the subgoal cited(_h795, dsw)
. This step leads to the new call to influenced(tjg, COLL)
in line 23, which gets added to the table, and the search through line 27 determines that tjg
cited no one. At this point the contents of the table is:
Now we can return answers from the table to suspended goals. Returning the answer dm
to influenced(dsw, COLL)
results in the new answer dm
to the goal influenced(dm, COLL)
, which gets added to the table in line 29. Similarly for tjg
in lines 30 and 31. Lines 32–35 show other answers being returned to calls in the table, but they generate only answers that are already in the table, so they are simply no-ops. At this point all answers have been returned to all goals, and the computation is complete. The table now contains all the answers to the posed goal (and all generated subgoals).
Consider now a different, logically equivalent, way of writing the influenced
rules:
No experienced Prolog programmer would ever try to write the transitive closure in this form, since the immediate recursion of the second clause would always result in an infinite loop. However, with tabling, such rules are easily handled and, in fact, this left-recursive form of transitive closure is more efficient!
Consider the following trace of a tabled evaluation of this left-recursive form for transitive closure with the same facts as before:
Here, again, lines 1–6 are the same as in Prolog, with the side effects that update the goal–answer tables. At line 7 we see the recursive call to influenced(dm,