The two objectives of this paper are 1) to give a large and varied problem set, complete clause sets for use in testing automated theorem-proving programs and 2) the presentation of a number of experiments with an existing program under a variety of conditions.
The problems range from quite trivial to quite difficult, including some for which a refutation was not obtained. The statements of a problem without the full list of corresponding clauses would be of only cursory value when viewed from either of the objectives of this paper. The problems are taken from the usually studied algebraic systems, Tarskian geometry, Henkin models, and from Wang's suggested test cases, among others. Among those included herein, the following are of added interest because of their difficulty, especially from the viewpoint of automated theorem proving.
1) In group theory x3 = e implies the cummutator ((x,y),y) = e. The proof is rather longer than that of the usually cited test problems for theorem-proving programs. The problem is even somewhat challenging for the individual unaided by the computer.
2) Using the Tarski axioms for plane geometry, the relation of betweenness can be shown to be symmetric. This axiom set is quite nonintuitive. The problems taken from this field provide an interesting test for programs because, among other considerations, they use six place functions in the corresponding clause sets.
The experiments compare the inference rules of UR-resolution, hyperresolution, paramodulation, taken separately, and in combination. Presented herein are examples employing the set of support strategy, various complexity schemes (C-sets), inclusion of lemmas, various demodulation approaches, and various clause representations.
The program was developed at Northern Illinois University and the experiments are from its use at the Argonne National Laboratory on an IBM 370/195. Time for the experiments, number of clauses generated, number of clauses retained, percentage of successful unifications, and some of the parameters governing each experiment are given together with the actual set of clauses employed therein.