How to write a theory in CVC4 (old instructions)
This document is meant as a guide to implementing a new theory in CVC4.
Contents
CVC4 architecture
The following diagram shows a high-level description of the flow of information in CVC4.
The SmtEngine acts as the entry point into CVC4 and has references to the TheoryEngine and the PropEngine. The TheoryEngine is a proxy for all the CVC4 theory decision procedures and coordinates the calls to them. The PropEngine handles the interaction with the SatSolver (a wrapper class for MiniSat in the current implementation) and the CnfStream. When a new formula is asserted the formula is preprocessed before being sent to the PropEngine. This consists of staticLearning, calling the theory specific rewriter and then preRegistering it. When checkSat is called presolving is called for each theory before the SatSolver starts solving.
The most important information flow is that in the solving-check loop. As the SatSolver assigns values to then literals check is called for all the theories at various times in the search. It is important to make a distinction between at least two kinds of check based on the effort level: quick check and full effort check. A theory check with a quick check effort level will be called on all asserted clauses of size 1 and at various times during the search. It is a system-wide invariant that check with full effort will be called once before returning "sat". In this case, if the current assignment is unsatisfiable the theory check must either return a conflict or add a lemma.
When the SatSolver does propagation it does both boolean propagation and theory propagation. If the SatSolver reaches a conflict and it needs to backtrack, it might require an explanation for a theory propagated literal i.e. a conjunction of literals that have been assigned before the propagated literal that caused the propagation. This is done lazily through a call to theory propagate.
SmtEngine
- preprocess - calls TheoryEngine staticLearning() and then TheoryEngine preprocess on the learned facts
- check - calls PropEngine checkSat() which calls theoryEngine presolve() and then starts the SAT solver which in turn will call theoryCheck()
TheoryEngine
- check, propagate - calls check/propagate on all registered theories
- presolve - calls presolve for all theories; stops if there is a conflict
- preprocess - first removes ITEs and then rewrites the node by calling the specific theory rewriter. Then theory preregister is being called on all the subterms after their children have all been registered. It is also called in the TheoryEngine when a lemma is added and when a formula has been asserted.
PropEngine
Adding a new theory
parser
todo
config and make
todo
types and kinds
todo
Theory API & Invariants
Each theory has to be a subclass of the Theory class defined in src/theory/theory.h. A theory has a Context dependent queue of Nodes (cannot be TNodes because some atoms such as equalities can be send across theories without being stored in a map) called d_facts
that consists of the atoms asserted in the theory.
Methods implemented in the base class
The following methods are implemented in the base class:
- Theory - the base class constructor must be overwritten by a constructor of this form
TheoryX(context::Context* c, OutputChannel& out)
.
- get - returns the next atom the d_facts queue; theory registration is called in reverse topological order on all unregistered subterms of the atom
- assertFact - adds a new term to the queue of facts
- shutdown - called by the TheoryEngine right before destruction. It is important because of ordering issues between PropEngine and Theory. If you overload this method you must make an explicit call here to
this->Theory::shutdown()
Methods implemented by the derived class
The following should be implemented by the derived classes:
- preregisterTerm - is done only once for a node, called during preprocess in TheoryEngine. Preprocessing is done on all input formulas and on the new formulas resulting from adding new lemmas.
- registerTerm - term registration is called when get() is called and a new fact is pulled of the facts queue. This should be done in a "context escape" mode, that is at context level 0 it must not modify conext-dependent objects it has not created itself.
- check - checks the current assigment's consistency. The most important check() invariant is that when check is called with full effort and the current assignment is unsatisfiable it must either return a conflict or force the solver to split by adding a lemma. This is sound because the theory check will be called again with
FULL_EFFORT
; it is a system wide invariant that before returning "sat" all the theories' check() will be called with full effort. The returned conflict must consist only from atoms that have literals in the SAT solver. If check() is called with an effort level less thanFULL_EFFORT
it is left to the latitude of the theory implementer to do whatever they want. However not doing anything and only pulling facts off the queue when full effort check() is called is a bad idea, since we are in a leaf of the search and backtracking will undo all the work.
- propagate - propagate new literal assignments in the current context. Note that it can only propagate theory atoms, or negations of atoms, that already have a literal in the SAT solver. An invariant of propagate() is that it should never propagate a literal that is already assigned to true or false. Should call the output channel
d_out->propagate()
.
- explain - return an explanation for the argument Node n, consisting of a conjunction of facts that lead to the literal being propagated. Because in CVC4 explanations are done lazily this is decoupled from propagate(). An invariant of explain is that the decision level of the atoms in the explanation has to be less than or equal than of n. If they are on the same decision level, special care must be taken to make sure the literals in the explanation have not been propagated after n t hus yielding a circular arguemnt. Report this explanation to the output channel's explain().
- staticLearning - called by the SmtEngine during preprocess() via the TheoryEngine. This is not done to a fixpoint (i.e. re-statically-learning until no new facts are learned) but by iterating once through all the theories in the theory table. staticLearning() receives as a parameter a node consisting of all the facts in the input, before being converted to CNF. It must return in the AND-node-builder learned facts that must be true at decision level 0.
- presolve - called exactly one time per user check-sat, after preregistration, rewriting and boolean propagation but the notified theory has not yet had its check() or propagate() methods called. A theory may empty its assertFact() queue using get(), it can raise conflicts, add lemmas and propagate literals during presolve.
- addSharedTerm - TBA
- notifyEq - called by the shared term manager when a shared term becomes equal to another shared term
Theory rewriting
TODO
Other things you need to know
Context dependent structures
Nodes and TNodes
Attributes
In CVC4 one can define custom attributes for nodes. The following shows how to define an attribute called MyAttribute of type bool:
/** Tag for the MyAttribute flag on Nodes */ struct MyAttributeTag {}; /** The MyAttribute attribute */ typedef CVC4::expr::Attribute<MyAttributeTag, bool> MyAttribute;
[mdeters@adric ~]$ cd cvc4 [mdeters@adric cvc4]$ svn info Path: . URL: https://subversive.cims.nyu.edu/cvc4/cvc4/trunk Repository Root: https://subversive.cims.nyu.edu/cvc4 Repository UUID: 615e1583-3794-4256-8e52-04c157d34929 Revision: 59 Node Kind: directory Schedule: normal Last Changed Author: barrett Last Changed Rev: 59 Last Changed Date: 2009-12-15 13:20:31 -0500 (Tue, 15 Dec 2009) [mdeters@adric cvc4]$