How to write a theory in CVC4 (old instructions)

From CVC4
Jump to: navigation, search

This document is meant as a guide to implementing a new theory in CVC4.

NOTE these are old instructions

It's likely that the instructions below are partially incorrect. The new version is here.

CVC4 architecture

The following diagram shows a high-level description of the flow of information in CVC4.

TheoryInterface.jpg

The SmtEngine acts as the entry point into CVC4 and has references to the TheoryEngine and the PropEngine. The TheoryEngine is a proxy for all the CVC4 theory decision procedures and coordinates the calls to them. The PropEngine handles the interaction with the SatSolver (a wrapper class for MiniSat in the current implementation) and the CnfStream. When a new formula is asserted the formula is preprocessed before being sent to the PropEngine. This consists of staticLearning, calling the theory specific rewriter and then preRegistering it. When checkSat is called presolving is called for each theory before the SatSolver starts solving.

The most important information flow is that in the solving-check loop. As the SatSolver assigns values to then literals check is called for all the theories at various times in the search. It is important to make a distinction between at least two kinds of check based on the effort level: quick check and full effort check. A theory check with a quick check effort level will be called on all asserted clauses of size 1 and at various times during the search. It is a system-wide invariant that check with full effort will be called once before returning "sat". In this case, if the current assignment is unsatisfiable the theory check must either return a conflict or add a lemma.

When the SatSolver does propagation it does both boolean propagation and theory propagation. If the SatSolver reaches a conflict and it needs to backtrack, it might require an explanation for a theory propagated literal i.e. a conjunction of literals that have been assigned before the propagated literal that caused the propagation. This is done lazily through a call to theory propagate.

SmtEngine

  • preprocess - calls TheoryEngine staticLearning() and then TheoryEngine preprocess on the learned facts
  • check - calls PropEngine checkSat() which calls theoryEngine presolve() and then starts the SAT solver which in turn will call theoryCheck()

TheoryEngine

  • check, propagate - calls check/propagate on all registered theories
  • presolve - calls presolve for all theories; stops if there is a conflict
  • preprocess - first removes ITEs and then rewrites the node by calling the specific theory rewriter. Then theory preregister is being called on all the subterms after their children have all been registered. It is also called in the TheoryEngine when a lemma is added and when a formula has been asserted.

PropEngine

Adding a new theory

parser

todo

config and make

todo

types and kinds

todo

Theory API & Invariants

Each theory has to be a subclass of the Theory class defined in src/theory/theory.h. A theory has a Context dependent queue of Nodes (cannot be TNodes because some atoms such as equalities can be send across theories without being stored in a map) called d_facts that consists of the atoms asserted in the theory.

Methods implemented in the base class

The following methods are implemented in the base class:

  • Theory - the base class constructor must be overwritten by a constructor of this form TheoryX(context::Context* c, OutputChannel& out) .
  • get - returns the next atom the d_facts queue; theory registration is called in reverse topological order on all unregistered subterms of the atom
  • assertFact - adds a new term to the queue of facts
  • shutdown - called by the TheoryEngine right before destruction. It is important because of ordering issues between PropEngine and Theory. If you overload this method you must make an explicit call here to this->Theory::shutdown()

Methods implemented by the derived class

The following should be implemented by the derived classes:

  • preregisterTerm - is done only once for a node, called during preprocess in TheoryEngine. Preprocessing is done on all input formulas and on the new formulas resulting from adding new lemmas.
  • registerTerm - term registration is called when get() is called and a new fact is pulled of the facts queue. This should be done in a "context escape" mode, that is at context level 0 it must not modify conext-dependent objects it has not created itself.
  • check - checks the current assigment's consistency. The most important check() invariant is that when check is called with full effort and the current assignment is unsatisfiable it must either return a conflict or force the solver to split by adding a lemma. This is sound because the theory check will be called again with FULL_EFFORT; it is a system wide invariant that before returning "sat" all the theories' check() will be called with full effort. The returned conflict must consist only from atoms that have literals in the SAT solver. If check() is called with an effort level less than FULL_EFFORT it is left to the latitude of the theory implementer to do whatever they want. However not doing anything and only pulling facts off the queue when full effort check() is called is a bad idea, since we are in a leaf of the search and backtracking will undo all the work.
  • propagate - propagate new literal assignments in the current context. Note that it can only propagate theory atoms, or negations of atoms, that already have a literal in the SAT solver. An invariant of propagate() is that it should never propagate a literal that is already assigned to true or false. Should call the output channel d_out->propagate().
  • explain - return an explanation for the argument Node n, consisting of a conjunction of facts that lead to the literal being propagated. Because in CVC4 explanations are done lazily this is decoupled from propagate(). An invariant of explain is that the decision level of the atoms in the explanation has to be less than or equal than of n. If they are on the same decision level, special care must be taken to make sure the literals in the explanation have not been propagated after n t hus yielding a circular arguemnt. Report this explanation to the output channel's explain().
  • staticLearning - called by the SmtEngine during preprocess() via the TheoryEngine. This is not done to a fixpoint (i.e. re-statically-learning until no new facts are learned) but by iterating once through all the theories in the theory table. staticLearning() receives as a parameter a node consisting of all the facts in the input, before being converted to CNF. It must return in the AND-node-builder learned facts that must be true at decision level 0.
  • presolve - called exactly one time per user check-sat, after preregistration, rewriting and boolean propagation but the notified theory has not yet had its check() or propagate() methods called. A theory may empty its assertFact() queue using get(), it can raise conflicts, add lemmas and propagate literals during presolve.
  • addSharedTerm - TBA
  • notifyEq - called by the shared term manager when a shared term becomes equal to another shared term

Theory rewriting

TODO

Other things you need to know

Context dependent structures

Nodes and TNodes

Attributes

In CVC4 one can define custom attributes for nodes. The following shows how to define an attribute called MyAttribute of type bool:

  
  /** Tag for the MyAttribute flag on Nodes */ 
  struct MyAttributeTag {}; 
  /** The MyAttribute attribute */ 
  typedef CVC4::expr::Attribute<MyAttributeTag, bool> MyAttribute; 
  

One can also define a context dependent attribute as:

  
  /** Tag for the MyAttribute flag on Nodes */ 
  struct MyAttributeTag {}; 
  /** The MyAttribute context dependent attribute */ 
  typedef CVC4::expr::CDAttribute<MyAttributeTag, bool> MyAttribute; 
  

Statistics

The StatisticsRegistry keeps track of all the statistics and prints them out. There are various kinds of statistics including average, max and timers. Satistics must be registered with the statistics registry in the constructor of the class in which they are declared.

Here is an example of declaring some simple statistics:

 /** int stat */
  IntStat d_myIntStat;
 /** average stat */
 AverageStat d_myAvgStat;
 /** timer statistic */
 TimerStat d_myTimerStat;
 

They must be registered with the StatisticsRegistry in the constructor of the class they are declared in:

 /** in class constructor */
 MyClass(): 
     d_myIntStat("MyClass::myIntStat", 0),
     d_myAvgtStat("MyClass::myAvgStat", 0),
     d_myTimerStat("MyClass::myTimerStat") {
   StatisticsRegistry::registerStat(&d_myIntStat);
   StatisticsRegistry::registerStat(&d_myAvgStat);
   StatisticsRegistry::registerStat(&d_myTimerStat);
   }
 

To use the statistics:

 ++d_myIntStat; 
 d_myIntStat+=3;
 d_myIntStat.setData(42);
 
 d_myAvgStat.addEntry(34);
  
 

To use the TimerStat to measure the time spend in a function just add the following line at the beginning of the function:

 TimerStat::CodeTimer codeTimer(d_myTimerStat);
 

And finally they must be unregistered in the destructor:

 /** in the class destructor */
 ~MyClass() {
   StatisticsRegistry::unregisterStat(&d_myIntStat);
   StatisticsRegistry::unregisterStat(&d_myAvgStat);
   StatisticsRegistry::unregisterStat(&d_myTimerStat);
   }