Developer's Guide

From CVC4
Revision as of 10:19, 24 October 2012 by Taking (Talk | contribs) (Attributes)

Jump to: navigation, search

In addition to this wiki, we also have a developer mailing list and Bugzilla database for this version of CVC. CVC3 resources have been kept separate, at this time, because of the different needs of CVC4 during active development.

Contents

Source tree layout

  • config - this directory holds m4 macro processor files (autoconf code) as well as bits of the autotools build system that go out as part of the distribution to smooth over platform differences.
  • contrib - this directory includes maintainer scripts and other things that aren't directly part of CVC4
  • doc - documentation
  • src/context - context manager, context-dependent object/collection sources
  • src/expr - the expression package
  • src/include - systemwide includes
  • src/lib - replacement functions needed on some platforms (which ones are needed is determined by configure)
  • src/main - source for the main cvc4 binary
  • src/parser - parsers for supported CVC4 input formats
  • src/prop - propositional parts of CVC4; these include imported minisat sources and the PropEngine
  • src/smt - SmtEngine
  • src/theory - the TheoryEngine and all theory solvers
  • src/util - utility classes, debugging macros, the Exception class, etc.
  • test - CxxTest unit tests (invoked by make check)

Building CVC4

Dependencies

The following tools and libraries are required to build CVC4 (versions given are minimums; more recent releases are probably OK):

All of the above packages are available in recent releases of Fedora and in Debian 6.0 (squeeze). All current releases of Ubuntu have out-of-date packages for ANTLR3 and libantlr3c; this should be fixed in Ubuntu 11.04 (natty). Red Hat Enterprise Linux 5 has out-of-date autotools and ANTLR packages; the autotools packages should be fixed in RHEL 6 (don't know about ANTLR3 packages yet).

Building

If all of the dependencies are installed, you should be able to build CVC4 (from a subversion checkout) by running

   ./autogen.sh
   ./configure --with-antlr-dir=$LIBANTLR3_INSTALL_DIR ANTLR=$ANTLR3_DIR/antlr3"
   make

, where $LIBANTLR3_INSTALL_DIR and $ANTLR3_DIR/antlr3 have been explained in ANTLR3. The first step is not necessary if you have a distribution tarball (such as a release or a nightly build), as the configure scripts have already been generated in that case. In the rest of this developer's guide below, it's generally assumed you're working with a Subversion checkout rather than a distributed source tarball. Sometimes (as when updating the Makefile.am files, or adding source directories, you'll need to run "autoreconf" instead of "autogen.sh" if you're working from a source tarball.

Building on Mac OS X

Building CVC4 on Mac OS X is exactly the same as building on any Linux machine. However, there are some gotchas:

  • The file system typically used on Mac OS X does not distinguish between uppercase and lowercase file names; this can sometimes cause problems.
  • The CVC4 build and test scripts may not function well in directory paths with embedded spaces; if this happens, it is a bug, but as a workaround, you can put the CVC4 sources in a directory path without spaces.
  • An OS X specific frustration is that by default CVC4's build system has libtool generate a wrapper executable for ./builds/bin/cvc4 (For more information see [1]). This can make it quite hard to debug. To get around this, you can pass "--enable-static-binary" at configure time.
  • You may need to install several packages via the MacPorts project (e.g., gmp); these are installed under /opt/local, which means you'll have to tell CVC4's configure script how to find the #includes and libraries (see below)

On Mac OS X, several directory paths are nonstandard. For starters, gmp, swig and cln are installed (via MacPorts) in /opt/local. Also, the Java development files on Mac OS are installed in a strange place. You may end up with a configure line like the following:

./configure 
    --prefix=/Users/Wei/Workspace/target/cvc4/ \ 
    CPPFLAGS=-I/opt/local/include \ 
    LDFLAGS=-L/opt/local/lib \
    JAVA_CPPFLAGS=-I/System/Library/Frameworks/JavaVM.framework/Headers \ 
    SWIG=/Users/Wei/Workspace/target/swig/bin/swig \ 
    --with-antlr-dir=/Users/Wei/Workspace/target/antlr3c \ 
    ANTLR=/Users/Wei/Workspace/target/antlr/antlr3 \
    --enable-language-bindings=java \
    --enable-static-binary

If you find yourself building CVC4 a lot, of course, you can set these variables in your environment; configure will get them from there, so you don't have to enter them each time.

Building language bindings

The language bindings (for Java, python, perl, ocaml, etc...) each have their own requirements, and the configure script tries to figure out if you have (and if it can find) what's necessary. Often, the configure script needs some help finding appropriate headers. Each language has it's own CPPFLAGS and LDFLAGS variables you can set; they are used only when building and linking that language's bindings (e.g., JAVA_CPPFLAGS, JAVA_LDFLAGS, PYTHON_CPPFLAGS, etc.).

Swig 2.0 is required for building the language bindings, and the latest version (2.0.4 at the time of writing) is strongly preferred.

Using CVC4's Subversion source control repository

CVC4 keeps a source control repository using Subversion, allowing us to track changes, effectively branch and merge, etc.

Accessing CVC4's repository

Access to CVC4's Subversion repository is restricted; you must have an AcSys-group CIMS account, or apply to us for outside collaborator access to the repository. Once you have access, you can:

 svn --username $USERNAME checkout https://subversive.cims.nyu.edu/cvc4/cvc4/trunk cvc4

...where $USERNAME is your CIMS username (or the username you were given as an outside collaborator). If your username on the Windows/UNIX machine you're coming from is the same, you can leave out the --username $USERNAME part.

The above checks out /cvc4/cvc4/trunk to your local directory cvc4. This second argument (your local working directory for CVC4 development), you can of course call whatever you like. The first is the path in the repository. The initial /cvc4 is the base of our repository (other groups' repositories are hosted on the same server). The second /cvc4 in the path is the product (since we might decide to keep other, CVC4-related projects in the CVC4 repository even if they aren't directly part of CVC4. This includes things like useful benchmark suites (that aren't part of the source tree proper), benchmark-processing utilities, the nightly build system scripts, the webpage, etc. These things may not belong directly in the source tree, but are still worth keeping in the CVC4 repository. The third component of the path, /trunk, is the branch of CVC4 you're requesting. In Subversion terminology, trunk is the head of development: the most recent version of CVC4 that's been committed. The trunk is where most of our CVC4 mainline development will take place, for now. We may occasionally branch (see below) to develop and test new or experimental features, but generally these branches will be merged back into the trunk.

Once you've checked out a working directory, some helpful commands are:

  • svn update - updates your working directory to the most recent version, merging updates into your edits)
  • svn status - tells you what you've changed
  • svn diff - gives you a diff of what you've changed

Repository access from Eclipse

From Eclipse, you can go to File -> New -> Project... and then choose Project from SVN under the SVN category. Use the repository location https://subversive.cims.nyu.edu/cvc4/cvc4/trunk and click Finish. Your Eclipse CVC4 project is then associated to the repository and you can use the Subversion facilities of Eclipse to perform the operations in this section instead of the command line.

Browsing the repository

You can also browse CVC4 sources without checking them out from Subversion. Simply point your browser to:

 https://subversive.cims.nyu.edu/cvc4/cvc4/trunk

Your browser may complain about the security certificate. Use your CIMS login and password, or the outside collaborator credentials we've given you.

Branching and merging

Branches live in the repository under /cvc4/cvc4/branches. You can see a list of all branches available (that haven't been deleted) by using svn ls:

 svn --username $USERNAME ls https://subversive.cims.nyu.edu/cvc4/cvc4/branches

To check out a branch, you check out the branch path instead of the trunk:

 svn --username $USERNAME checkout https://subversive.cims.nyu.edu/cvc4/cvc4/branches/experimental-quantifier-instantiation cvc4.quant

To get information about where a working directory came from, use svn info

 [mdeters@adric ~]$ cd cvc4
 [mdeters@adric cvc4]$ svn info
 Path: .
 URL: https://subversive.cims.nyu.edu/cvc4/cvc4/trunk
 Repository Root: https://subversive.cims.nyu.edu/cvc4
 Repository UUID: 615e1583-3794-4256-8e52-04c157d34929
 Revision: 59
 Node Kind: directory
 Schedule: normal
 Last Changed Author: barrett
 Last Changed Rev: 59
 Last Changed Date: 2009-12-15 13:20:31 -0500 (Tue, 15 Dec 2009)
 
 [mdeters@adric cvc4]$
 

Creating a branch

To create a branch, you copy the source tree to another place:

 svn --username $USERNAME cp https://subversive.cims.nyu.edu/cvc4/cvc4/trunk https://subversive.cims.nyu.edu/cvc4/cvc4/branches/my-branch

...though please use a more descriptive name for your branch than this! Then proceed by checking out the branch:

 svn --username $USERNAME checkout https://subversive.cims.nyu.edu/cvc4/cvc4/branches/my-branch cvc4.mybranch

Before doing lots of development on it, please document why your branch exists in a file named README.<branchname> in the top-level directory:

 [mdeters@adric ~]$ cd cvc4
 [mdeters@adric cvc4]$ vi README.my-branch
 [mdeters@adric cvc4]$ cat README.my-branch
 This branch is to test the ideas of Splitting On Demand in SMT:
 
   ftp://ftp.cs.uiowa.edu/pub/tinelli/papers/BarNOT-LPAR-06.pdf
 
 -- Morgan Deters <mdeters@cs.nyu.edu>  Tue, 15 Dec 2009 17:31:01 -0500
 [mdeters@adric cvc4]$ svn add README.my-branch
 [mdeters@adric cvc4]$ svn commit
 

This makes it easy to see what each branch exists for (without even checking it out):

 svn --username $USERNAME cat https://subversive.cims.nyu.edu/cvc4/cvc4/branches/my-branch/README.my-branch
 This branch is to test the ideas of Splitting On Demand in SMT:
 
   ftp://ftp.cs.uiowa.edu/pub/tinelli/papers/BarNOT-LPAR-06.pdf
 
 -- Morgan Deters <mdeters@cs.nyu.edu>  Tue, 15 Dec 2009 17:31:01 -0500
 

Of course, the above example created a branch from the trunk. That need not be the case; you might create a branch from another branch (simply copy that branch instead of the trunk).

Merging a branch back to the trunk

Recent versions of Subversion make merging incredibly easy, because for versions of CVC4 with a common ancestral line (which you get from branching) have merge metadata stored regarding their merges. This makes it much, MUCH easier to merge than it used to be under CVS. Let's say you've created a branch my-branch from the CVC4 trunk, like above. First, commit all your changes to my-branch that you want. Then, get a clean copy of the CVC4 trunk:

 [mdeters@adric ~]$ svn co https://subversive.cims.nyu.edu/cvc4/cvc4/trunk cvc4.clean
 [mdeters@adric ~]$ cd cvc4.clean
 [mdeters@adric cvc4.clean]$ svn merge https://subversive.cims.nyu.edu/cvc4/cvc4/branches/my-branch
 

That's all there is to it. Subversion will merge all of the differences between trunk and my-branch since the last merge into the working directory. That's the key part: since the last merge. It keeps track of this information for you. You can inspect the changes, do a test build, and when you're happy with the changes, commit them.

You might occasionally merge the other way, too. If you've been developing along a branch for a long time, you might want to get the newest bugfixes from other parts of CVC4 that have been committed to the trunk. You can merge changes in the trunk to my-branch as well:

 [mdeters@adric ~]$ cd cvc4.my-branch
 [mdeters@adric cvc4.my-branch]$ svn merge https://subversive.cims.nyu.edu/cvc4/cvc4/trunk
 

Naturally, if you don't want to incorporate changes into/from the trunk, but rather another version, then specify that URL instead.

Symbolically tagging a version

Symbolically tagging a branch (for example, a release version) is very similar to branching, but put under /tags instead of /branches. For example:

 svn --username $USERNAME cp https://subversive.cims.nyu.edu/cvc4/cvc4/trunk https://subversive.cims.nyu.edu/cvc4/cvc4/tags/cade2010

Just as with branching, please provide a README file:

 
  [mdeters@adric ~]$ svn --username $USERNAME checkout https://subversive.cims.nyu.edu/cvc4/cvc4/tags/cade2010 cvc4.cade2010
  [mdeters@adric ~]$ cd cvc4.cade2010
  [mdeters@adric cvc4.cade2010]$ vi README.cade2010
  [mdeters@adric cvc4.cade2010]$ cat README.cade2010
  cade2010 version of CVC4
  ------------------------
  This version generated the results published in the CADE 2010 paper.
  
  -- Morgan Deters <mdeters@cs.nyu.edu>  Tue, 15 Dec 2009 17:39:05 -0500
  [mdeters@adric cvc4.cade2010]$ svn add README.cade2010
  [mdeters@adric cvc4.cade2010]$ svn commit
  

Branches versus tags

Use a branch when you plan to do major (and perhaps experimental) editing on the source tree and don't want to screw up the nightly builds, tests, or other developers with a work in progress. Use a tag when no development will occur on the source tree; it's merely a "bookmark" on a historical version of CVC4 in case we need to refer to it later in the repository.

Working with files and directories

Occasionally a class name might change (and its source file will therefore need renaming), or you want to move around files and directories in other ways. In the old days under the CVS version control system, you had to remove the files under the old name/location and add them anew under the new name/location, losing all of the history information. Not so under Subversion. You can mv files and directories just like under UNIX:

 svn mv foo.cpp bar.cpp

You can mkdir:

 svn mkdir newdir

If you already have the directory (and sources in it), you can svn add it:

 svn add newdir

The same goes with new source files:

 svn add new_source_file.cpp

And, also, if you want to remove a file from the repository, you can do so:

 svn rm obsolete_test.h

Note that these commands operate on the working directory. This has two consequences:

  1. You don't need to use --username $USERNAME, because the working directory is already linked to your login credentials (it was stored in the working directory meta-information when you checked out the working copy).
  2. The changes aren't immediate. You still need to use svn commit to commit your changes (and leave a log in the commit history).

Reverting a file in a working directory

Sometimes you make a local edit to a file and decide you want to undo the edit (restoring the last version you retrieved from the repository via a checkout or update). To do that, use svn revert:

 svn revert file.cpp

Reverting a commit

Sometimes you want to revert a commit previously done by you or someone else. Perhaps a bugfix actually caused more problems than it solved, or you want to back out of a refactoring decision that turned out to be a bad idea. Let's say that revision 120 was perfect but a bugfix committed as revision 121 broke something and you want to undo that change. Now the repository is at revision 130. No problem:

[mdeters@adric ~]$ svn co https://subversive.cims.nyu.edu/cvc4/cvc4/trunk cvc4.clean
[mdeters@adric ~]$ cd cvc4.clean
[mdeters@adric cvc4.clean]$ svn merge . -r 121:120

Merge here means to merge the differences between two revisions; it's similar in some respects to the notion of merge between branches, but here there is no branch. Subversion computes a diff between revisions 121 and 120 (here, order is important---we're doing a rollback so the revision number goes down) and merges it into the working directory. The "." is the directory to be merged. Perhaps revision 121 committed a lot of things, and you only want to rollback the changes made to the context manager. No problem:

[mdeters@adric cvc4.clean]$ svn merge src/context -r 121:120

Ignoring files

If you're familiar with CVS, you're probably familiar with .cvsignore files---these are committed to the repository but are special in that they instruct CVS to ignore certain files in the working directory (and not report them as ? when you update). In Subversion, an svn status is similar; it will complain about files that exist but aren't in the repository. You can tell Subversion to ignore some files if they shouldn't ever be committed to the repository by using the property system (specifically the svn:ignore property on directories). "svn proplist" lists the properties on a file/directory, "svn propget" and "svn propset" get and set a specific property value on a file or directory, "svn propdel" deletes a property, and "svn propedit" edits a property (using the current $EDITOR from your environment). For example:

[mdeters@adric ~]$ cd cvc4
[mdeters@adric cvc4]$ EDITOR=vi svn propedit svn:ignore .

...edits the svn:ignore property on the top-level cvc4 source directory with the vi editor.

Changes to properties are versioned and tracked as well; changes must be committed just like a file.

Marking a file as executable in Subversion

Another property is used for the eXecute bit. If you want to make a file executable, you can set the svn:executable property:

 svn propset svn:executable yes filename

Resolving conflicts in a working directory

Sometimes you do an svn update and the update procedure causes a conflict (because you made changes in the same place as someone else did). In such cases, Subversion punts: it doesn't know which change should override the other, or if they need to be combined in some (non-naïve) way.

Open the file with the conflict and edit it to remove the conflicts. If Subversion still thinks it's in conflict (it has a C mark when you run svn status), then you should inform it that the conflict has been resolved:

svn resolved conflicting_file.cpp

For more information

Refer to the Subversion book for more information on Subversion and how to work with repositories.

The CVC4 build system

The build system behind CVC4 is pretty complex. It attempts to:

  1. be cross platform (automake, libtool, autoconf are used)
  2. support a set of standard build profiles (default, production, debug, competition) with standard settings for assertions, debugging symbols, etc.
  3. allow deviating from the standard build profile by overriding its settings
  4. keep separate object/library files and sources (doing ./configure from the source directory creates a build directory and configures it instead)
  5. support multiple builds at once in same source tree (build directory names are based on build profile, e.g., debug or production, and architecture)
  6. support changing Makefile.am automake specifications without re-running a bootstrap script
  7. support partial tree rebuilding from the source directories (Makefile in source directory delegates to "current" build directory)
  8. support partial tree rebuilding from the build directories (recursive make)

Due to these constraints, the build system of CVC4 resembles a recursive-make implementation using automake/autoconf/libtool, except that it's very nonstandard.

Build profiles

The build profiles supported are:

  • production - highly optimized, no assertions, no tracing
  • debug - unoptimized, debug symbols, assertions, tracing
  • default - moderately optimized, assertions, tracing
  • competition - maximally optimized, no assertions, no tracing, muzzled

The default build profile is, not surprisingly, default. That's what you'll get if you just run "./configure". To build a debug build, use:

 $ ./configure debug

and then "make" as usual.

Working with multiple, concurrent build profiles

You can configure multiple types of builds. They are built in separate directories. "make" will always build the last-configured one. If you want to build another (already configured) build, you can use "make CURRENT_BUILD=debug", but this is not "sticky"---the build profile built by default remains the last-configured one. For example:

 $ ./configure debug
 [a debug build is configured under builds/[ARCH]/debug/]
 $ make
 [builds the debug build, puts the binaries under builds/bin/]
 $ ./configure production
 [a production build is configured under builds/[ARCH]/production/]
 $ make
 [builds the production build, puts the binaries under builds/bin]
 $ make CURRENT_BUILD=[ARCH]/debug
 [builds the debug build, puts the binaries under builds/bin]
 $ make
 [builds the production build---it was the last-configured one]

Build profiles can be customized. For example, to make a debug build with support for profiling but without assertions, use:

 $ ./configure debug --enable-profiling --disable-assertions

Note that when you override build profile options in this way, the build is considered distinct. Here, "debug-with-profiling-but-without-assertions" is distinct from a "debug" build:

 $ ./configure debug --enable-profiling --disable-assertions
 [a debug build with profiling and without assertions is configured under builds/[ARCH]/debug-profiling-noassertions/]
 $ make
 [builds the debug-noassertions-profiling build, puts the binaries under builds/bin/]
 $ ./configure debug
 [a debug build (with assertions, no profiling code) is configured under builds/[ARCH]/debug/]
 $ make
 [builds the debug build, puts the binaries under builds/bin]
 $ make CURRENT_BUILD=[ARCH]/debug-noassertions-profiling
 [builds the debug-noassertions-profiling build, puts the binaries under builds/bin]

Determining the type of build from a binary

Sometimes you're left with a cvc4 binary and have no idea what produced it. Use the "--show-config" command-line option to inspect it:

$ builds/bin/cvc4 --show-config
This is a pre-release of CVC4.
Copyright (C) 2009, 2010
  The ACSys Group
  Courant Institute of Mathematical Sciences,
  New York University
  New York, NY  10012  USA

version   : prerelease

debug code: yes
tracing   : yes
muzzled   : no
assertions: yes
coverage  : no
profiling : no

To see if the binary is statically linked, or not, use the ldd command:

$ ldd builds/bin/cvc4
        linux-gate.so.1 =>  (0x006b0000)
        libcvc4parser.so.0 => /home/mdeters/cvc4-cleancheckout/builds/usr/local/lib/libcvc4parser.so.0 (0x00e6d000)
        libcvc4.so.0 => /home/mdeters/cvc4-cleancheckout/builds/usr/local/lib/libcvc4.so.0 (0x00a75000)
        libgmp.so.3 => /usr/lib/libgmp.so.3 (0x00110000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x008d8000)
        libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0x00a11000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00298000)
        libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x002b6000)
        /lib/ld-linux.so.2 (0x00d91000)

This was a dynamically-linked binary, and is dynamically-linked against the cvc4 libraries. If the cvc4 libraries are missing from the list, the binary was statically-linked against the cvc4 libraries (and dynamically-linked against the others). If ldd's output looks like this:

$ ldd /bin/bash-static
        not a dynamic executable

then the cvc4 binary was configured with --enable-static-binary.

Adding source and header files

To add a source file to CVC4, all you need to do is create the file, add the content, then:

$ svn add source_file.cpp

to add it to the repository, and edit the appropriate Makefile.am (usually in that directory, or if not then in the immediate parent directory) to list the file under *_SOURCES for the correct convenience library. When you next build the project, the Makefile.am's new timestamp should trigger make to regenerate the Makefile.in (in the source directory) and the Makefile (in the build directory), remake dependencies for the source, and rebuild what's necessary.

To add a header file to CVC4, create the file, add the content, then:

$ svn add header_file.h

to add it to the repository, and edit the appropriate Makefile.am to list the file under *_SOURCES for the correct convenience library. All the above should still apply.

Sometimes you need to add something to EXTRA_DIST rather than *_SOURCES (for example, if something is conditionally built, depending on the configure script, or a README or some such).

If you have problems getting the Makefile.in/Makefile/dependencies rebuilt, or get a warning from the missing script, re-run autogen.sh in the top-level source directory, re-run configure, and then proceed.

Adding source directories

If you add a source directory to CVC4 (under src/ or test/), make sure it's pristine (that is, just as you want it in the repository), and then:

  1. Add it to SVN:
    $ svn add newdir
  2. Make sure that the Makefile.am in the parent directory lists it in SUBDIRS in the correct order (e.g., if it depends on the build artifacts of another subdirectory, it should be listed after).
  3. Run contrib/addsourcedir path to auto-generate a basic Makefile and Makefile.am. You'll need to edit the Makefile.am, but this gives you a template to start with. Existing files are not overwritten. Alternatively, you can generate these two Makefiles yourself:
    1. Create Makefile.am in the new directory (using as a template another Makefile.am).
    2. Set up Makefile.am to build a convenience libtool library (noinst_LTLIBRARIES) with an appropriate lib*_la_SOURCES listing the headers and sources in the directory.
    3. Add the new convenience library to the *_LIBADD in the parent directory.
    4. Create a Makefile in the new directory (using other source Makefiles as a template).
  4. Add the new Makefiles to SVN.
  5. You may need to re-run autogen.sh in the top-level source directory.

The Makefile's sole purpose is to support typing "make" inside a source subdirectory. It needs to set up a few variables correctly and then include Makefile.subdirs from the top level. It looks like this: (but use a template from another source directory, it might be more up-to-date than this version)

topdir = ../..
srcdir = src/expr
 
include $(topdir)/Makefile.subdir

First, topdir is the relative path to the root of the source tree. srcdir is the relative path from the root to this directory. builddir should have the same definition for every subdirectory; it gives the full relative path to the associated build directory. The include line gets the standard rules for running make from a source subdirectory.

Note also that when you add a convenience library you need to add it also to the unit testing makefile in test/unit/Makefile.am. This is because the testing framework links directly against the convenience libraries to avoid the symbols that are hidden by the linker.

Adding copyrighted sources

The procedure for adding sources from other projects is a bit trickier. CVC4 incorporates source from others in a few different places:

  • autogen.sh, our auto-build-system-generation script
  • doxygen, boost, and TLS macros for autoconf (under config/)
  • MiniSat (under src/prop/minisat)

Naturally, we have other dependencies too, like on CxxTest and on libgmp, but for these we don't need to copy source under our tree; we are simply a client of those tools.

Generally, the steps you should perform when you choose to copy copyrighted sources into CVC4's source tree are:

  1. Does this make sense? Incorporating source from others can be a headache. We have to exclude these sources from our copyright, and include their copyright notice in ours.
  2. Identify legal implications. Is the license used by this software compatible with CVC4? Are we permitted to copy their sources? Are we permitted to modify them in any way? Are we permitted to re-distribute them?
  3. Create a vendor branch in the Subversion repository. Before copying into the CVC4 tree, create a vendor version of the software you're about to import:
    • Import a pristine version into the repository under https://subversive.cims.nyu.edu/cvc4/vendor/$PACKAGE/$VERSION_NUMBER.
    • Copy to "current":
      svn copy https://subversive.cims.nyu.edu/cvc4/vendor/$PACKAGE/$VERSION_NUMBER https://subversive.cims.nyu.edu/cvc4/vendor/$PACKAGE/current
    • Now copy into the cvc4 source tree. You can "copy" into your current working directory with svn checkout, for example:
      svn checkout https://subversive.cims.nyu.edu/cvc4/vendor/$PACKAGE/current src/util/bar
      will copy your package under CVC4's src/util directory as "bar." It is important that you use svn to do this, so there's a record of the link between the vendor branch and the "bar" directory.
  4. Importing another build system. If you're importing sources, you have to decide how to cleanly incorporate the imported project with CVC4. In most cases, for small imported projects, it's easiest to create a Makefile.am in their top-level that is just like our Makefile.am files, and compiles their entire project. But you may need to rely on their configure/make build system if the imported project is more complex, which means that in the future we may need to create configurable subprojects (though it'd be nice to avoid that, if possible). Use your best judgment.
  5. Adding a copyright exemption. You MUST edit the COPYING file at the top-level source directory for CVC4. I am not a lawyer, however, I would say that this is crucial. In COPYING, indicate:
    • Where the copied sources are in the CVC4 tree.
    • What its copyright and (lack of) warranty is. Include all its legal information.

    There are a lot of examples in COPYING already of this; use them as a template.

  6. Exclude from auto-copyright. If you have imported headers or source files into the CVC4 tree, you must edit the contrib/update-copyright.pl script to make sure it doesn't automatically add CVC4's copyright to the top of the file the next time it's run. Add the full (relative) path to it under $excluded_paths ($excluded_directories are names excluded at any level of the directory tree). If you don't know Perl-compatible regular expressions, ask someone.

Checking that your Makefiles are correct

This part of the Automake manual might be useful to people. make distcheck builds CVC4 and checks it ("make check"), then does a "make install" and "make uninstall" to make sure they are working, then builds a tarball ("make dist"), then builds that tarball with configure and make, runs the test suite ("make check") on that, then does another "make dist" from the dist to make sure the distribution can itself make a distribution. Then, it does a "make distclean" and makes sure nothing remains (distclean is supposed to clear out everything from a build, leaving you with a pristine distribution).

Profiling the code

prof, gprof, valgrind, callgrind

To profile CVC4, configure with --enable-profiling. Unfortunately, GProf can't instrument system calls, so it will give misleading results for allocation-heavy programs like CVC4. Instead, you can use the Callgrind tool (a component of Valgrind):

valgrind --tool=callgrind --dump-every-bb=100000000 <executable> <arguments...>

(The --dump-every-bb argument causes Callgrind to produce incremental data periodically throughout the execution. A value of 100,000,000 or higher is recommended to keep the number of data files emitted manageable.) Callgrind will create a master file callgrind.out.<pid> and a number of incremental data files (callgrind.out.<pid>.1, callgrind.out.<pid>.2, etc). Open callgrind.out.<pid> using the KCachegrind GUI tool to visualize the profiling information. (You can do this before the program has terminated. Pressing F5 will cause KCachegrind to load any new incremental data.)

Google performance tools

Recently, Chris Conway recommended Google's performance tools for profiling, and as of revision 1454 (March 14, 2011), Dejan Jovanović added support for it. Here are Dejan's revision notes:

adding support for google performance tools to the build sytem, it can be enabled at configure with
--with-google-perftools

to use it on ubuntu, you need to install packages google-perftools and libgoogle-perftools0

to run the profiling of the heap, you can run it for example with

HEAPPROFILE=/tmp/profile ./builds/bin/cvc4 test/regress/regress0/lemmas/clocksynchro_5clocks.main_invar.base.smt

this will create some files  /tmp/profile* that you can then

to get the pdf of the profile you can then run

google-pprof --pdf ./builds/bin/cvc4 /tmp/profile.0001.heap > profile.pdf

or for other options check http://goog-perftools.sourceforge.net/doc/


If you are having problems with cln or gmp segfaulting, try setting LD_PRELOAD to libtcmalloc. For example:

 LD_PRELOAD="/usr/lib/libtcmalloc.so" CPUPROFILE=profile ./builds/bin/cvc4 foo.smt

Coding guidelines

The following sections attempt to standardize C++ coding and organizational style for the CVC4 project.

Note that many of these standards apply to unit tests under test/ as well, but many do not.

The guiding philosophy in the coding guidelines for CVC4 is to make the code readable and consistent throughout without getting in the way by being overly restrictive. Flexibility is afforded in edge cases where the developer should make his or her best effort to satisfy the guidelines, but not at the expense of making the code less readable or formatted poorly or insensibly.

File and directory names

  • File names are all lowercase. Class FooSolver has its definition in foo_solver.h and its implementation in foo_solver.cpp.
  • File extensions .cpp and .h are generally preferred, with .ypp and .lpp for yacc and lex inputs---an exception is made for headers when other tools have clashing preferences for generated sources; e.g., autotools wants to output parser definitions for the presentation language as pl.hpp.

Preprocessor macros

Preprocessor macros should be avoided when possible.

Preprocessor macro names are ALWAYS_UPPERCASE.

Binary macros

A binary macro is a symbol that has two states. There are two kinds of binary macros:

  • A macro that is either "defined" or "undefined." These macros should have names that are nouns or noun phrases. The should not be defined to have a value—their definition should be empty. For example:
 #define DEBUG

These macros can be tested using the #ifdef directive:

 #ifdef DEBUG
 /* ... */
 #endif /* DEBUG */
  • A macro that has one of the values 0 or 1. These macros should have names that are verbs or verb phrases. For example:
 #define USE_MINISAT 1

These macros should be tested using if statements rather than preprocessor directives:

 if(USE_MINISAT) {
   /* do something with minisat */
 } else {
   /* some non-minisat default behavior */
 }

Block macros

Macros that expand to multiple statements should use the do { ... } while(0). For example, a macro FOO that generates the statements S1; S2; should be defined using

   #define FOO do { S1; S2; } while(0)

(Note that there is no semicolon after while(0).)

Miscellaneous

  • Since macro parameters may be expressions with side effects, they should not be used multiple times in a definition.
  • Remember these aren't hygienic macros! You can capture names:
  #define ADD(c, a, b) { int x = a, y = b; c = x + y; }
  int main() {
    int x = 5;
    // print out the value x + x ^ 2 (should be 30)
    int z;
    ADD(z, x, x * x);
    printf("%d\n", z); // does NOT print 30  :-(
  }

The usual way to combat this problem is to prefix the names in the macro with underscores:

  #define ADD(c, a, b) { int __x = a, __y = b; c = x + y; }

Though it's best to use longer names and perhaps to mix in the name of the macro as well, to avoid any possible clash.

  • Except when you're playing clever/stupid tricks with macro expansion, you should guard against unwitting order-of-operations surprises by parenthesizing arguments when expanded in macro bodies:
  #define ADD(c, a, b) { int __add_x = (a), __add_y = (b); (c) = x + y; }

This approximates (in this case) macro-call-by-value.

  • There is an exception to the ALWAYS_UPPERCASE rule for CVC4::Assert() and friends.

Class names

  • Classes, and type names in general, are in CamelCase. There are exceptions to this rule, however, where lower case names with embedded underscores are acceptable:
    • STL type names are not CamelCase, and when implementing traits and STL-like typedefs (iterators for instance), you should match the STL style as closely as possible:
 class Foo {
   class Item;
   typedef Item*       iterator;
   typedef const Item* const_iterator;
 };
    • for low-level exception types (similar to STL's bad_alloc). However, for derived classes of CVC4::Exception, the names should be CamelCase.
    • Low-level unions and structs may be in a similar lower case form, e.g., low_level_struct.
  • Note that acronyms (SMT, CVC4) inside type names should generally be lowercase after the first letter, such as in CVC4::SmtEngine. This is preferred to SMTEngine or SMT_Engine, but use judgment on a case-by-case basis.

Namespaces

  • Everything is in the CVC4 namespace or a sub-namespace. Generally, the sub-namespace follows the module (and thus the source directory) structure. The name remains uncapitalized, and acronyms are all in lowercase letters (e.g. CVC4::smt, CVC4::main, CVC4::parser). However, certain base types go in CVC4 directly instead of a sub-namespace despite belonging to a module. These are user-visible (such as CVC4::Exception), core functionality classes, and utility classes from src/util/.

Member names

  • Data members are prefixed with d_, and are generally in lowerCamelCase after that. For example:
 class Foo {
   int d_thisIsAnInt;
 };
  • Static members are prefixed with s_:
 class Foo {
   static int s_thisIsAStaticInt;
 };
  • Union elements are prefixed with u_.
  • There is an exception to these rules for low-level struct types that have no member functions and no destructors (and only simple, initializing constructors). For an example see the definition of CVC4::Options in src/smt/options.h.

Class Organization

Declarations of class members should be organized as follows.

  • Members are declared in the following order:
  1. Typedefs and friend declarations
  2. Static members
  3. Data members
  4. Constructors and destructors
  5. Methods
  • public, private, and protected sections may come in any order. Alternation between sections should be minimized as much as possible.

Communicating with the user: the output classes

ALL OUTPUT through Debug and Trace and Notice and Warning functionality please! Very easy:

 #include "util/output.h"
 CVC4::Debug("arith", "foo!"); // prints "foo!" if arith debugging is on
 CVC4::Warning("Equivalence classes didn't get merged."); // prints unless user gave -q or in SMT-COMP mode
 CVC4::Notice("Hi, so you don't get bored, here are some statistics: " + stats); // if user gives -v
 CVC4::Chat("I feel like doing some theory propagation.");// if user gives -vv
 CVC4::Trace("arith", "<ralph>I'm now calling a function!</ralph>");// if user gives -vvv

Chatting in particular should be useful for tuning/strategy things. Parts of the code that aren't conditional on particular settings in CVC4 should avoid Chat and use Trace or Debug. Tracing is tagged with the module to which it's relevant. The above add a \n for you. If you don't want that, you can use a printf() interface:

 CVC4::Debug::printf("arith", "foo!"); // no line termination

or the stream interface:

 CVC4::Debug("arith") << "foo!"; // also no line termination

For non-debug builds, CVC4::Debug is a no-op and everything will be inlined away.

Assertions

CVC4 Assertions:

 #include "util/assert.h"
 CVC4::Assert(condition); // asserts condition is true
 CVC4::Assert(condition, "error message on fail"); // asserts condition is true, custom error message

Note that Assert is a macro (in order to collect source line and file and expression-string information from the C preprocessor). A CVC4::AssertionException is thrown if it fails. When assertions are turned off, this is a no-op and everything is inlined away.

The above assertions can be turned off (with --disable-assertions at configure-time). You might want to have an assertion be checked even if assertions are off. For example:

  1. Unreachable code. If you have a section of code that is intended to be unreachable (you have, perhaps, proved on paper that the code should never enter this state), it costs nothing to check the assertion (it's essentially an assert(false)---always fail).
  2. Highly-compromised state. In cases where a wrong answer is unlikely and the assertion check is cheap and infrequently-run, you may consider leaving an assertion in. This is the case also with the above unreachable-code assertion.

For unreachable code, use the Unreachable() macro. It will throw a CVC4::UnreachableCodeException (a subclass of CVC4::AssertionException):

 #include "util/assert.h"
 CVC4::Unreachable(); // flags (supposedly) unreachable code; fails even under --disable-assertions
 CVC4::Unreachable("error message"); // custom error message

Another special case is an unhandled case (for example, a default block in a switch that was designed to handle all cases). It will throw a CVC4::UnhandledCaseException (a subclass of CVC4::UnreachableCodeException):

 #include "util/assert.h"
 CVC4::Unhandled(); // flags an unhandled case; fails even under --disable-assertions
 CVC4::Unhandled(the_case); // same but prints out the_case that's unhandled as well (which should be whatever is in the switch)
 CVC4::Unhandled("error message"); // custom error message
 CVC4::Unhandled(the_case, "error message"); // custom error message with the_case that is unhandled

For a strong assertion that's checked even with --disable-assertions, use an AlwaysAssert(). It throws a CVC4::AssertionException just as Assert() does.

 #include "util/assert.h"
 CVC4::AlwaysAssert(condition);
 CVC4::AlwaysAssert(condition, "error message");

Doxygen comments

Exporting library symbols

When built with a GNU toolchain, libcvc4 hides all symbols by default (with -fvisibility-hidden). This is a good way to ensure that the public interface is complete, that no undocumented interface is used by anyone, and it also permits more aggressive optimization of the library. Those symbols intended to be the public interface to CVC4, however, must be explicitly exported.

Fortunately, this isn't difficult. To use a symbol, a user should have a header file anyway. Only a subset of CVC4 headers are installed with make install, and these all live in src/include/ and comprise the complete public interface. Public symbols are marked with CVC4_PUBLIC:

 class CVC4_PUBLIC Expr {
   /* ... */
 public:
   Expr();
 };

Note here that types and private members need not be marked CVC4_EXPORT.

Now. What do you care? Well, if you ever forward-declare something that's public, you have to mark it PUBLIC in the forward-declaration too. (Is this true??)

Exception classes should all be marked CVC4_PUBLIC, because they might end up in user code (external to libcvc4) and their type_info information must be available to the catch block in user code. If you don't think your exception can (nor should by design) exit the library and end up in user code, then first double-check this, then triple-check it, then document that fact clearly at call site, the catch site (inside the library), and at the exception class definition.

See http://gcc.gnu.org/wiki/Visibility for more information on visibility.

Source file layout

Class header files

  • A class named AdjectiveNoun has its definition in adjective_noun.h. Each header file is guarded by a preprocessor macro to prevent duplicate declarations. For example, in adjective_noun.h:
   #ifndef __CVC4__ADJECTIVE_NOUN_H
   #define __CVC4__ADJECTIVE_NOUN_H
   
   ... declarations go here ...
   
   #endif /* __CVC4__ADJECTIVE_NOUN_H */

Generally all #includes of other headers should go inside the #ifndef guard. There is an exception though (with an #include going above the guard) for circular dependencies and other special circumstances. Such things should be documented in the code. All #includes should generally go at the top of the file, however, for inlined and template implementations following a class definition, sometimes additional files should be included after the class definition but before these member function definitions (again, when breaking circular dependencies).

If the class AdjectiveNoun is in the CVC4 namespace, the macro guard is named __CVC4__ADJECTIVE_NOUN_H. If it is in a sub-namespace Foo, then the macro guard is __CVC4__FOO__ADJECTIVE_NOUN_H. Note the two underscores for each :: and the two prefix underscores. One underscore is used between words in a type or namespace name.

  • Prefer forward-declaration of classes instead of #include'ing their header from another header, wherever this is possible.
  • Public headers (in src/include/), use preprocessor macro guard __CVC4_headername_H, for example, __CVC4_EXPR_H for cvc4_expr.h. Note here the one underscore between CVC4 and the name.
  • Other (non-class) header files should use a similar preprocessor macro guard, based on relevant module and header file name, as class header files do. In all cases, prefix the name of the guard with __CVC4_ and design it to be unlikely that it will ever conflict with any other (future) CVC4 header, whether it might be for internal use only or externally provided to a library user.
  • Everything should be in the CVC4 namespace, or a sub-namespace of CVC4 (based on module).
  • For test classes. Unit test classes (under test/unit/) are implemented as header files and do not need to be in a namespace, nor guarded by a preprocessor macro. The idea is to keep them as simple and clean as possible, so the tests can be easily read. They are special-purpose and compiled and linked individually, so they don't need these protections.

Class implementation files

  • A class named ThisIsAClass should have its implementation under src/ in this_is_a_class.cpp in the relevant module directory.

Imported sources

  • For imported sources. Make sure to add an exclusion to the contrib/update-copyright.pl script so that a CVC4 copyright isn't added when the script is run.

Source file headers

  • A common comment blurb heads all sources in CVC4 (including lex and yacc inputs and unit tests, but excluding imported sources).
  • A script, update-copyright.pl is used to maintain/update this comment blurb.

update-copyright.pl

  • You can run contrib/update-copyright.pl to update all sources with a new copyright template. (Some) efforts are made to throw away old boilerplate comment blurbs while keeping file-specific comments.
  • PLEASE BE CAREFUL. This updates files in place without keeping a backup. The script is probably buggy. You have been warned. It's best to do this on a copy of your source tree, or on a fresh repository checkout. (The program will warn you too.)
  • Please consult svn diff before committing changes made by the script.
  • You can run the update script on specific files/directories by providing arguments to the script. By default it processes all sources in the src/ and test/ directories.

Using emacs

  • There's an emacs-lisp snippet available that you can drop in your ~/.emacs to conform to spacing, indentation, etc., in the CVC4 source tree.
  • See contrib/editing-with-emacs.

Textual source layout

Whitespace/tabs

  • No tabs, please.
  • No whitespace between if, for, while, 'catch', etc., and their opening parenthesis.
  • One space between closing paren and open curly brace (except if aligning to other nearby source lines).
    • The matching right curly brace goes on a line by itself (possibly with an end-comment tagging the construct it's part of) in the same column as the first letter of the construct keyword. (That is, the right-brace lines up with the if, for, etc.)
  • One space around binary operators to avoid crowding, e.g.:
  x = y;
  z = y + 1;
  • In complicated arithmetic expressions, even when parentheses aren't required by operator order of evaluation, sometimes parentheses and extra spacing and alignment should be used for visual clarity, or to match more closely what is being implemented, what is described in associated source code comments, or both. For example:
  if( ( ( x1 + y1 ) - ( z  * 2 ) ) <=  4 ||
      ( ( x2 + y2 ) - ( zp * 2 ) ) >= -4 ) { ... }
  • Occasionally to emphasize a !, so it doesn't get visually lost, it's best followed or even surrounded by space. For example, this is bad:
  if(!(x + y > 4 && z < x)) ...
  • while this is much preferred, since it emphasizes the negation of the entire condition:
  if( !( x + y > 4 && z < x ) ) { ... }
  • (assuming that it's preferable not to rewrite this as x + y <= 4 for some other reason of course).
  • Between type names and */& when declaring members and variables, to emphasize the introduced name is a pointer:
 Foo *a, *b;
  • NOT between type names and */& when in an argument list or return type, to emphasize that the type is a pointer:
 Bar* fooToBar(Foo* f);
  • There is an exception for array types. Since char *argv[] is actually an array of pointers, it doesn't make sense to attach * to the type in this case (in C/C++, the [] must come after the formal name and can't be attached to the type name, as it can be in Java). In particular:
 int main(int argc, char *argv[]);

is how the signature for the main function is written.

  • No trailing spaces please. M-x delete-trailing-whitespace in emacs might help.

Increment/Decrement

  • Preincrement/decrement preferred over postincrement/-decrement, e.g.:
  for(int x = 0; x < 10; ++x) { ... }

Bracing

  • Braces are not required around one-line blocks, but preferred in complicated cases, especially with if (to avoid visually-ambiguous elses).
    • E.g. this is fine:
   if(extra_output)
     do_something();
    • But this is debatable (preferring braces on the if() but not the for():
   if(extra_output)
     for(int i = 0; i < 10; ++i)
       do_something();
    • And this is horrible:
   if(extra_output)
     do_something_simple();
   else
     while(!done)
       for(int i = 0; i < 10; ++i)
         if(a[i] < 5)
           a[i + 10] += a[i];
  • Generally, if the then part of an if is complicated and braced, but the else part is simple, the else need not be braced but may be. But if the else part is complicated and braced then the then part should be (even if it's simple).
  • If in doubt, brace.
  • If anything is visually ambiguous, brace for clarity.

Indentation

  • Indentation level is 2.
  • No indentation under switch() {} or namespace {}
    • An exception is forward-declarations inside namespaces, where you do indent. For instance, from src/include/cvc4_expr.h:
  namespace CVC4 {
  
  namespace expr {
    class ExprValue;
  }/* CVC4::expr namespace */
  
  class Expr {
    /* ... */
  }/* class Expr */
  
  }/* CVC4 namespace */
  • Goto labels, case labels, and "private/public/protected" are recessed 2.
  • template <class T> generally belongs on its own line before a function declaration or definition. An exception is inside a class file where there's a long string of simple functions that are best not separated by vertical space.
  • If something is templated in this way, the entire construct must be vertically separated from its surroundings with blank lines, and the template <> specification is on its own line.
  • Attempt to keep lines under 80 columns wide. Though if this makes things excessively ugly or confusing, it's probably ok to go over 80.
  • Indentation level is 2 for preprocessor directives as well, but the leading # is always in the first column. For example:
 #if A
 #  if B
 #    warning A B case
 #  else /* ! B */
 #    warning A -B case
 #  endif /* B */
 #else /* ! B */
 #  if B
 #    warning -A B case
 #  else /* ! B */
 #    warning -A -B case
 #  endif /* B */
 #endif /* A */

Misc C++

  • int foo(void) -- (void) unnecessary
  • An array reference a[i] in C++ may also be reversed as i[a], due to a symmetric definition of operator[] when applied to array and integral types in C. In CVC4, array indexing should be done in the standard way.
  • never use "using" in header files
  • enums require operator<<

Intentionally not covered

  • 0 vs NULL

Adding a new Theory

One of the first things that new developers need to do is add a new theory. Here's the overall procedure:

  1. If your theory is likely to shake things up in other parts of CVC4 for awhile, branch the repo (see Branching and merging). Make sure your working copy is up to date.
  2. Add a new directory for your theory (assuming you're implementing the Theory of Foo, you might use: mkdir src/theory/foo)
  3. Use addsourcedir (see the Developer's Guide section on that script) to set up the directory (contrib/addsourcedir src/theory/foo).
  4. Add your directory to SUBDIRS in src/theory/Makefile.am.
  5. Add a kinds file specifying kinds for your theory. The file can be empty, to start.
  6. Add these files to the repository (svn add src/theory/foo)
  7. Add Makefile.in to the svn:ignore property of your directory (svn ps svn:ignore Makefile.in src/theory/foo)
  8. Re-run autogen.sh at the top-level. This finds your new directory and adds it to the list of things to set up when configure runs.

You'll now need to implement your theory! Any source files you add will need to be listed in the appropriate SOURCES variable in your theory's Makefile.am. Any "extra things" (README files, notes, anything else that isn't really a "source" but should be included in the CVC4 tarball) should be listed in EXTRA_DIST.

There are places outside of your theory's directory you'll need to support, too. If you want your theory to be supported by one or more input languages (instead of just through the CVC4 library API), you'll need to add support to the appropriate parser(s), under src/parser/LANGUAGE. (Right now we don't do prettyprinting, but that will eventually be in there, too.) You'll also need to add typechecking to the NodeManager (in src/expr/node_manager.cpp), and probably you'll want to add a type_rules file to your theory implementing theory-specific type rules (currently, the only modification to NodeManager is to switch on the node kind and delegate to your theory's type rules).

For more details see How to write a theory in CVC4.

Dealing with expressions (Nodes and TNodes)

There are three types in CVC4 that conceptually represent the same thing.

  1. An Expr is a publicly-visible type that references expression data and the associated expression manager.
  2. A Node is a CVC4-internal type that references expression data but not the associated expression manager (the manager is implicitly the "current" node manager)
  3. A TNode is just like a Node except that it doesn't maintain a reference count.

Why this separation between Expr and Node?

In the library, we want to ensure that expressions are referenced correctly (that is, with deference to the appropriate expression manager). We can police our own use of expressions, but we can't do the same for the users of the library. Therefore we require external use of CVC4 expressions through the "heavier" Expr object to ensure correctness, but internally we want to use the lighter-weight Node objects.

Similarly, ExprManager and NodeManager are split, but are conceptually the same. NodeManager does all the real "managing" of expression data (and it has construction and reclamation-related duties) and ExprManager is the user-visibles, public interface to this manager.

Under the hood, Expr objects are two simply two pointers: one to a Node and one to an ExprManager, and all its implemented functionality simply delegates to one or the other.

Why this separation between Node and TNode?

Internally, then, a Node is just a "smart" pointer to a NodeValue structure, which contains expression kind information, the reference count, pointers to child expressions, etc. Even internally to the CVC4 library, we don't allow general access to NodeValue---we require all expression manipulation to go through Node objects. Conceptually, Node objects are simple; they simply delegate to the underlying NodeValue. However, they have another purpose---to maintain the reference count on the NodeValue. When a Node object is created pointing to a NodeValue, it increases the reference count on that NodeValue. When the Node is destructed (or points away), it decrements the reference count.

This handling of the reference count can be expensive as Node objects are passed between functions, stored in data structures, updated on backtrack, etc. Sometimes this is necessary for safety---that is, to maintain a correct reference count. Other times, though, it's not necessary (see below), and the use of TNode in preference to Node removes the reference-counting overhead.

Under the hood, one template type (the NodeTemplate<>) implements both Node and TNode---Node and TNode are typedefs to instantiations of the NodeTemplate<> template with (respectively) reference-counting turned on and off.

Why this separation between TNode and NodeValue?

One could ask, then, why does TNode exist if it conceptually provides the same services as the underlying NodeValue and it doesn't reference-count it? Couldn't we just access the NodeValue directly? There are two main reasons. First, in assertions-enabled builds, TNode objects include extra assertions to make sure they're being used correctly---that is, that a TNode never references a NodeValue with a zero reference count, which is unsafe since that NodeValue is eligible for reclamation by the NodeManager. Second, the interface on NodeValue is somewhat lower-level than on Node and TNode, and using the higher-level forms allows one to interchange Node and TNode without compiler troubles (one should only do so where it's safe to do so, of course). Additionally, the use of TNode ensures that the expression data is not modified---conceptually, expressions are immutable, but of course at the lower layers of the expression package, they aren't technically immutable, so passing around a const NodeValue* would not be appropriate.

When to use TNodes

So when should one use TNode and when should one use Node when referring to expression data?

This is a difficult question, but there are some general rules.

First of all, it's always safe to use Node, so if you're erring on the side of caution, use Node and not TNode.

TNode is safe to use only when you know that a Node exists that refers to the same underlying expression data. Since each NodeManager maintains the invariant that each expression is represented uniquely by one NodeValue, that means that if a Node is guaranteed to refer to an expression, a TNode can be used to reference the same expression. At such point as the Node no longer exists, it is no longer safe to use the TNode.

Using TNode on the stack

In well-scoped situations like function calls, it's then ok to use a TNode:

 void foo() {
   NodeManager* nm = NodeManager::currentNM();
   Node a = nm->mkVar("a", nm->booleanType());
   Node b = nm->mkVar("b", nm->booleanType());
   Node n = nm->mkNode(kind::OR, a, b);
   Node not_n = negate(n);
 }
 
 Node negate(TNode tnode) {
   Node x = tnode.notNode();
   return x;
 }

Here, foo() calls function negate() to negate a Node. However, the function accepts the node parameter as a TNode---this is ok, because a TNode can be constructed from a Node. At the function call, a copy of the underlying pointer to NodeValue is made, but no reference counting is done. Note that when an expression is created (with tnode.notNode()), however, the result is stored in a Node. It would not be safe to store this in a TNode, since the negation of the parameter is not guaranteed to have an outstanding Node reference to it (and thus to have a nonzero reference count).

The general rule is: functions should take TNode arguments but always return Node. There are exceptions, but this one rule covers a lot. A common mistake is reusing a variable of type TNode where it should be a Node, for example:

 Node preprocess(TNode input);
 
 void query(TNode input) {
   input = preprocess(input);// <--- BUG!!
   // perform query...
 }

This is a serious error. preprocess() will (it is presumed) take input and either create a new equivalent expression, returning that, or if no preprocessing is required, return the input. But here, the Node return value of preprocess() is turned into a TNode in the caller's context, and the reference count on the expression data is decremented. This means the result of preprocessing may be reclaimed by the system immediately on return of the preprocess() function! The correct way to write query() is the following:

 void query(TNode input) {
   Node preprocessed = preprocess(input);
   // now work with 'preprocessed'...
 }

Using TNode in (heap-allocated) data structures

So far, this only handles stack-based (function call) uses for TNode. TNode objects can also be kept in data structures, but that's more complicated. I repeat again, from the above: At such point as the Node no longer exists, it is no longer safe to use the TNode. That has an important further clarification: even if the Node is re-established later, it's unsafe to maintain a TNode to the expression, since the underlying NodeValue may have been reclaimed by the system and re-constructed! Consider this code:

 void foo() {
   NodeManager* nm = NodeManager::currentNM();
   Node p = nm->mkVar("p", nm->booleanType());
   Node q = nm->mkVar("q", nm->booleanType());
   Node n = nm->mkVar(kind::OR, p, q);
   
   TNode x = n; // safe reference to expression (p OR q)
   
   Kind k = x.getKind();// <-- safe: get the kind of expression (p OR q)
   
   // point away from (p OR q)
   n = nm->mkVar(kind::AND, p, q);
   
   Kind k = x.getKind();// <-- UNSAFE: (p OR q) might not exist anymore!
   
   // reconstruct (p OR q)
   n = nm->mkVar(kind::OR, p, q);
   
   k = x.getKind();// <-- ALSO UNSAFE, (p OR q) might be in a different place in memory
   x = n;
   k = x.getKind();// <-- this is safe
 }

As you trace through this code, keep in mind that if the reference count for an expression drops to zero (there are no Node references to it, only TNode references), then it is immediately eligible for reclamation by the NodeManager. So, here, x is initially a safe TNode reference to (p OR q)---but then n points elsewhere, and x is no longer safe to use. You might think that, given that the NodeManager maintains unique underlying expression structures for all expressions, that after re-constructing (p OR q), you could use x again. But this is not true---the expression data may have been destroyed in the meantime, and thus the reconstructed (p OR q) could be a different (but still unique, since the other one has disappeared) NodeValue. This makes x unsafe to use.

That means that you cannot store a TNode in a data structure unless a Node to the same expression is guaranteed to exist at least as long as the TNode does! Even if you can guarantee that a Node exists every time to use the TNode, this isn't enough: a "solid" Node reference must exist at all times to the underlying expression. If you cannot guarantee this, you cannot use a TNode, you must use Node.

There is no "general rule" for putting TNode in heap-allocated data structures, as there is with function calls on the stack. Sometimes you can store Node objects in one data structure, and that permits the storage of TNode objects in another; but it depends on your data. You just have to make sure that reference count stays above zero at all times.

By the way, in the above program, often a re-created expression does have the same underlying NodeValue, since zero-reference-count expressions are not immediately reclaimed (they are in the so-called zombie state, and they are reclaimed in bulk when there are enough zombies). But because they are immediately eligible for reclamation, you cannot rely on this.

Node comparisons

You can compare Node and TNode using the usual comparison operator (<,>,<=,>=). The result of these operators will remain stable until the underlying NodeValue disappears. So if some of your datastructures depend on the order of Nodes you should keep them alive somewhere, since when a NodeValue is recreated it will compare differently than the disappeared NodeValue.

Use of autotools: automake, autoconf, autoheader, libtool

  • Please don't check into the repository things generated by automake, autoconf, or autoheader, for example: config.h.in, Makefile.ins, Makefiles, ylwrap/libtool/aclocal.m4/missing/config.guess, configure, etc.

Commit policy

  • Before you commit, everything should be documented and your code should conform to the coding guidelines.
  • Before you commit, your source tree should pass a "make check" in both debug and production configurations.
  • Ideally, your code is cross-platform, and at the very least works equally well on 32- and 64-bit Intel, but this can be difficult to determine sometimes.
  • An automated bot will review code commits and complain loudly to you if it doesn't like your documentation or coding style [wishlist---not yet implemented, except for a few newswire items].
  • After you commit, Clark will review the commit for content. (This is after the fact, you don't have to get a sign-off to commit.)
  • After Clark commits, others in the group will review his commits for content. (Also after the fact.)
  • In practice, everybody reviews everyone else's commits, but on their own schedule.
  • Unit tests / system tests / regression tests should be included to test any new functionality included in your commit.
  • Any new code should be well-covered. Please check the coverage report the next day to make sure your code is well-covered by unit tests, system tests, and regressions. If not, add some tests to the source tree.

Unit tests with CxxTest

There are two unit tests for modules:

  • those written by the main author of the module/piece of functionality (white-box)
  • those written by another group member to test the exported interface of the module (black-box)

Coverage testing with gcov

Every non-exceptional line of code should be tickled by a unit test. To generate code coverage information, configure CVC4 with --enable-coverage, recompile the project, then run make check.

The code coverage information for a single file SRC_PATH/foo.cpp (and its #include'd dependencies) can be obtained by executing:

   gcov -o builds/ARCH/BUILD_ID/SRC_PATH SRC_PATH/foo.cpp

If you have multiple builds configured simultaneously, remember to use the correct BUILD_ID! It will have coverage in its name, something like debug-coverage or production-coverage. This will print out the percentage coverage of each processed file and create a set of .gcov files in the current directory. The .gcov files reproduce the source code with execution counts. More information about gcov usage is available from the online manual.

You can use the lcov tool to produce a more global, browseable coverage report. (lcov is installable via APT and the package includes both lcov and genhtml, used below). Simply run make lcov.

If that doesn't work for some reason, you can try running lcov manually. After running make check with a coverage-enabled build, execute:

  mkdir -p HTML_DIR
  lcov --directory builds/ARCH/BUILD_ID/ --capture --output-file HTML_DIR/cvc4-gcov.info
  cd HTML_DIR
  genhtml cvc4-gcov.info

Open HTML_DIR/index.html to start browsing.

Other coding guidelines (as yet uncategorized)

  • No using or using namespace in header files, please!
  • All exceptions thrown by CVC4 should be types publicly derived from CVC4::Exception, unless the exception is very low-level (similar to std::bad_alloc) and there's a compelling reason to avoid the CVC4 infrastructure (this is true, for instance, for non-exceptional signals that are thrown internally by CVC4 and never escape the library; it's preferable to avoid memory allocation in this case).
  • Traditionally long declaration blocks (classes, namespaces, #endifs) should have an associated comment, even when they are short:
 namespace CVC4 {
 class Foo {
   /* pages
    * and
    * pages
    * go
    * by
    */
 };/* class Foo */
 }/* CVC4 namespace */

Others (if, while, switch, function definitions, etc.) should be close-commented for clarity where necessary. No space exists between the last token of the declaration or definition and the start of the close-comment.

Always end with a comment that matches the declaration ("struct Foo", "class Foo", "enum Foo"), except for namespaces, where it's the "Foo namespace." Comment preprocessor macros similarly:

 #ifdef SOMETHING
 #  ifdef OTHERTHING
 #    error You can't do that.
 #  else /* !OTHERTHING */
 #    warning This is experimental code.
 #    define FOO(x) x
 #  endif /* OTHERTHING */
 #else /* SOMETHING */
 #  define FOO(x)
 #endif /* SOMETHING */

Note that the indentation level for preprocessor macros is 2, but that the leading # is always in the first column, and that the #else takes the negation of the condition as a comment (but #endif takes the non-negated one). Here, note that the else-comment and close-comment is preceded by a single space. In cases where these blocks are long and you want to include additional context information about what macros are true, please do:

 #ifdef SOMETHING
 /* lots of
  * code here
  */
 #else /* !SOMETHING */
 /* lots of
  * code here
  */
 #  ifdef OTHERTHING /* && !SOMETHING */
 /* code... */
 #  endif
 #endif
  • Code and preprocessor indentation levels need not match unless it makes sense for them to.
  • ALWAYS COMMENT FALL-THROUGH CASES IN switch() {}:
 switch(foo) {
 case THIS_ONE:
   printf("this!\n");
   /* fall through */
 case THAT_ONE:
   printf("that!\n");
 }

Vertical spacing

Vertical spacing depends on context. Logically group statements. Debugging bits should generally be offset from the rest of the code by blank lines. switch() {} blocks should generally leave a space between a case label and the preceding break.

Code reviews

The typical code review lifecycle.

A code review ticket is opened in Bugzilla with importance "review". The scope of the review is indicated in the comments.

If you are assigned a code review, here is a guide to the life cycle for your review ticket:

  1. Familiarize yourself with the code in the assigned module (and possibly other, related modules).
    1. If something simple is unclear, ask the developer in person or via email.
    2. If something less-than-simple is unclear, please ask the developer publicly on the ticket, so the discussion is tracked and better documentation can be produced as part of the code review.
    3. If something less-than-simple is unclear and has larger, team-level consequences, ask the developer publicly on the developers' mailing list and/or raise it at a developers' meeting.
  2. Review the code of the assigned module.
    1. Simple code formatting, typos, etc., should be directly fixed and committed by the reviewer.
    2. Slightly larger issues (refactoring, renaming of critical variables, methods, etc.), even when it's to bring the code in line with coding guidelines, can be directly committed but should be communicated to the developer of the code and noted on the Subversion revision log and the code review bug ticket.
    3. Larger, architectural changes to the module should be worked on WITH the developer, not solely by the code reviewer.
    4. If, during the course of review, additional changes are indicated to modules other than the one formally under review, those should be opened as an additional review/bug ticket, or mentioned to the developer of those modules for correction.
    5. In general, please avoid having "discussions" with the developer in comments in the source code.
  3. Bugs, requests for documentation, etc., should be filed as separate bug tickets in bugzilla, with the module's original developer as the assignee.
    1. Simple things (questions, or requests to "document this please," etc.) can be included as part of the review ticket. As the reviewer, post your questions and requests, and re-assign the ticket to the developer. When the developer answers and addresses the issue, the developer should re-assign the ticket to the reviewer.
    2. More complex things (separate these two modules, provide documentation of invariants to this class, re-architect this to be in line with this other thing, etc.) should be opened as tickets, one per request, separate from the review ticket. These new tickets are assigned the developer. The review ticket itself does not get re-assigned to the developer unless there are (simpler-to-address) questions and requests on the review ticket.
  4. Black-box and Public tests should be written by the reviewer for the general contract of the class.
    • If the reviewer chooses, the reviewer may write some white-box tests also, to assert internal state of the class is what the reviewer expects.
  5. Ensure the code in the module is sufficiently covered by tests.
    • For guidance on what is sufficient, please review the output of gcov and apply common sense, as informed by experience. All methods should be called and at least partly covered. If important methods aren't covered, certainly they should be. It's probably okay if some branches of exceptional paths are not covered. Some paths are difficult to cover in a black-box or regression test and may need to be triggered in a white-box test.
  6. In the documentation at the top of all relevant file(s), mark yourself as the reviewer.
    • If there are other reviews listed, put yours at the end of the list (so they remain in chronological order).
    • List your name, the date, and the code review ticket number.
    • For example:
      /*********************                                                        */
      /** ecdata.cpp
       ** Original author: taking
       ** Major contributors: mdeters
       ** Minor contributors (to current version): none
       ** This file is part of the CVC4 prototype.
       ** Copyright (c) 2009, 2010  The Analysis of Computer Systems Group (ACSys)
       ** Courant Institute of Mathematical Sciences
       ** New York University
       ** See the file COPYING in the top-level source directory for licensing
       ** information.
       **
       ** Implementation of equivalence class data for UF theory.  This is a
       ** context-dependent object.
       **
       ** Reviewed by dejan, Feb 24 2010 (bug #10).
       ** Reviewed by mdeters, Apr 4 2010 (bug #64).
       **/
      
  7. The ticket can be closed when:
    • tests cover the code sufficiently, and
    • the simple requests in the review ticket itself have been addressed, and
    • bugs have been opened to address more complex requests.
    The more complex requests need not be addressed to close the review ticket; it is enough that they have been identified and entered into Bugzilla.

Binary compatibility: what is it and why do we care?

Working with the cluster

We have a cluster of nine execution hosts at NYU, smt_crunch_01...smt_crunch_09. You don't interact with these nodes directly; instead, you interact with church through the web or ssh interface.

Submitting a job to the cluster via the web interface

Build a fully statically-linked binary (usually with something like ./configure production-cln-staticbinary) and submit it via the web interface.

Submitting a job to the cluster via the command line

Some of us have been passing around scripts that run various cluster jobs. The simplest form is:

#!/bin/bash
# cluster-regressions-cvc4
TO_EMAIL="`whoami`@cs.nyu.edu"
CC_EMAIL=""
function RunClusterRegressions() {
  local job_name="cvc4-$1"
  local cvc4_binary="$2"
  local params="--segv-nospin $3"
  local comment="$4"
  local timeout="$5"
  if [ -z "$timeout" ]; then
    timeout=300;
  fi
  mysql -u smt_cluster --password=`cat ~/.mysql_password` smt_cluster <<EOF
    insert into Jobs values(DEFAULT, "$job_name", "custom job:$comment", "22", "$timeout", "2000", "24", "$TO_EMAIL $CC_EMAIL", "2", "$cvc4_binary", "$params", "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,29", DEFAULT);
EOF
}

if [ $# -ne 3 -a $# -ne 4 ]; then
  echo "usage: $0 cvc4-binary \"solver params\" \"comment\" [timeout]"
  exit 1
fi

JOB_TIME=`date +%s`
NAME="cluster-regressions-$JOB_TIME"
RunClusterRegressions "$NAME" $1 "$2" "$3" "$4"

If this script is called cluster-regressions-cvc4, you might run it like this:

./cluster-regressions-cvc4 /full/path/to/cvc4/static/binary '--stats' 'running my crazy idea with stats enabled'

which will run it on our standard regression suite for CVC4 with stats enabled and mail you the results. The "22" in the SQL insert is the "runner"---this is the code for an "arbitrary tool with parameters", which a general, catch-all case that you will generally want to use. The "24" is the problem set ID, which is (here) our set of CVC4 regressions. You can also choose all of QF_LRA in SMT-LIBv1 (problem set ID 27), the same in SMT-LIBv2 (ID 28), all of QF_UF in SMT-LIBv2 (ID 38), all of QF_AX in SMT-LIBv2 (ID 39), all of QF_BV in SMT-LIBv2 (ID 40), and others.

Note that you must have ~/.mysql_password installed in your home directory. See someone to get the password if you don't know it.

Adding benchmarks to the cluster

  1. If the benchmarks are new copy them to the appropriate directory on church /usr/local/share/benchmarks. Make sure they are owned by acsys-user and are readable by all users. All cluster nodes remotely mount the benchmark files from church, so there is no need to manually sync them (like was previously done). Lets say that your benchmarks are in /usr/local/share/benchmarks/mybench
  2. Make a file to that contains all the benchmarks with for example 'find /usr/local/share/benchmarks/mybench -name "*.smt" > bench.txt'
  3. Use the /cluster/cvc4/scripts/add_benchmarks.sh script to add the benchmarks. The is interactive and will ask you about all the necessary information. If you are adding problems to an existing problem set you will be given the option to continue running some existing jobs on the newly added benchmarks.

Unit testing

First, get CxxTest from here:

 http://cxxtest.tigris.org/files/documents/6421/43281/cxxtest-3.10.1.tar.gz

Unpack it in, say, your home directory. Make sure that the cxxtestgen.pl script has eXecute permission. Then, just point CVC4's configure to it:

 ./configure --with-cxxtest-dir=~/cxxtest

Now, make check will run the CxxTest unit tests under test/.

Running unit tests

The simplest way to run all unit tests (and regressions and system tests) is to run make check in the top-level source directory. If you specifically want to run units and not other kinds of tests, make check in just the unit tests directory:

$ make -C test/unit check

There is also experimental support for running unit tests related to a specific package. If you specifically want to run units for the expr package, for example, make check in the source directory for expr (not in the test directory):

$ make -C src/expr check

Types of unit tests

There are three kinds of unit tests:

  • white tests test the system from a white-box point of view: they have access to the private functionality of classes (that is, usual C++ access restrictions do not apply), and they are written (typically by the author of the tested class) to test the internal functions and invariants of a class implementation. If the implementation details of a class change, one or more tests will likely change too.
    • By convention, white tests are named *_white.h. For example, for the Node class, which is in the CVC4::expr namespace, the test is under the expr/ directory in node_white.h.
  • black tests test the system from a black-box point of view: they have access only to the public functionality of classes (that is, usual C++ access restrictions do apply), and they are written (typically not by the author of the tested class) to test the external interface and invariants of a class. If the implementation details of a class change, the tests do not change; if the external interface of a class change, the test should be updated.
    • By convention, black tests are named *_black.h. For example, for the Node class, which is in the CVC4::expr namespace, the test is under the expr/ directory in node_black.h.
    • a subcategory of black tests are public tests; these test the system from a public point of view: they have access only to the public, functionality of libraries---the exported symbols of various libraries. Usual C++ access restrictions apply, as do the usual linking procedures against the library. Essentially, these public tests are black tests that test the public functionality of a public class. It is useful to separate them from the black tests in order to test that the exported symbols of a library are adequate. Public tests are named *_public.h.

Adding unit tests

To add a new test, create a test header file under test/unit (possibly in a subdirectory based on the relevant CVC4 module). If your test is a white-box test (it calls private functionality of the tested class directly and was written knowing the implementation details), its name should include White, e.g., NodeWhite, and it should be in a file like test/unit/expr/node_white.h, and you should add it to TESTS_WHITE in test/unit/Makefile.am. If your test is a black-box test (it calls only the public functionality of the tested class and was written knowing the interface only), its name should include Black, e.g., NodeBlack, and it should be in a file like test/unit/expr/node_black.h, and you should add it to TESTS_BLACK in test/unit/Makefile.am. If your test is a public test (it calls only the public functionality of the tested class, which is exported in the public-facing interface of the library), its name should include Public, e.g., ExprPublic, and it should be in a file like test/unit/expr/expr_public.h, and you should add it to TESTS_PUBLIC in test/unit/Makefile.am.

If you add your test to the correct TESTS_* block in test/unit/Makefile.am, and you're using the GNU C++ compiler, it will be compiled and linked correctly: that is, if it's a white-box test, the compiler will ignore access restrictions; if it's a black-box test, it can still access hidden symbols; and if it's a public test, it must only access the public-facing, exported symbols of the appropriate library.

Test implementations

Your test methods are named test*, e.g., testNull(). They return void and take no arguments.

There are numerous testing assertions you can make inside your testing code. For example:

TS_ASSERT( 1 + 1 > 1 );
TS_ASSERT_EQUALS( 1 + 1, 2 );

There's also a TS_FAIL() that fails unconditionally:

void testSomething() {
  TS_FAIL( "I don't know how to test this!" );
}

...and a TS_WARN():

void testSomething() {
  TS_WARN( "TODO: This test still needs to be written!" );
}

The users' guide is here for your reference.

Debugging unit tests

In order to run, CxxTest takes a header file, generates from that a new cpp file, and then then compiles the cpp file into a binary. The binaries for the unit tests live in the builds directory. If a unit test had the path

 cvc4/test/unit/foo/bar.h

then the binary will have the path cvc4/builds/test/unit/foo/bar. (The CxxTest cpp file will have the path cvc4/builds/test/unit/foo/bar.cpp.)

If you have compiled cvc4 in debug mode, then the unit tests binaries can be run in gdb.

Tests are executed in the order that they appear in the header file. Moving a broken test to the top of the class definition (temporarily) makes it execute before the other tests. This is a useful trick for making debugging with gdb easier.

To recompile all of the tests, you can run make check in cvc4/. If you just run make after changing the source of cvc4 or a unit test, the test will not be recompiled.

Compile-time errors in unit tests

The most common errors with unit tests are duplicating method names, improper copy-'n'-paste between tests, etc.

If you get strange errors, make sure you called your test class something unique (e.g., "ExprBlack" for black-box tests on Expr). If you get an error that looks like this:

expr/expr_black.cpp:20: error: expected initializer before ‘suite_ExprBlack’                                                                                                                         
expr/expr_black.cpp:23: error: ‘suite_ExprBlack’ was not declared in this scope                                                                                                                      
expr/expr_black.cpp: In member function ‘virtual void TestDescription_ExprBlack_testDefaultCtor::runTest()’:                                                                                         
expr/expr_black.cpp:28: error: ‘suite_ExprBlack’ was not declared in this scope                                                                                                                      
expr/expr_black.cpp: In member function ‘virtual void TestDescription_ExprBlack_testCtors::runTest()’:                                                                                               
expr/expr_black.cpp:34: error: ‘suite_ExprBlack’ was not declared in this scope                                                                                                                      
...etc...

...then you're missing a closing semicolon at the end of your test class definition. (They're easy to forget.)

Test rebuilding

Bug 44 states:

As our unit-test database is growing the time to compile all the tests is getting increasingly unbearable, specially as every change in the source spawns recompilation of all tests. I know it might be impossible, but I would be wonderful if we could get around this somehow.

Previously, in response to bug 13, entitied Test not rebuilding on code changes, the behavior was fixed precisely to force test rebuilding

Fortunately, the answer is yes, this rebuilding can be avoided.

But it must be stressed that before any commit, you should always let the build tree rebuild, relink, and rerun all tests!

Partial testing

First, the preferred method involves partial rebuilding: you can run subsets of unit tests (rebuilding only those necessary for the subset). Say you have an up-to-date build tree (including all unit tests). You make a small change to src/theory/theory.h and want to rerun just the theory tests. Run make to rebuild the libraries, then:

 $ make -C src/theory check

By the way, -C isn't anything magical, it just chdir's into src/theory first. So the same results from:

 $ (cd src/theory && make check)

With the current build tree (that is, March 4, 2010), this will rebuild only the theory_black and theory_uf_white tests and then run them. When we have more theory-related unit tests, it will include those too.

Marking the test tree up-to-date (not preferred)

If you are using dynamic linking (--enable-shared, which is the default), and you make a change to libcvc4 or libcvc4parser that shouldn't require a rebuild of test binaries---a small change to a non-inlined function, for example---you can use:

 $ make
 [ libraries are rebuilt ]
 $ make -t check
 [the tests are marked up-to-date even though they aren't]
 $ make check
 [the tests are run]

You cannot combine these three steps. Providing "-t" to make won't do any rebuilding, which you need in the first and third invocations.

If you change interface, data layout, or inline-able function implementations, you'll run into problems and the tests should be rebuilt, so use good judgment and always rebuild, relink, and rerun all unit tests before a commit!

Also, if you use this "make -t check" method, your build tree may become out-of-sync with itself. If you see strange behavior (unexpected segfaults, weird debugger behavior, etc.), rebuild and relink all tests.

Regression testing

Regression tests in CVC4 are tests of the CVC4 binary.

To add a regression test, drop the file (which should have extension .smt or .cvc in test/regress/regressN/ (for N the regression level it should be in) and add it under TESTS to test/regress/regressN/Makefile.am. Then add it to subversion:

 $ svn add test/regress/my_test.cvc

In the case of a .cvc benchmark, add one or more % EXPECT: output lines in the file (either at the top or scattered throughout). This is the expected output of CVC4 (running without -q or -v or debugging or anything). Any differences in output from the expected are flagged as test errors. See other .cvc regression tests for examples. If there are no EXPECT lines, the run_regression script flags an error. An EXIT line indicates the expected exit code of CVC4, but is optional; if omitted, CVC4 may have any exit status.

In the case of an .smt benchmark, make sure it has :status sat or :status unsat. This is used to determine if the output is correct and if CVC4 exits with the correct exit code.

This should be all that's necessary. Before committing to the repository, you should re-run autogen.sh in the root directory so that your commit includes the newest Makefile.in.

Multi-level regression testing

The directories

  • test/regress/regress0
  • test/regress/regress1
  • test/regress/regress2
  • test/regress/regress3

contain benchmarks at different regression levels. More difficult or longer-running benchmarks are typically placed in higher regression levels.

A simple make check at the top-level source directory (or top-level build or test source or build directories) does regression testing at level 0. For a specific level, use make regressN: this runs all tests at levels less than or equal to N. (At the top-level, the regressN target implies the check target, so you can make regress3 to run all unit tests, system tests, and also regressions at all levels.)

To run a specific level of regressions, e.g., regressions at level 2, without running regressions at lower levels, run a make check inside that specific directory. For example:

 $ make -C test/regress/regress2 check

Running the full regression suite on the cluster

Experimental CVC4 binaries can be submitted to the cluster for a full run of the regression suite [2]. The test daemon will send you an email with a link to the results when the jobs is finished running.

You may want to refer to the section on working with the cluster (above).

Adding command line arguments to the regression tests

Use the CVC4_REGRESSION_ARGS flag on the commandline. For example,

 make -j4 check CVC4_REGRESSION_ARGS="-d arith::consistency"

System testing

System, or API, testing is a form of regression testing on CVC4 libraries rather than the driver. Rather than .cvc and .smt instances as input, a system test is a source file that is built and linked against the cvc4 libraries and uses its API, bypassing the usual CVC4 driver. These tests are located under test/system/.

FIXME expand this section.

Command-line options

To add a new command-line option to CVC4, open up src/main/getopt.cpp. Add an entry to the cmdlineOptions array:

  • the first item is the long name of the command-line option. For example, verbose for --verbose.
  • the second item is required_argument, optional_argument, or no_argument, depending on whether the option takes an argument, an optional argument, or does not take an argument.
  • the third item should be NULL.
  • the last item should be the equivalent short option (a char), or a member of the OptionValue enumeration if there isn't an equivalent short option.

Next, if you don't have an equivalent short option, add the appropriate item to the OptionValue enumeration. If you do have an equivalent short option, append the short option to the string passed to getopt_long() in parseOptions(). Your option character should be followed by a single colon if an argument is required, or two colons if an argument is optional.

Next, modify the switch() construct in parseOptions() to handle your short option character (or the value from the OptionValue enumeration). If you need a new field in the CVC4::Options structure, add it in src/smt/options.h.

Finally, change the usage string in src/main/usage.h.

NodeBuilders

FIXME. ADD CONTENT.

Attributes

Attributes are roughly speaking [almost] global hash tables from Nodes (TNodes) to data. Attributes can be thought of as additional fields used to extend NodeValues. Attributes are mutable, and come in both sat context-dependent and non-context dependent varieties. Attributes live only as long as the node itself does. If a Node is garbage-collected Attribute's associated with it will automatically be garbage collected. (Being in the domain of an Attribute does not increase a Node's reference count.) To achieve this special relationship with Nodes, Attributes are mapped by hash tables (AttrHash<> and CDAttrHash<>) that live in the AttributeManager. The AttributeManager is owned by the NodeManager. The following types are currently allowed:

  • bool
  • uint64_t
  • TNode
  • Node
  • std::string
  • T*

To get the value of an Attribute InstLevelAttribute on a Node n, use

   n.getAttribute(InstLevelAttribute())

Declaring Attributes

Attributes tend to be defined in a fixed pattern:

   struct InstLevelAttributeId {};
   typedef expr::Attribute<InstLevelAttributeId, uint64_t> InstLevelAttribute;

To separate Attributes of the same type in the same table each of the structures struct InstLevelAttributeId {}; is given a different unique value at load time. An example is the empty struct InstLevelAttributeId. These should be unique for each Attribute. Then via some template messiness when InstLevelAttribute() is passed as the argument to getAttribute(...) the load time id is instantiated. (To get the work done, the ID structs these just need to be different for each AttrHash. But if you do not use unique ones, we won't accept this into trunk!)

More complicated T* data-structures may require a clean up strategy. By default, NullCleanupStrategy is used where nothing happens. This allows for [1] to be called on a pointer data-structure for example. Formally, the full specification of an attribute is:

  expr::Attribute< Id, value_type, CleanupStrategy, Context-Dependent>
  • Id is a compiler object to determine the load time id (normally a unique empty struct)
  • value_type is the domain type of the Attribute (bool, T*, ...)
  • CleanupStrategy specifies how value_types need to be garbage collected when removed
  • Context-Dependent is a bool that if true means the attribute is context-dependent and is false otherwise.

As a convenience, you can also define context dependent attributes without a cleanup strategy using

 struct CDAttribute<class T, value_type>

Here is a declaration that makes full use of every parameter:

 // Attribute that tell if a node have been asserted in this branch
 struct AvailableInTermDbId {};
 /** use the special for boolean flag */
 typedef expr::Attribute<AvailableInTermDbId,
                       bool,
                       expr::attr::NullCleanupStrategy,
                       true  // context dependent
                       > AvailableInTermDb;


Attribute Internals

To map to more complicated data-structures T, use pointers T* as the domain type of the attribute. These are then handled as void* internally by the Attribute system. The following hash maps are defined:

  • AttrHash<bool> d_bools;
  • AttrHash<uint64_t> d_ints;
  • AttrHash<TNode> d_tnodes;
  • AttrHash<Node> d_nodes;
  • AttrHash<TypeNode> d_types;
  • AttrHash<std::string> d_strings;
  • AttrHash<void*> d_ptrs;
  • CDAttrHash<bool> d_cdbools;
  • CDAttrHash<uint64_t> d_cdints;
  • CDAttrHash<TNode> d_cdtnodes;
  • CDAttrHash<Node> d_cdnodes;
  • CDAttrHash<std::string> d_cdstrings;
  • CDAttrHash<void*> d_cdptrs;

In principle, though user context level context dependent attributes are not currently supported, these can be added without too much difficulty.

To dig into details, take a look at expr/{attribute.h, attribute_internals.h, attribute.cpp, node_manager.h, node_manager.cpp}

Common Attribute Errors

What does this mean?

/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h: In member function �$-1òøvoid CVC4::expr::AttributeManager::setAttribute(const CVC4::Node&, const AttrKind&, const typename AttrKind::value_type&) [with AttrKind = TestFlag6]òù:
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/node_manager.h:128:   instantiated from �$-1òøvoid CVC4::NodeManager::setAttribute(const CVC4::Node&, const AttrKind&, const typename AttrKind::value_type&) [with AttrKind = TestFlag6]òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/node.h:291:   instantiated from �$-1òøvoid CVC4::Node::setAttribute(const AttrKind&, const typename AttrKind::value_type&) [with AttrKind = TestFlag6]òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../test/unit/expr/node_white.h:113:   instantiated from here
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:753: error: invalid use of undefined type �$-1òøstruct CVC4::expr::getTable<FooBar>òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:508: error: declaration of �$-1òøstruct CVC4::expr::getTable<FooBar>òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:755: error: invalid use of undefined type �$-1òøstruct CVC4::expr::getTable<FooBar>òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:508: error: declaration of �$-1òøstruct CVC4::expr::getTable<FooBar>òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/node_manager.h:128:   instantiated from �$-1òøvoid CVC4::NodeManager::setAttribute(const CVC4::Node&, const AttrKind&, const typename AttrKind::value_type&) [with AttrKind = TestFlag6]òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/node.h:291:   instantiated from �$-1òøvoid CVC4::Node::setAttribute(const AttrKind&, const typename AttrKind::value_type&) [with AttrKind = TestFlag6]òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../test/unit/expr/node_white.h:113:   instantiated from here
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:755: error: invalid use of undefined type �$-1òøstruct CVC4::expr::getTable<FooBar>òù
/home/mdeters/cvc4/builds/x86_64-unknown-linux-gnu/debug/../../../src/expr/attribute.h:508: error: declaration of �$-1òøstruct CVC4::expr::getTable<FooBar>òù
make[5]: *** [expr/node_white] Error 1

It means that the Attribute value-type (in this case FooBar), is invalid; there's no hashtable of appropriate kind to put it in. You probably want FooBar*.

ANTLR3

To get ANTLR3, which isn't yet distributed as a Debian/Ubuntu package:

  1. Go to the ANTLR3 page and download Complete ANTLR 3.4 jar from http://antlr.org/download/antlr-3.4-complete.jar.
  2. Make a directory $ANTLR3_DIR.
  3. Save the following shell script file in the $ANTLR3_DIR directory (make sure the file is executable), substituting the correct fully-qualified path for "$ANTLR3_JAR":
    #!/bin/bash
     
    export CLASSPATH=$ANTLR3_JAR
    exec java org.antlr.Tool "$@"
    
  4. Now, you are done with installing Antlr3.
  5. Download "libantlr3c-3.4.tar.gz" from http://antlr.org/download/C and extract the archive. You now have an extracted src directory called "libantlr3c-3.4".
  6. Make a directory $LIBANTLR3C_INSTALL_DIR.
  7. Enter the src directory.
  8. If you are on a 32-bit machine, run
    ./configure --disable-antlrdebug --prefix=$LIBANTLR3C_INSTALL_DIR
    

    If you are on a 64 bit machine, run

    ./configure --enable-64bit --disable-antlrdebug --prefix=$LIBANTLR3C_INSTALL_DIR
    
    --prefix=PREFIX specifies the library installation directory. This defaults to /usr/local.
  9. Open up the Makefile that is generated at the top-level source directory and add -fexceptions to the CFLAGS setting.
  10. If you're on Solaris, you may need to change the Makefile to add "-lnsl -lsocket" to the libantlr3c_la_LDFLAGS definition.
  11. Run
    make
    make install
    

    or

    sudo make install
    
  12. Note the following installation information for libantlr3c (where LIBDIR is $LIBANTLR3C_INSTALL_DIR):
    If you ever happen to want to link against installed libraries
    in a given directory, LIBDIR, you must either use libtool, and
    specify the full pathname of the library, or use the `-LLIBDIR'
    flag during linking and do at least one of the following:
       - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
         during execution
       - add LIBDIR to the `LD_RUN_PATH' environment variable
         during linking
       - use the `-Wl,--rpath -Wl,LIBDIR' linker flag
       - have your system administrator add LIBDIR to `/etc/ld.so.conf'
    
  13. cd to the CVC4 source tree
  14. Now run
    ./autogen.sh
    

    and

    ./configure --with-antlr-dir=$LIBANTLR3_INSTALL_DIR ANTLR=$ANTLR3_DIR/antlr3"
    

    , where $LIBANTLR3_INSTALL_DIR is the path of the library installation directory, and $ANTLR3_DIR/antlr3 is the path of the shell script you made.

If you are still having problems trying to configure antlr3 with cvc4, take a look at

cvc4/builds/$ARCH/$BUILD_MODE/config.log

and double check that your architecture is is 32 versus 64bit:

$ uname -a
Linux box38.cims.nyu.edu 2.6.18-164.15.1.el5 #1 SMP Mon Mar 1 10:56:08 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

For systems that mix 32 bit and 64 bit libraries, you need to make sure that the ELF classes of the libraries being linked agree with each other. For example if gmp is 32 bit and libantlr3c is 64 bit, you are likely to get an ELFCLASS64 error at configure time. To determine if this is the case, search through the appropriate config.log file, e.g.:

 builds/x86_64-unknown-linux-gnu/default/config.log

Adding theory support to the parser

To add support for a new theory to the SMT parser, you will need to edit src/parser/smt/Smt.g:

  1. Add any new operators to the list of operator tokens near the end of file (look for AND_TOK, etc).
  2. Add the new operator as an alternative in the rule builtinOp, mapping it to its corresponding CVC4 Kind.
  3. Add the rules to match new constant tokens near the end of the file (look for NUMERAL_TOK, etc).
  4. Add the new constants as alternatives in the rule annotatedFormula, decoding the token text and calling MK_CONST (which is just a macro that calls ExprManager::mkConst). NOTE: if you have a constant token type MY_CONST_TOK, you can get a std::string representing the matched input using the static method AntlrInput::tokenText, e.g.,
      // annotatedFormula alternatives ...
      | MY_CONST_TOK 
        { std::string myConstText = AntlrInput::tokenText($MY_CONST_TOK);
          MyConstType c = ...;
          expr = MK_CONST(c); }
    

Building a 32-bit binary on a 64-bit machine

(These instructions are for Debian/Ubuntu)

First, make sure you have the 32-bit library dependencies installed

   sudo apt-get install libc6-dev-i386 g++-multilib lib32gmpxx4 lib32gmp3 lib32gmp3-dev

You will also need to build a 32-bit version of the ANTLR library. Just configure the library with --disable-64bit. [I found that I had to first configure and build with --enable-64bit, then reconfigure and build with --disable-64bit to get it to work. -Chris]

Configure CVC4 with CFLAGS=-m32 CXXFLAGS=-m32 and build as usual.

Working with Contexts

Contexts are a particularly complicated part of the machinery of CVC4, and much of CVC4's work hinges on them.

The class "Context" is an abstraction for a rollback-able memory. One can create a Context:

 Context c;

and some objects in it (use the CDO<> wrapper to make a simple, rollback-able object, using a pointer to the Context as the constructor argument):

 CDO<int> i(&c);
 CDO<int> j(&c);
 i = 4;
 j = 3;

One then "pushes a scope" on the Context:

 c.push();

Further edits to i and j are then rolled back with pop():

 i = 10000;
 j = -5;
 assert(i == 10000 && j == -5); // succeeds
 
 c.pop();
 
 assert(i == 4 && j == 3); // also succeeds

Types wrapped by CDO<> simply employ a copy-on-write policy if the current scope is greater than that in which the object was last updated. Rollbacks are eager (when pop() is called). CDO<> just uses the underlying type's copy constructor. Sometimes you want something more involved, in which case you write your own class that extends the class ContextObj and provides save and restore functionality. Make sure to read the invariants in context.h carefully!

Gotchas

Here are some important gotchas to keep in mind:

  1. Your ContextObj subclass must call destroy() from its destructor. This ensures that, if prematurely destructed (that is, before the Context goes away), your object is properly unregistered from the Context.
  2. Be careful if you allocate in an alternative memory manager (see below).
  3. FIXME add more gotchas

Using ContextObj that aren't allocated in usual heap memory

You need to be careful; even at a high scope level, all objects registered for backtracking with a Context are inserted at scope zero. This means your program will crash if you delete the associated memory too soon. Now re-read that sentence. If you create a CDO<int> (with new) at scope level 10, and delete it upon backtracking to scope level 9, the CDO<int> destructor properly unregisters itself from the Context. However, if you allocate it in a memory manager (such as the ContextMemoryManager) that doesn't guarantee that the destructor is called, your program will crash (perhaps much later), because the object is still registered at scope level 0 (and thus a bad pointer appears on a list). You may prefer to create a CDO<int*> rather than a CDO<int>*, which solves the issue (you can prematurely delete the int* without incident). But that may leak memory, so there are additional constructors in ContextObj and CDO<> (those starting with a bool argument) that specifically "register" the object at the current scope level. They aren't registered at scope level 0, so the problem doesn't occur. However, you have to be careful.

But why allocate ContextObj instances with the ContextMemoryManager if it's so complicated? There's a big advantage: you don't have to track the lifetime of the ContextObj. The object simply ceases to exist when the Context is popped. Thus you can create an object in context memory, and if you stow pointers to it only in context memory as well, all is cleaned up for you when the scope pops. Here's a list of gotchas:

  1. For data structures intended solely to be allocated in context memory, privately declare (but don't define) an operator new(size_t) and destructor (as currently in the Link class, in src/theory/uf/tim/ecdata.h).
  2. For data structures that may or may not be allocated in context memory, and are designed to be that way (especially if they contain ContextObj instances), they should be heavily documented---especially the destructor, since it may or may not be called.
  3. There's also an issue for generic code---some class Foo<T> might be allocated in context memory, and that might normally be fine, but if T is a ContextObj this requires certain care.
  4. Note that certain care is required for ContextObj inside of data structures. If the (enclosing) data structure can be allocated in ContextMemoryManager ("CMM"), that means the (enclosed) ContextObj can be, even if it was not designed to be allocated in that way. In many instances, it may be required that data structures enclosing ContextObj be parameterized with a similar "bool allocatedInCMM" argument as the special constructor in this class (and pass it on to all ContextObj instances).
    1. delete