Difference between revisions of "Developer's Guide"

From CVC4
Jump to: navigation, search
(Preprocessor macros: revising based on 12/1 dev meeting)
Line 31: Line 31:
 
==Preprocessor macros==
 
==Preprocessor macros==
  
Preprocessor macros are <code>ALWAYS_UPPERCASE</code>. For a binary #define (something that might be on or off) with a ''noun''-like name, either #define it or leave it undefined. Don't define it to have a value, but rather just define it to be empty:
+
Preprocessor macros should be avoided when possible.
 +
 
 +
Preprocessor macro names are <code>ALWAYS_UPPERCASE</code>.
 +
 
 +
=== Binary macros ===
 +
 
 +
A binary macro is a symbol that has two states. There are two kinds of binary macros:
 +
 
 +
* A macro that is either "defined" or "undefined." These macros should have names that are nouns or noun phrases. The should not be defined to have a value&mdash;their definition should be empty. For example:
 
   #define DEBUG
 
   #define DEBUG
and later:
+
These macros can be tested using the <code>#ifdef</code> directive:
 
   #ifdef DEBUG
 
   #ifdef DEBUG
 
   /* ... */
 
   /* ... */
 
   #endif /* DEBUG */
 
   #endif /* DEBUG */
  
For binary #defines with ''verb''-like names, #define it to have the value 0 or 1:
+
* A macro that has one of the values 0 or 1. These macros should have names that are verbs or verb phrases. For example:
 
   #define USE_MINISAT 1
 
   #define USE_MINISAT 1
When using this form, you're encouraged to use a language-level (rather than preprocessor-level) construct to condition behavior:
+
These macros should be tested using <code>if</code> statements rather than preprocessor directives:
 
   if(USE_MINISAT) {
 
   if(USE_MINISAT) {
 
     /* do something with minisat */
 
     /* do something with minisat */
Line 47: Line 55:
 
   }
 
   }
  
* Use <code>do { ... } while(0)</code> trick ''without'' semicolon for multiple statements, or else GNU's statement-expressions.
+
=== Block macros ===
* Avoid multiple evaluation.
+
 
 +
Macros that expand to multiple statements should use the <code>do { ... } while(0)</code>. For example, a macro <code>FOO</code> that generates the statements <code>S1; S2;</code> should be defined using
 +
    #define FOO do { S1; S2; } while(0)
 +
(Note that there is ''no semicolon'' after <code>while(0)</code>.)
 +
 
 +
=== Miscellaneous ===
 +
 
 +
* Since macro parameters may be expressions with side effects, they should not be used multiple times in a definition.
 +
 
 
* Remember these aren't hygienic macros!  You can capture names:
 
* Remember these aren't hygienic macros!  You can capture names:
 
   #define ADD(c, a, b) { int x = a, y = b; c = x + y; }
 
   #define ADD(c, a, b) { int x = a, y = b; c = x + y; }

Revision as of 18:27, 1 December 2009

Source tree layout

  • config - this directory holds m4 macro processor files (autoconf code) as well as bits of the autotools build system that go out as part of the distribution to smooth over platform differences.
  • contrib - this directory includes maintainer scripts and other things that aren't directly part of CVC4
  • doc - documentation
  • src/context - context manager, context-dependent object/collection sources
  • src/expr - the expression package
  • src/include - public includes (installed in /usr/local or whatever on a users' machines)
  • src/main - source for the main cvc4 binary
  • src/parser - parsers for supported CVC4 input formats
  • src/prop - propositional parts of CVC4; these include imported minisat sources and the PropEngine
  • src/smt - SmtEngine
  • src/theory - the TheoryEngine and all theory solvers
  • src/util - utility classes, debugging macros, the Exception class, etc.
  • test - CxxTest unit tests (invoked by make check)

Coding guidelines

The following sections attempt to standardize C++ coding and organizational style for the CVC4 project.

Note that many of these standards apply to unit tests under test/ as well, but many do not.

The guiding philosophy in the coding guidelines for CVC4 is to make the code readable and consistent throughout without getting in the way by being overly restrictive. Flexibility is afforded in edge cases where the developer should make his or her best effort to satisfy the guidelines, but not at the expense of making the code less readable or formatted poorly or insensibly.

File and directory names

  • File names are all lowercase. Class FooSolver has its definition in foo_solver.h and its implementation in foo_solver.cpp.
  • File extensions .cpp and .h are generally preferred, with .ypp and .lpp for yacc and lex inputs---an exception is made for headers when other tools have clashing preferences for generated sources; e.g., autotools wants to output parser definitions for the presentation language as pl.hpp.

Preprocessor macros

Preprocessor macros should be avoided when possible.

Preprocessor macro names are ALWAYS_UPPERCASE.

Binary macros

A binary macro is a symbol that has two states. There are two kinds of binary macros:

  • A macro that is either "defined" or "undefined." These macros should have names that are nouns or noun phrases. The should not be defined to have a value—their definition should be empty. For example:
 #define DEBUG

These macros can be tested using the #ifdef directive:

 #ifdef DEBUG
 /* ... */
 #endif /* DEBUG */
  • A macro that has one of the values 0 or 1. These macros should have names that are verbs or verb phrases. For example:
 #define USE_MINISAT 1

These macros should be tested using if statements rather than preprocessor directives:

 if(USE_MINISAT) {
   /* do something with minisat */
 } else {
   /* some non-minisat default behavior */
 }

Block macros

Macros that expand to multiple statements should use the do { ... } while(0). For example, a macro FOO that generates the statements S1; S2; should be defined using

   #define FOO do { S1; S2; } while(0)

(Note that there is no semicolon after while(0).)

Miscellaneous

  • Since macro parameters may be expressions with side effects, they should not be used multiple times in a definition.
  • Remember these aren't hygienic macros! You can capture names:
  #define ADD(c, a, b) { int x = a, y = b; c = x + y; }
  int main() {
    int x = 5;
    // print out the value x + x ^ 2 (should be 30)
    int z;
    ADD(z, x, x * x);
    printf("%d\n", z); // does NOT print 30  :-(
  }

The usual way to combat this problem is to prefix the names in the macro with underscores:

  #define ADD(c, a, b) { int __x = a, __y = b; c = x + y; }

Though it's best to use longer names and perhaps to mix in the name of the macro as well, to avoid any possible clash.

  • Except when you're playing clever/stupid tricks with macro expansion, you should guard against unwitting order-of-operations surprises by parenthesizing arguments when expanded in macro bodies:
  #define ADD(c, a, b) { int __add_x = (a), __add_y = (b); (c) = x + y; }

This approximates (in this case) macro-call-by-value.

  • There is an exception to the ALWAYS_UPPERCASE rule for CVC4::Assert() and friends.

Class names

  • Classes, and type names in general, are in CamelCase. There are exceptions to this rule, however, where lower case names with embedded underscores are acceptable:
    • STL type names are not CamelCase, and when implementing traits and STL-like typedefs (iterators for instance), you should match the STL style as closely as possible:
 class Foo {
   class Item;
   typedef Item*       iterator;
   typedef const Item* const_iterator;
 };
    • for low-level exception types (similar to STL's bad_alloc). However, for derived classes of CVC4::Exception, the names should be CamelCase.
    • Low-level unions and structs may be in a similar lower case form, e.g., low_level_struct.
  • Note that acronyms (SMT, CVC4) inside type names should generally be lowercase after the first letter, such as in CVC4::SmtEngine. This is preferred to SMTEngine or SMT_Engine, but use judgment on a case-by-case basis.

Namespaces

  • Everything is in the CVC4 namespace or a sub-namespace. Generally, the sub-namespace follows the module (and thus the source directory) structure. The name remains uncapitalized, and acronyms are all in lowercase letters (e.g. CVC4::smt, CVC4::main, CVC4::parser). However, certain base types go in CVC4 directly instead of a sub-namespace despite belonging to a module. These are user-visible (such as CVC4::Exception), core functionality classes, and utility classes from src/util/.

Member names

  • Data members are prefixed with d_, and are generally in lowerCamelCase after that. For example:
 class Foo {
   int d_thisIsAnInt;
 };
  • Static members are prefixed with s_:
 class Foo {
   static int s_thisIsAStaticInt;
 };
  • Union elements are prefixed with u_.
  • There is an exception to these rules for low-level struct types that have no member functions and no destructors (and only simple, initializing constructors). For an example see the definition of CVC4::Options in src/util/options.h.

Communicating with the user: the output classes

ALL OUTPUT through Debug and Trace and Notice and Warning functionality please! Very easy:

 #include "util/output.h"
 CVC4::Debug("arith", "foo!"); // prints "foo!" if arith debugging is on
 CVC4::Warning("Equivalence classes didn't get merged."); // prints unless user gave -q or in SMT-COMP mode
 CVC4::Notice("Hi, so you don't get bored, here are some statistics: " + stats); // if user gives -v
 CVC4::Chat("I feel like doing some theory propagation.");// if user gives -vv
 CVC4::Trace("arith", "<ralph>I'm now calling a function!</ralph>");// if user gives -vvv

Chatting in particular should be useful for tuning/strategy things. Parts of the code that aren't conditional on particular settings in CVC4 should avoid Chat and use Trace or Debug. Tracing is tagged with the module to which it's relevant. The above add a \n for you. If you don't want that, you can use a printf() interface:

 CVC4::Debug::printf("arith", "foo!"); // no line termination

or the stream interface:

 CVC4::Debug("arith") << "foo!"; // also no line termination

For non-debug builds, CVC4::Debug is a no-op and everything will be inlined away.

Assertions

CVC4 Assertions:

 #include "util/assert.h"
 CVC4::Assert(condition); // asserts condition is true
 CVC4::Assert(condition, "error message on fail"); // asserts condition is true, custom error message

Note that Assert is a macro (in order to collect source line and file and expression-string information from the C preprocessor). A CVC4::AssertionException is thrown if it fails. When assertions are turned off, this is a no-op and everything is inlined away.

The above assertions can be turned off (with --disable-assertions at configure-time). You might want to have an assertion be checked even if assertions are off. For example:

  1. Unreachable code. If you have a section of code that is intended to be unreachable (you have, perhaps, proved on paper that the code should never enter this state), it costs nothing to check the assertion (it's essentially an assert(false)---always fail).
  2. Highly-compromised state. In cases where a wrong answer is unlikely and the assertion check is cheap and infrequently-run, you may consider leaving an assertion in. This is the case also with the above unreachable-code assertion.

For unreachable code, use the Unreachable() macro. It will throw a CVC4::UnreachableCodeException (a subclass of CVC4::AssertionException):

 #include "util/assert.h"
 CVC4::Unreachable(); // flags (supposedly) unreachable code; fails even under --disable-assertions
 CVC4::Unreachable("error message"); // custom error message

Another special case is an unhandled case (for example, a default block in a switch that was designed to handle all cases). It will throw a CVC4::UnhandledCaseException (a subclass of CVC4::UnreachableCodeException):

 #include "util/assert.h"
 CVC4::Unhandled(); // flags an unhandled case; fails even under --disable-assertions
 CVC4::Unhandled(the_case); // same but prints out the_case that's unhandled as well (which should be whatever is in the switch)
 CVC4::Unhandled("error message"); // custom error message
 CVC4::Unhandled(the_case, "error message"); // custom error message with the_case that is unhandled

For a strong assertion that's checked even with --disable-assertions, use an AlwaysAssert(). It throws a CVC4::AssertionException just as Assert() does.

 #include "util/assert.h"
 CVC4::AlwaysAssert(condition);
 CVC4::AlwaysAssert(condition, "error message");

Doxygen comments

Exporting library symbols

When built with a GNU toolchain, libcvc4 hides all symbols by default (with -fvisibility-hidden). This is a good way to ensure that the public interface is complete, that no undocumented interface is used by anyone, and it also permits more aggressive optimization of the library. Those symbols intended to be the public interface to CVC4, however, must be explicitly exported.

Fortunately, this isn't difficult. To use a symbol, a user should have a header file anyway. Only a subset of CVC4 headers are installed with make install, and these all live in src/include/ and comprise the complete public interface. Public symbols are marked with CVC4_PUBLIC:

 class CVC4_PUBLIC Expr {
   /* ... */
 public:
   Expr();
 };

Note here that types and private members need not be marked CVC4_EXPORT.

Now. What do you care? Well, if you ever forward-declare something that's public, you have to mark it PUBLIC in the forward-declaration too. (Is this true??)

Exception classes should all be marked CVC4_PUBLIC, because they might end up in user code (external to libcvc4) and their type_info information must be available to the catch block in user code. If you don't think your exception can (nor should by design) exit the library and end up in user code, then first double-check this, then triple-check it, then document that fact clearly at call site, the catch site (inside the library), and at the exception class definition.

See http://gcc.gnu.org/wiki/Visibility for more information on visibility.

Source file layout

Class header files

  • A class named ThisIsAClass has its definition in this_is_a_class.h. That header file is guarded by the preprocessor macro __CVC4__THIS_IS_A_CLASS_H if it's in the CVC4 namespace. If it's in a sub-namespace Foo, then its preprocessor macro guard is __CVC4__FOO__THIS_IS_A_CLASS_H. Note the two underscores for each :: and the two prefix underscores. One underscore is used between words in a type or namespace name.
  • Public headers (in src/include/), use preprocessor macro guard __CVC4_headername_H, for example, __CVC4_EXPR_H for cvc4_expr.h. Note here the one underscore between CVC4 and the name.
  • Other (non-class) header files should use a similar preprocessor macro guard, based on relevant module and header file name, as class header files do. In all cases, prefix the name of the guard with __CVC4_ and design it to be unlikely that it will ever conflict with any other (future) CVC4 header, whether it might be for internal use only or externally provided to a library user.
  • Everything should be in the CVC4 namespace, or a sub-namespace of CVC4 (based on module).
  • For test classes. Unit test classes (under test/) are implemented as header files and do not need to be in a namespace, nor guarded by a preprocessor macro. The idea is to keep them as simple and clean as possible, so the tests can be easily read. They are special-purpose and compiled and linked individually, so they don't need these protections.

Class implementation files

  • A class named ThisIsAClass should have its implementation under src/ in this_is_a_class.cpp in the relevant module directory.

Imported sources

  • For imported sources. Make sure to add an exclusion to the contrib/update-copyright.pl script so that a CVC4 copyright isn't added when the script is run.

Source file headers

  • A common comment blurb heads all sources in CVC4 (including lex and yacc inputs and unit tests, but excluding imported sources).
  • A script, update-copyright.pl is used to maintain/update this comment blurb.

update-copyright.pl

  • You can run contrib/update-copyright.pl to update all sources with a new copyright template. (Some) efforts are made to throw away old boilerplate comment blurbs while keeping file-specific comments.
  • PLEASE BE CAREFUL. This updates files in place without keeping a backup. The script is probably buggy. You have been warned. It's best to do this on a copy of your source tree, or on a fresh repository checkout. (The program will warn you too.)
  • Please consult svn diff before committing changes made by the script.
  • You can run the update script on specific files/directories by providing arguments to the script. By default it processes all sources in the src/ and test/ directories.

Using emacs

  • There's an emacs-lisp snippet available that you can drop in your ~/.emacs to conform to spacing, indentation, etc., in the CVC4 source tree.
  • See contrib/editing-with-emacs.

Textual source layout

Whitespace/tabs

  • No tabs, please.
  • No whitespace between if, for, while, 'catch', etc., and their opening parenthesis.
  • One space between closing paren and open curly brace (except if aligning to other nearby source lines).
    • The matching right curly brace goes on a line by itself (possibly with an end-comment tagging the construct it's part of) in the same column as the first letter of the construct keyword. (That is, the right-brace lines up with the if, for, etc.)
  • One space around binary operators to avoid crowding, e.g.:
  x = y;
  z = y + 1;
  • In complicated arithmetic expressions, even when parentheses aren't required by operator order of evaluation, sometimes parentheses and extra spacing and alignment should be used for visual clarity, or to match more closely what is being implemented, what is described in associated source code comments, or both. For example:
  if( ( ( x1 + y1 ) - ( z  * 2 ) ) <=  4 ||
      ( ( x2 + y2 ) - ( zp * 2 ) ) >= -4 ) { ... }
  • Occasionally to emphasize a !, so it doesn't get visually lost, it's best followed or even surrounded by space. For example, this is bad:
  if(!(x + y > 4 && z < x)) ...
  • while this is much preferred, since it emphasizes the negation of the entire condition:
  if( !( x + y > 4 && z < x ) ) { ... }
  • (assuming that it's preferable not to rewrite this as x + y <= 4 for some other reason of course).
  • Between type names and */& when declaring members and variables, to emphasize the introduced name is a pointer:
 Foo *a, *b;
  • NOT between type names and */& when in an argument list or return type, to emphasize that the type is a pointer:
 Bar* fooToBar(Foo* f);
  • There is an exception for array types. Since char *argv[] is actually an array of pointers, it doesn't make sense to attach * to the type in this case (in C/C++, the [] must come after the formal name and can't be attached to the type name, as it can be in Java). In particular:
 int main(int argc, char *argv[]);

is how the signature for the main function is written.

  • No trailing spaces please. M-x delete-trailing-whitespace in emacs might help.

Increment/Decrement

  • Preincrement/decrement preferred over postincrement/-decrement, e.g.:
  for(int x = 0; x < 10; ++x) { ... }

Bracing

  • Braces are not required around one-line blocks, but preferred in complicated cases, especially with if (to avoid visually-ambiguous elses).
    • E.g. this is fine:
   if(extra_output)
     do_something();
    • But this is debatable (preferring braces on the if() but not the for():
   if(extra_output)
     for(int i = 0; i < 10; ++i)
       do_something();
    • And this is horrible:
   if(extra_output)
     do_something_simple();
   else
     while(!done)
       for(int i = 0; i < 10; ++i)
         if(a[i] < 5)
           a[i + 10] += a[i];
  • Generally, if the then part of an if is complicated and braced, but the else part is simple, the else need not be braced but may be. But if the else part is complicated and braced then the then part should be (even if it's simple).
  • If in doubt, brace.
  • If anything is visually ambiguous, brace for clarity.

Indentation

  • Indentation level is 2.
  • No indentation under switch() {} or namespace {}
    • An exception is forward-declarations inside namespaces, where you do indent. For instance, from src/include/cvc4_expr.h:
  namespace CVC4 {
  
  namespace expr {
    class ExprValue;
  }/* CVC4::expr namespace */
  
  class Expr {
    /* ... */
  }/* class Expr */
  
  }/* CVC4 namespace */
  • Goto labels, case labels, and "private/public/protected" are recessed 2.
  • template <class T> generally belongs on its own line before a function declaration or definition. An exception is inside a class file where there's a long string of simple functions that are best not separated by vertical space.
  • If something is templated in this way, the entire construct must be vertically separated from its surroundings with blank lines, and the template <> specification is on its own line.
  • Attempt to keep lines under 80 columns wide. Though if this makes things excessively ugly or confusing, it's probably ok to go over 80.
  • Indentation level is 2 for preprocessor directives as well, but the leading # is always in the first column. For example:
 #if A
 #  if B
 #    warning A B case
 #  else /* ! B */
 #    warning A -B case
 #  endif /* B */
 #else /* ! B */
 #  if B
 #    warning -A B case
 #  else /* ! B */
 #    warning -A -B case
 #  endif /* B */
 #endif /* A */

Misc C++

  • int foo(void) -- (void) unnecessary
  • An array reference a[i] in C++ may also be reversed as i[a], due to a symmetric definition of operator[] when applied to array and integral types in C. In CVC4, array indexing should be done in the standard way.
  • never use "using" in header files
  • enums require operator<<

Intentionally not covered

  • 0 vs NULL

Use of autotools: automake, autoconf, autoheader, libtool

  • Please don't check into the repository things generated by automake, autoconf, or autoheader, for example: config.h.in, Makefile.ins, Makefiles, ylwrap/libtool/aclocal.m4/missing/config.guess, configure, etc.

Commit policy

  • Before you commit, everything should be documented and your code should conform to the coding guidelines.
  • An automated bot will review code commits and complain loudly to you if it doesn't like your documentation or coding style.
  • Clark will review all code commits for content.
  • Others in the group will review Clark's commits for content.
  • Unit tests should be included to test any new functionality included in your commit. These tests should pass the gcov coverage assessment for CVC4.

Unit tests with CxxTest

There are two unit tests for modules:

  • those written by the main author of the module/piece of functionality (white-box)
  • those written by another group member to test the exported interface of the module (black-box)

Coverage testing with gcov

Every non-exceptional line of code should be tickled by a unit test.

Other coding guidelines (as yet uncategorized)

  • No using or using namespace in header files, please!
  • All exceptions thrown by CVC4 should be types publicly derived from CVC4::Exception, unless the exception is very low-level (similar to std::bad_alloc) and there's a compelling reason to avoid the CVC4 infrastructure (this is true, for instance, for non-exceptional signals that are thrown internally by CVC4 and never escape the library; it's preferable to avoid memory allocation in this case).
  • Traditionally long declaration blocks (classes, namespaces, #endifs) should have an associated comment, even when they are short:
 namespace CVC4 {
 class Foo {
   /* pages
    * and
    * pages
    * go
    * by
    */
 };/* class Foo */
 }/* CVC4 namespace */

Others (if, while, switch, function definitions, etc.) should be close-commented for clarity where necessary. No space exists between the last token of the declaration or definition and the start of the close-comment.

Always end with a comment that matches the declaration ("struct Foo", "class Foo", "enum Foo"), except for namespaces, where it's the "Foo namespace." Comment preprocessor macros similarly:

 #ifdef SOMETHING
 #  ifdef OTHERTHING
 #    error You can't do that.
 #  else /* !OTHERTHING */
 #    warning This is experimental code.
 #    define FOO(x) x
 #  endif /* OTHERTHING */
 #else /* SOMETHING */
 #  define FOO(x)
 #endif /* SOMETHING */

Note that the indentation level for preprocessor macros is 2, but that the leading # is always in the first column, and that the #else takes the negation of the condition as a comment (but #endif takes the non-negated one). Here, note that the else-comment and close-comment is preceded by a single space. In cases where these blocks are long and you want to include additional context information about what macros are true, please do:

 #ifdef SOMETHING
 /* lots of
  * code here
  */
 #else /* !SOMETHING */
 /* lots of
  * code here
  */
 #  ifdef OTHERTHING /* && !SOMETHING */
 /* code... */
 #  endif
 #endif
  • Code and preprocessor indentation levels need not match unless it makes sense for them to.
  • ALWAYS COMMENT FALL-THROUGH CASES IN switch() {}:
 switch(foo) {
 case THIS_ONE:
   printf("this!\n");
   /* fall through */
 case THAT_ONE:
   printf("that!\n");
 }

Vertical spacing

Vertical spacing depends on context. Logically group statements. Debugging bits should generally be offset from the rest of the code by blank lines. switch() {} blocks should generally leave a space between a case label and the preceding break.

Binary compatibility: what is it and why do we care?

Unit testing

 ./configure CXXTEST=~/cxxtest
  • Now, make check will run the CxxTest unit tests under test/.
  • To add a new test, create a test header file under test/ (possibly in a subdirectory based on the relevant CVC4 module and whether it's white/black-box testing). Then add it to TESTS in test/Makefile.am.
  • If your test is a white-box test (it calls private functionality of the tested class directly and was written knowing the implementation details), its name should include White, e.g., ExprWhite, and it should be in a file like test/expr/expr_white.h.
  • If your test is a black-box test (it calls only the public functionality of the tested class and was written knowing the interface only), its name should include Black, e.g., ExprBlack, and it should be in a file like test/expr/expr_black.h.