# TACAS'21 Artifact: Syntax-Guided Quantifier Instantiation

Aina Niemetz (1), Mathias Preiner (1), Andrew Reynolds (2), Clark Barrett (1), and Cesare Tinelli (2)

(1) Stanford University, (2) The University of Iowa

This artifact contains the log files and data to replicate the experiments in the paper "Syntax-Guided Quantifier Instantiation" accepted to TACAS'21.

## Artifact Directory Structure

### Files

* `cmpr.py`: Python script to extract, analyze, compare, and print runtime data from solver runs.
* `generate-tables.sh`: Script to generate Tables 2, 3, and 4 from the paper.
* `install-packages.sh`: Script to install the required Python packages in directory `packages`.
* `paper.pdf`: Revised version of the paper.
* `run-solver.sh`: Script to rerun experiments for a specific solver configuration.

### Directories

* `benchmarks`: Contains all SMT-LIB benchmarks used in our experiments (non-incremental logics `BV`, `BVFP`, `BVFPLRA`, `FP`, `FPLRA`, `LIA`, `LRA`, `NIA`, `NRA`, and `UFBV`).
* `binaries`: Contains the binaries for CVC4 ([branch](https://github.com/CVC4/CVC4/tree/sygus-inst)) and Z3 (version 4.8.9) that were used for the experiments in the paper.
* `data-table2`: Runtime data for the comparison of term selection strategies (Table 2).
* `data-table3`: Runtime data for the comparison of lemma selection strategies (Table 3).
* `data-table4`: Runtime data for the comparison against other techniques (Table 4).
* `packages`: Contains Python packages downloaded via `pip3 download`, which are required for generating Tables 2-4 from the paper.

#### Structure of `data-tableN` directories

Each runtime data directory `data-tableN` contains subdirectories of runtime data grouped by solver/configuration. In each configuration directory the runtime data is grouped by SMT-LIB logic, e.g., `non_incremental_BV/`. Each configuration directory further contains a file called `options`, which records the command used to run the specific configuration/solver. For example, the options used for configuration `both-both` in Table 2 can be found in `data-table2/both-both/options` and include the following two lines.

```
command: cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp --sygus-inst-scope=both --sygus-inst-term-sel=both --sygus-inst-mode=interleave --pre-skolem-quant
runexec: --timelimit 300 --memlimit 8000MB
```

The first line is the command used to execute a configuration on a given benchmark, whereas the second line specifies options passed to the tool `runexec`, which is part of the benchmarking and resource management framework [`benchexec`](https://github.com/sosy-lab/benchexec) and which we used to limit runtime and memory.

The runtime data directories for each SMT-LIB logic are structured based on the benchmark structure used in SMT-LIB. Our cluster setup produces two log files per benchmark, which are stored as follows:

```
<configuration>/<logic>/.../<benchmark>/{output.log,run.out}
```

* `output.log`: Contains the configuration/solver output printed to `stdout` and `stderr`.
* `run.out`: Contains the actual runtime data measured by `runexec`.

**Note**: The experiments were performed on a cluster with Intel Xeon E5-2620 CPUs (2.1GHz) and 128GB memory.
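The per-benchmark log files can also be skimmed directly with standard shell tools. The commands below are a minimal sketch for illustration only; they are not part of the artifact scripts and merely assume the directory layout described above.

```
# Illustration only (not part of the artifact scripts): count how many BV
# benchmarks the both-both configuration reported as unsat, based on the
# solver output stored in the output.log files.
grep -rlx --include=output.log unsat data-table2/both-both/non_incremental_BV | wc -l

# Print the CPU time measured by runexec for each BV benchmark of the same
# configuration (one line per run.out file).
grep -r --include=run.out '^cputime=' data-table2/both-both/non_incremental_BV
```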
## Steps to generate Tables 2-4

1. **Install Python Packages**

   Generating the tables in the paper from the provided log files requires the Python package `pandas`. The following script installs the required Python dependencies from directory `packages`.

   **Note**: Errors and warnings printed during package installation can be ignored.

   ```
   ./install-packages.sh
   ```

2. **Generating Tables 2, 3, and 4**

   ```
   ./generate-tables.sh
   ```

   This script will generate `table2.tex`, `table3.tex`, and `table4.tex`, which correspond to Tables 2, 3, and 4 in the paper. Note that the layout of Table 4 in the paper was manually changed after generating the initial version; the numbers, however, match those in `table4.tex`.

**Note (Changes after artifact review)**:

1. The artifact now contains the data for two more solvers (Boolector, Vampire) in Table 4. The comparison against these two solvers was added for the camera-ready version of the paper and therefore was not part of the initial version of the artifact that was reviewed by the TACAS 2021 artifact evaluation committee.
2. The artifact further includes updated numbers on 316 benchmarks for the solver Z3. In the initial experiments, Z3 had issues with filenames containing the character `=`. As a consequence, we had to rerun Z3 on these benchmarks and updated the numbers in Table 4 accordingly. This change also happened after the artifact was reviewed by the TACAS 2021 artifact evaluation committee.

## Replicating Data not in Tables

* In the paper we reported that the CVC4 configurations and Z3 disagree on 10 benchmarks of logic `FP`, which is due to a known problem in Z3 related to the operator `fp.rem`. To list the corresponding benchmarks use:

  ```
  ./cmpr.py data-table4/* --show-dis
  ```

* Uniquely solved instances for all four CVC4 configurations:

  ```
  ./cmpr.py -t data-table4/{cegqi,ematch,enum,syqi} -c uniq
  ```

* Uniquely solved instances between the `cegqi` and `syqi` CVC4 configurations:

  ```
  ./cmpr.py -t data-table4/{cegqi,syqi} -c uniq
  ```

## Rerunning Experiments

Since rerunning the experiments on all benchmarks is not feasible in the context of this artifact (the full set of experiments required 3710 hours of CPU time on our cluster), we list below the steps to rerun selected benchmarks. The options used for CVC4 (`binaries/cvc4`) and Z3 (`binaries/z3`) can be found in each configuration's `options` file. For example, for configuration `both-both` in Table 2, the options can be found in `data-table2/both-both/options`. CVC4 can then be called on a selected benchmark from the `benchmarks` directory as follows, followed by the path to the benchmark file:

```
time binaries/cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp --sygus-inst-scope=both --sygus-inst-term-sel=both --sygus-inst-mode=interleave --pre-skolem-quant
```
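To also enforce the time and memory limits we used on the cluster (the second line of the `options` file), the command can be wrapped in benchexec's `runexec` tool. The following is a sketch only; it assumes `benchexec` is installed on your machine, since it is not bundled with the artifact.

```
# Sketch (assumes benchexec's runexec is installed separately): run the
# both-both configuration on one benchmark with the limits from the options
# file (300s CPU time, 8000MB memory).
runexec --timelimit 300 --memlimit 8000MB -- \
  binaries/cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp \
  --sygus-inst-scope=both --sygus-inst-term-sel=both \
  --sygus-inst-mode=interleave --pre-skolem-quant \
  benchmarks/BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
```

`runexec` should print its measurements (walltime, cputime, memory) to `stdout` in the same `key=value` form as the provided `run.out` files.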
### Example

Running `both-both` on benchmark `BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2` produces the following output:

```
$ time binaries/cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp --sygus-inst-scope=both --sygus-inst-term-sel=both --sygus-inst-mode=interleave --pre-skolem-quant benchmarks/BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
unsat

real    0m1.276s
user    0m1.268s
sys     0m0.007s
```

**Note**: This was run on a workstation with a Ryzen 1800X CPU (3.6GHz).

The runtime data for this benchmark can either be checked via the `cmpr.py` script:

```
$ ./cmpr.py data-table2/both-both/ -f /BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
                        status result exit_code time_cpu memory
benchmark config
usb-phy-fixpoint-1.smt2     ok  unsat         0      2.4   58.2
```

or by inspecting the provided log files:

```
$ cat data-table2/both-both/non_incremental_BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2/run.out
c host: barrett33.stanford.edu
c start: Tue Oct 13 04:38:32 PDT 2020
c arrayjobid: 163160053
c jobid: 163344143
c command: /sygus-both-both-interleave/cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp --sygus-inst-scope=both --sygus-inst-term-sel=both --sygus-inst-mode=interleave --pre-skolem-quant /SMT-LIB/non-incremental/BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
c args: /SMT-LIB/non-incremental/BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
2020-10-13 04:38:32,265 - INFO - LXCFS is not available, some host information like the uptime leaks into the container.
2020-10-13 04:38:32,268 - INFO - Starting command /sygus-both-both-interleave/cvc4 --no-e-matching --sygus-inst --no-cegqi --fp-exp --sygus-inst-scope=both --sygus-inst-term-sel=both --sygus-inst-mode=interleave --pre-skolem-quant /SMT-LIB/non-incremental/BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2
2020-10-13 04:38:32,268 - INFO - Writing output to /sygus-both-both-interleave/non_incremental_BV/wintersteiger/fmsd13/fixpoint/usb-phy-fixpoint-1.smt2/output.log
starttime=2020-10-13T04:38:32.462632-07:00
returnvalue=0
walltime=2.468061118386686s
cputime=2.44409025s
cputime-cpu18=2.443987523s
cputime-cpu7=0.000102727s
memory=58245120B
blkio-read=0B
blkio-write=0B
c done
```

## Rerunning Experiments in a more Automated Way

For convenience, we also provide the script `run-solver.sh`, which allows rerunning subsets of the experiments. The script is invoked as follows:

```
./run-solver.sh data-table4/syqi/options benchmarks/NIA/
```

where the first argument specifies the solver configuration to run and the second argument points to the benchmarks to run. The above command runs the `syqi` configuration from Table 4 on the `NIA` benchmark set.

### Example Output

```
Benchmark: NIA/tptp/ARI123=1.smt2
CPU time from log: 0.015325465
Status from log: unsat
CPU time from run: 0.007
Status from run: unsat

Benchmark: NIA/tptp/NUM882=1.smt2
CPU time from log: 0.005401276
Status from log: sat
CPU time from run: 0.004
Status from run: sat

Benchmark: NIA/tptp/NUM878=1.smt2
CPU time from log: 0.007369659
Status from log: unsat
CPU time from run: 0.000
Status from run: unsat

Benchmark: NIA/tptp/NUM879=1.smt2
CPU time from log: 0.005251154
Status from log: sat
CPU time from run: 0.000
Status from run: sat

...
```

## Publication of Artifact

The artifact will be published at [https://cvc4.github.io/papers/tacas2021-syqi](https://cvc4.github.io/papers/tacas2021-syqi) and will be permanently archived on [https://zenodo.org](https://zenodo.org).