Strings
This page is about strings in CVC4
Contents
Syntax
This document is for the SMT-Lib v2 format.
The Theory of Strings (Quantifier-Free) logic symbol:
(set-logic QF_S)
The logic symbol is IMPORTANT to the Theory of String, and it has to be set up.
If the constraints contain more theories, please add them accordingly, e.g. if it contains BitVector, the symbol should be QF_SBV.
The Theory of Quantified Strings logic symbol:
(set-logic UFSLIA)
Options
To use the experimental functions (disabled by default):
(set-option :strings-exp true)
The strings-exp option is REQUIRED for the releases after March 1, 2014
To use finite model finding mode (false by default):
(set-option :strings-fmf true)
To select the strategy of LB rule application: 0-lazy, 1-eager, 2-no (0 by default):
(set-option :strings-lb 1)
To set up string alphabet cardinality (256 by default, expert option):
(set-option :strings-alphabet-card n)
To set up depth of unrolling regular expression used by the theory of strings (10 by default, deprecated):
(set-option :strings-regexp-depth 20)
The unrolling option will be removed in a new release. Our string engine no longer depends on it.
Escaped Character Literals
\0 … \9 | represents ASCII character 0 … 9, respectively |
\a, \b, \f, \n, \r, \t, \v | represents its corresponding ASCII character (C++ convention) |
\ooo | matches an ASCII character, where ooo consists of (no more than) three digits that represent the octal character code (from 0 to 377). For example, \101 represents ‘A’, while “\437” represent a string with two characters “#7”( *important* ). |
\xNN | matches an ASCII character, where NN is a two-digit hexadecimal character code. NN has to be exactly two hex-digits. If not, an exception will be raised. |
The backslash character (\) : when followed by a character that is not recognized as an escaped character, matches that character. For example, \" matches a double quote(").
In CVC4 string literal output, a non-printable character is printed in the format \xNN, where NN matches its ASCII character, except for \a, \b, \f, \n, \r, \t, \v, which are printed directly.
Strings
To define a string variable:
(declare-fun x () String)
String Concatenation:
(str.++ s t)
where s, t are string terms.
String Length:
(str.len s)
where s is a string term.
Character in String:
(str.at s i)
where s is a string term and i is a natural number. (see partial functions) The index is starting from 0.
Sub-String:
(str.substr s i j )
where s is a string term, i and j are natural numbers. (see partial functions)
Regular Expression
Membership Constraint:
(str.in.re s r)
where s is a string term and r is a regular expression.
String to Regular Expression Conversion:
(str.to.re s)
where s is a string term. The statement turns a regular expression that only contains a string s.
Regular Expression Concatenation:
(re.++ r_1 r_2 ... r_n)
where r_1, r_2, ..., r_n are regular expressions.
Regular Expression Alternation:
(re.union r_1 r_2 ... r_n)
where r_1, r_2, ..., r_n are regular expressions. re.or is for releases before March, 2014.
Regular Expression Intersection:
(re.inter r_1 r_2 ... r_n)
where r_1, r_2, ..., r_n are regular expressions. re.itr is for releases before March, 2014.
Regular Expression Kleene-Star:
(re.* r)
where r is a regular expression.
Regular Expression Kleene-Cross:
(re.+ r)
where r is a regular expression.
Regular Expression Option:
(re.opt r)
where r is a regular expression.
Regular Expression Range:
(re.range s t)
where s, t are single characters in double quotes, e.g. "a", "b". It returns a regular expression that contains any character between s and t.
Experimental Mode
Following functions are under the --strings-exp option. They are under active refinement. Once they are stable, we will move them to the default mode. Please let us know when you have some suggestions.
String Contain:
(str.contains s t)
where s and t are string terms. It returns true if the string s contains the string t. This function determines whether the string t can be found within the string s, returning true or false as appropriate.
String IndexOf:
(str.indexof s t i)
This function returns the position of the first occurrence of the specified value t in the string s after the index i. It returns -1 if the value to search for never occurs.
String Replacement:
(str.replace s t1 t2)
where s, t1 and t2 are string terms. This function searches the string s for the specified value t1, and returns a new string where the first occurrence of the specified value s1 is replaced by the string s2.
String PrefixOf:
(str.prefixof s t)
where s and t are string terms. It returns true if the string s is a prefix of the string t.
String SuffixOf:
(str.suffixof s t)
where s and t are string terms. It returns true if the string s is a suffix of the string t.
String To Integer Conversion:
(str.to.int s)
where s is a string term. It returns the corresponding natural number if s is valid; otherwise, it returns -1. We assume that the solution of s has no leading zeros (for now).
Integer To String Conversion:
(int.to.str i)
where i is an integer term. It returns the corresponding string if i is a natural number; otherwise, it returns an empty string.
Partial Functions
By the definition of partial functions in SMT-Lib Document, the following constraint is satisfiable:
(assert (= (str.substr "a" 2 1) "b"))
However, the following constraints are unsatisfiable:
(assert (= (str.substr "a" 2 1) "b")) (assert (= (str.substr "a" 2 1) "c"))
To achieve a desirable goal, it requires users to guard proper conditions. For example,
(assert (= (str.at x j) "b")) (assert (> j 0)) (assert (> (str.len x) j))
Extension
Together with other engine in CVC4, we can extend new functionality in the theory of strings. For example,
(define-fun fun1 ((?x String) (?s String)) Bool (or (= ?x ?s) (> (str.len ?x) (str.len ?s)) ))
Quantifiers over bounded Integers (with strings in the body) are supported in the experimental mode; however, quantifiers over strings are still under development.
Limitation
The decidability of this theory is unknown. For satisfiable problems (without extensions), our solver is sound, complete and terminating in the FMF mode. For unsatisfiable problems, the termination is not guaranteed; however, user can tune the options for termination.
Examples
Find an assignment for x, where x."ab"="ba".x and the length of x equals to 7.
(set-logic QF_S) (declare-fun x () String) (assert (= (str.++ x "ab") (str.++ "ba" x))) (assert (= (str.len x) 7)) (check-sat)
Find assignments for x and y, where x and y are distinct and their lengths are equal.
(set-logic QF_S) (declare-fun x () String) (declare-fun y () String) (assert (not (= x y))) (assert (= (str.len x) (str.len y))) (check-sat)
Find assignments for x and y, where x.y != y.x.
(set-logic QF_S) (declare-fun x () String) (declare-fun y () String) (assert (not (= (str.++ x y) (str.++ y x)))) (check-sat)
Find a model for x, y and z, where x."ab".y=y."ba".z and z=x.y and x."a"!="a".x.
(set-logic QF_S) (declare-fun x () String) (declare-fun y () String) (declare-fun z () String) (assert (= (str.++ x "ab" y) (str.++ y "ba" z))) (assert (= z (str.++ x y))) (assert (not (= (str.++ x "a") (str.++ "a" x)))) (check-sat)
Find a model for x and y, where both x and y are in the RegEx (a*b)* and they are different but have the same length.
(set-logic QF_S) (set-option :strings-fmf true) (declare-fun x () String) (declare-fun y () String) (assert (str.in.re x (re.* (re.++ (re.* (str.to.re "a") ) (str.to.re "b") )))) (assert (str.in.re y (re.* (re.++ (re.* (str.to.re "a") ) (str.to.re "b") )))) (assert (not (= x y))) (assert (= (str.len x) (str.len y))) (check-sat)