Tseitin Encoding and DIMACS Format

Short description

Write a program which translates a description of a formula in NNF into a DIMACS CNF formula using Tseitin encoding.

Input format

The input file contains a description of a single formula in NNF using a very simplified SMT-LIB format following grammatical rules:

<formula> ::= `(' `and' <formula> <formula> `)'
          | `(' `or' <formula> <formula> `)'
          | `(' `not' <variable> `)' 
          | <variable>

Here <variable> is a sequence of alphanumeric characters starting with a letter. Newline or a sequence of whitespace characters should be allowed wherever space is allowed.

See the example input files

Output format

The output should be a CNF description in the DIMACS format.

c
c start with comments
c
c 
p cnf 5 3
1 -5 4 0
-1 5 3 4 0
-3 -4 0

The file can start with comments, that is lines beginning with the character c.
Right after the comments, there is the line p cnf nbvar nbclauses indicating that the instance is in CNF format where nbvar is the maximum index of a variable appearing in the file; nbclauses is the exact number of clauses contained in the file.
Then the clauses follow.
- Each clause is a sequence of distinct non-null numbers between -nbvar and nbvar ending with 0
- Positive numbers denote the corresponding variables. Negative numbers denote the negations of the corresponding variables.
- The clauses are seperated with 0, not newlines — a single clause may be split among several lines and several clauses may be at the same line
- A clause should not contain the opposite literals i and -i simultaneously

Use the part with comments to describe which variables in the CNF instance represent which original variables from the input formula. It should be also clear which variables are auxiliary (for gates) and which variable in the CNF correspond to the root node of the derivation tree of the input formula.

Extended Description

The typical invocation of your program should be

formula2cnf [input [output]]

Parameters input and output specify the input and output files respectively, if they are missing standard input should be read or standard output should be written to.

The program should allow an option which specifies if the Tseitin encoding should use equivalences or only left-to-right implications.

NAIL094 — Decision procedures and verification

Petr Kučera, KTIML MFF UK

Tseitin Encoding and DIMACS Format

Short description

Input format

Output format

Extended Description