- April 13, 2017 - Release of version 3. Introduced new Karatsuba method, updated to Python 2.7.
- July 18, 2011 - Source code of version 2 released. Download it here.
- June 30, 2011 - Release of version 2. Fixed a few bugs in Factoring Code. Source code will be released soon.
- May 31st, 2011 - Release of version 1

**Introduction**

Welcome to *ToughSAT*. This site will generate boolean CNF formulas that encode "difficult" problems. For example, you can generate a boolean CNF formula whose satisfying assignment encodes non-trivial factors of an input number. Finding the satisfying assignment for this boolean formula is at least as hard as factoring the input number, a problem which is believed to be difficult for classical computers.

We hope this site is both useful as an education tool (for students learning about NP-completeness and complexity theory, say), or for testing SAT solvers, or for computer science research. Please feel free to contact either of us with comments, questions, suggestions.

The boolean CNF formulas are output in the DIMACS format, which is the standard format that many SAT solvers use. Examples of SAT solvers: MiniSat, Sat4J, RSat, UBCSat, and many others.

**Types of Problems**

- Factoring
- Subset Sum
- Random
*k*-SAT - Cocktail (combines all of above)

**Getting started with MiniSat**

Choose a Problem Type, set your parameters, and click "Generate". Save the file to the same folder as your MiniSat executable (for Windows users, we have provided a zip file containing MiniSat and the requisite cygwin libraries here). In the command line, execute the following command:

MiniSat.exe <downloaded SAT file> <output file>

The output file will contain the satisfying assignment to the boolean formula (if it exists).

**Other**

This site is powered by Python. Questions? Comments? E-mail us at either hyuen AT DOMAIN.EDU or bebel AT DOMAIN.EDU - except replace DOMAIN.EDU with usc.edu :)

Problem Type:

**Factoring (Karatsuba)**

Usage: This will generate a CNF formula whose satisfying assignment encodes two non-trivial factors of the product *Factor1*Factor2* (if there exist any).

For a source of big primes, consult this website.

This method uses Karatsuba's algorithm for multiplication to reduce the CNF size relative to the classical or schoolbook algorithm.

**Discussion**

If Factor1 and Factor2 are n-bit numbers, then the product Factor1*Factor2 is (at most) a 2n-bit number. In order to create a formula that encodes this instance of the factoring problem, observe that factorization is the inverse of multiplication. First, we construct the boolean multiplication function as a boolean circuit of AND, OR, and NOT gates using Karatsuba's algorithm.

Second, we implement the boolean indicator function where

again as a boolean circuit. By concatenating the output of the first circuit with the second circuit, we obtain a boolean circuit implementing the function which evaluates to 1 only on input two factors of Factor1*Factor2. As a computational problem, this is an instance of CircuitSAT (a NP-complete language).

To convert this CircuitSAT instance to a CNF-SAT instance, we use an algorithm similar to one described here (http://www.scottaaronson.com/democritus/lec6.html). Our algorithm allows more than 3 literals per clause, which reduces the total number of clauses and variables in the final CNF-SAT instance. The algorithm can also construct a standard 3SAT instance by restricting the number of literals per clause to 3, at the expense of additional variables and clauses.

Finally, we add on extra clauses that prohibit the constant 1 from being a factor, because we are only interested in non-trivial factorizations of Factor1*Factor2.

**Factoring**

Usage: This will generate a CNF formula whose satisfying assignment encodes two non-trivial factors of the product *Factor1*Factor2* (if there exist any).

For a source of big primes, consult this website.

**Discussion**

If Factor1 and Factor2 are n-bit numbers, then the product Factor1*Factor2 is (at most) a 2n-bit number. In order to create a formula that encodes this instance of the factoring problem, observe that factorization is the inverse of multiplication. First, we construct the boolean multiplication function as a boolean circuit of AND, OR, and NOT gates using the standard, grade-school multiplication algorithm.

Second, we implement the boolean indicator function where

again as a boolean circuit. By concatenating the output of the first circuit with the second circuit, we obtain a boolean circuit implementing the function which evaluates to 1 only on input two factors of Factor1*Factor2. As a computational problem, this is an instance of CircuitSAT (a NP-complete language).

To convert this CircuitSAT instance to a CNF-SAT instance, we use an algorithm similar to one described here (http://www.scottaaronson.com/democritus/lec6.html). Our algorithm allows more than 3 literals per clause, which reduces the total number of clauses and variables in the final CNF-SAT instance. The algorithm can also construct a standard 3SAT instance by restricting the number of literals per clause to 3, at the expense of additional variables and clauses.

Finally, we add on extra clauses that prohibit the constant 1 from being a factor, because we are only interested in non-trivial factorizations of Factor1*Factor2.

A similar multiplier-based Factoring-to-3SAT reduction using the Chinese Remainder theorem, rather than a grade-school multiplication algorithm, is in the works. The multiplication algorithm is described here (http://arxiv.org/PS_cache/cs/pdf/9809/9809117v2.pdf)

**Factoring 2**

Usage: This will generate a CNF formula whose satisfying assignment encodes two non-trivial factors of the *Target *(if there exist any)..

**Random Subset Sum**

Usage: This will generate a random subset sum instance (i.e.the ground set is of size *Set Size* and whose elements are random positive integers less than *Max Number*, and the target number is the sum of a random subset of the ground set), and encode the instance as a boolean CNF formula. A satisfying assignment to the formula will encode a subset of the ground set that adds up to the target number.

**Manual Subset Sum**

**Random k-SAT**

Usage: This will generate a random *k*-SAT boolean formula with the specified number of variables and clauses

**Commentary**

For a rough estimate of the transition from an easy-to-satisfy instance to hard-to-satisfy instance of a *k*-CNF boolean formula, we can use the following formula: *c* = log(2* ^{k}*/(2

Note, the clause-to-variables ratio is simply a *heuristic*, and not a hard indicator of the number of satisfying assignments.

Here are two plots comparing the difficulty of 4-CNF instances (as measured by the time it takes to solve these instances on MiniSat) against the clause-to-variables ratio:

In both plots there's clearly a peak around the clause-to-variables ratio = 10 (the above estimate give critical threshold *c ~ 10.7*). However, the vertical axis on the right figure exhibits a much larger range - this is due to the instances on the right hand side were random 4-CNF formulas converted to 3-CNF (so the clause-to-variables ratio reflects that of the converted 3-CNF formulas), whereas the instances on the left-hand side were 4-CNF formulas that remained 4-CNF.

**Cocktail SAT**

Usage: This will generate a composite boolean formula that has several sub-problems embedded in it.