Welcome to ToughSAT. This site will generate boolean CNF formulas that encode "difficult" problems. For example, you can generate a boolean CNF formula whose satisfying assignment encodes non-trivial factors of an input number. Finding the satisfying assignment for this boolean formula is at least as hard as factoring the input number, a problem which is believed to be difficult for classical computers.
We hope this site is both useful as an education tool (for students learning about NP-completeness and complexity theory, say), or for testing SAT solvers, or for computer science research. Please feel free to contact either of us with comments, questions, suggestions.
Types of Problems
Getting started with MiniSat
Choose a Problem Type, set your parameters, and click "Generate". Save the file to the same folder as your MiniSat executable (for Windows users, we have provided a zip file containing MiniSat and the requisite cygwin libraries here). In the command line, execute the following command:
The output file will contain the satisfying assignment to the boolean formula (if it exists).
This site is powered by Python. Questions? Comments? E-mail us at either hyuen AT DOMAIN.EDU or bebel AT DOMAIN.EDU - except replace DOMAIN.EDU with usc.edu :)
Usage: This will generate a CNF formula whose satisfying assignment encodes two non-trivial factors of the product Factor1*Factor2 (if there exist any).
For a source of big primes, consult this website.
If Factor1 and Factor2 are n-bit numbers, then the product Factor1*Factor2 is (at most) a 2n-bit number. In order to create a formula that encodes this instance of the factoring problem, observe that factorization is the inverse of multiplication. First, we construct the boolean multiplication function as a boolean circuit of AND, OR, and NOT gates using the standard, grade-school multiplication algorithm.
Second, we implement the boolean indicator function where
again as a boolean circuit. By concatenating the output of the first circuit with the second circuit, we obtain a boolean circuit implementing the function which evaluates to 1 only on input two factors of Factor1*Factor2. As a computational problem, this is an instance of CircuitSAT (a NP-complete language).
To convert this CircuitSAT instance to a CNF-SAT instance, we use an algorithm similar to one described here (http://www.scottaaronson.com/democritus/lec6.html). Our algorithm allows more than 3 literals per clause, which reduces the total number of clauses and variables in the final CNF-SAT instance. The algorithm can also construct a standard 3SAT instance by restricting the number of literals per clause to 3, at the expense of additional variables and clauses.
Finally, we add on extra clauses that prohibit the constant 1 from being a factor, because we are only interested in non-trivial factorizations of Factor1*Factor2.
A similar multiplier-based Factoring-to-3SAT reduction using the Chinese Remainder theorem, rather than a grade-school multiplication algorithm, is in the works. The multiplication algorithm is described here (http://arxiv.org/PS_cache/cs/pdf/9809/9809117v2.pdf)
Usage: This will generate a CNF formula whose satisfying assignment encodes two non-trivial factors of the Target (if there exist any)..
Random Subset Sum
Usage: This will generate a random subset sum instance (i.e.the ground set is of size Set Size and whose elements are random positive integers less than Max Number, and the target number is the sum of a random subset of the ground set), and encode the instance as a boolean CNF formula. A satisfying assignment to the formula will encode a subset of the ground set that adds up to the target number.
Manual Subset Sum
Usage: This will generate a random k-SAT boolean formula with the specified number of variables and clauses
For a rough estimate of the transition from an easy-to-satisfy instance to hard-to-satisfy instance of a k-CNF boolean formula, we can use the following formula: c = log(2k/(2k - 1))-1, where k is the number of literals per clause, c is the critical clause-to-variables ratio, and the logarithm is taken base 2. A given k-CNF instance with a clause-to-variables ratio less than this critical value is very likely to be satisfiable, and one with a ratio higher than this critical value is unlikely to be satisfiable.
Note, the clause-to-variables ratio is simply a heuristic, and not a hard indicator of the number of satisfying assignments.
Here are two plots comparing the difficulty of 4-CNF instances (as measured by the time it takes to solve these instances on MiniSat) against the clause-to-variables ratio:
In both plots there's clearly a peak around the clause-to-variables ratio = 10 (the above estimate give critical threshold c ~ 10.7). However, the vertical axis on the right figure exhibits a much larger range - this is due to the instances on the right hand side were random 4-CNF formulas converted to 3-CNF (so the clause-to-variables ratio reflects that of the converted 3-CNF formulas), whereas the instances on the left-hand side were 4-CNF formulas that remained 4-CNF.
Usage: This will generate a composite boolean formula that has several sub-problems embedded in it.