Guide to Constraint Programming

© Roman Barták, 1998





Modeling and Solving Real-Life Problems

A very important part of solving real-life problems using constraints is modeling the problem in terms of constraints, i.e., transforming the problem description from the natural language to the language of constraints. Both choosing the right model and choosing the right constraint satisfaction algorithm is crucial for efficient solving of the problem.


This example illustrates the difference between two models chosen to solve the well-known SEND+MORE=MONEY puzzle. This puzzle consist in giving to each letter {S,E,N,D,M,O,R,Y} a different digit from {0,...,9} so that the equation SEND+MORE=MONEY is satisfied.

Without Carry

The easiest way to model this problem is to design one constraint corresponding to the equation:

1000*S+100*E+10*N+D + 1000*M+100*O+10*R+E = 10000*M+1000*O+100*N+10*E+Y

where each letter represents the variable with the domain {0,..,9}. Of course, to comply with the problem description, we need to add the constraint all_different([S,E,N,D,M,O,R,Y]) or the set of constraints S#E,...,R#Y.

This model is not very efficient because all but one variable have to be instantiated before the "main" constraint can be tested. Although the additional constraint(s) reduce the search space, there remain a lot of possible valuations satisfying the all_different constraint that has to be explored.

With Carry

More efficient model uses the carry bits to disjoint the "big" constraint into small constraints:

   E+D = Y+10*C1
C1+N+R = E+10*C2
C2+E+O = N+10*C3
C3+S+M = 10*M+O 

plus the all_different([S,E,N,D,M,O,R,Y]) constraint.

The advantage of this model is that these small constraints can be tested earlier in the labeling phase and, thus, many inconsistent valuations are pruned.


  Microcode Label Assignment Problem

From: P.V. Hentenryck, Constraint Satisfaction in Logic Programming, The MIT Press, 1989

The problem is to assign labels of symbolic microcode to binary addresses in a 256-address page of microcode memory. A typical microcode instruction has the following form:

L0: opcode L1,...,Ln.

where L0 is the origin label and L1,...,Ln are the target labels. To improve efficiency and decrease memory usage, the labels share certain bits and contain the branch condition as part of the address. Therefore, the labels cannot be assigned to consecutive addresses as in assembly language, but must be distributed over the address space to satisfy all constraints.

Example: (4-way branch instruction)

L0: branch4 L1,L2,L3,L4

look at the binary representation of each label that consist of eight bits (remember, 256-address page)

L0: X7 X6 Y5 Y4 Y3 Y2 Y1 Y0
L1: X7 X6 X5  0  0 X2 X1 X0
L2: X7 X6 X5  0  1 X2 X1 X0
L3: X7 X6 X5  1  0 X2 X1 X0
L4: X7 X6 X5  1  1 X2 X1 X0
              ^  ^
              |  |
               ------ condition bits

Note, that some (most) bits are common for some labels.

Naturally, each label can be represented by a variable with a domain between 0 and 255 and to express the following constraints one needs bitwise & operation with a mask:

  • Equal Bits. Some bits are shared among the labels (L0&11000000=L1&11000000).
  • Condition Bits. Condition bits are fixed (L2&00011000=00001000.
  • Increments. Some labels directly follows other labels (L1=L0+1)
  • All Different. All labels must be assigned to different addresses to avoid collisions.


Multiple Protein/DNA Sequence Alignment

Multiple alignment of a family of protein/DNA sequences is a fundamental step in the study of similarity in their shape and function. The goal of multiple alignment is to insert symbols "-" representing gap into k sequences in such a way that the resulting k sequences have the same length and if the resulting sequences are treated as a matrix then every column contain at least one character different from "-".



The key point is that we do not simply want an arbitrary alignment, we want to find a good alignment which can indicate the similarity of a group of sequences as accurately as possible (the definition of "good" is based on biological knowledge).

Suppose that the input data is k sequences of length n and a maximum number of inserts (to each sequence) is m. We can use two arrays of finite domain variables to represent the data:

  • Position array S with size k x n; each element Si,j corresponds to the shift of j-th letter in the i-th sequence, i.e., the domain of each Si,j is {0,...,m}.
  • Letter array P with size k x (n+m); each element corresponds to the letter in final alignment, i.e., the domain of Pi,j is {-,sij,si j-1...,si j-m} where sij is j-th letter in the original i-th sequence. (Note, that the two occurrences of the same letter should be distinguish by assigning an offset to each letter).

Now, two sets of constraints can be defined:

  • Non-Overlap Constraints: Sij <= Si j+1
  • Link Constraints (bind the above tables):
    • for each z=0,...,m: if Pij = si j-z then Si j-z = z
    • for each z=0,...,m: if Sij = z then Pi j+z = sij & remove sij from Pi j+z+1,...,Pi j+m
    • for each z=0,...,m-1: if Sij>z then remove sij from Pij,...,Pi j+z


Airport Counter Allocation

Hong Kong International Airport

Airport Counter Allocation problem is a typical planning problem. The goal is to allocate enough counters (the number depends on the aircraft type) to each flight. In the airport, the counters are grouped in islands and for each flight all assigned counters have to be in the same island.

Several models have been explored to tackle the counter allocation problems:






Designed and maintained by Roman Barták