M. Pawan Kumar

The regions of a dictionary are indexed using r or r'. The intersection of all the regions provides a set of super-pixels, which are indexed using s. The label set of the high-level vision task is denoted by L = {1,...,h}. We introduce an augmented label set L' = {0} ∪ L, where the label 0 indicates that the region is not selected. The unary potential of a region r taking a label i in L is denoted by θr(i). Two regions are neighbors if they share at least one boundary pixel and do not overlap. The pairwise potential of two neighboring regions r and r' taking labels i and j in L is denoted by θrr'(i,j).

Integer Programming Formulation

Using the above notation, the problem of simultaneously selecting the regions and their labels can be cast as the following integer program:

Integer Program
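One form consistent with the notation above and with the description of the variables and constraints is sketched here. It is a sketch under stated assumptions, not necessarily the exact formulation: the symbol \mathcal{E} for the set of neighboring region pairs is introduced only for illustration, label 0 is assumed to contribute no energy, and the pairwise term is written as a product of region variables, whereas the original may instead use auxiliary pairwise variables with linear consistency constraints.

```latex
\begin{align*}
\min_{\mathbf{y}} \quad
  & \sum_{r} \sum_{i \in L} \theta_r(i)\, y_r(i)
    + \sum_{(r,r') \in \mathcal{E}} \sum_{i,j \in L}
      \theta_{rr'}(i,j)\, y_r(i)\, y_{r'}(j) \\
\text{s.t.} \quad
  & y_r(i) \in \{0,1\}, && \forall r,\ i \in L', \\
  & \sum_{i \in L'} y_r(i) = 1, && \forall r, \\
  & \sum_{r \ni s} \sum_{i \in L} y_r(i) = 1, && \forall s.
\end{align*}
```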


The binary variable yr(i) indicates whether or not a region r is assigned a label i from the set L'. The objective function corresponds to the energy of the solution specified by y. The constraints specify that each region is either not selected or, if selected, assigned exactly one label. The last constraint ensures that the entire image is explained and that no two overlapping regions are selected; that is, every super-pixel s belongs to exactly one selected region.
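To make the constraints concrete, the following is a minimal brute-force sketch that enumerates every assignment of labels from L' to the regions of a tiny, made-up dictionary and keeps the feasible assignment with the lowest energy. It is only an illustration, not the method used to solve the integer program in practice. The function name `brute_force_select`, the data layout, and the toy potentials are all hypothetical; the sketch also assumes that label 0 contributes no unary energy and that pairwise potentials apply only when both neighboring regions are selected.

```python
from itertools import product

def brute_force_select(regions, labels, unary, pairwise, neighbors):
    """Exhaustively search for the lowest-energy feasible labeling.

    regions:   dict mapping region name -> set of super-pixel ids
    labels:    labels in L (0 is reserved for "not selected")
    unary:     dict mapping region name -> {label: potential}
    pairwise:  function (i, j) -> potential for neighboring regions
    neighbors: list of (r, r') pairs of non-overlapping adjacent regions
    """
    names = list(regions)
    superpixels = set().union(*regions.values())
    best_energy, best_labeling = float("inf"), None

    # Each region takes exactly one label from L' = {0} ∪ L.
    for assignment in product([0] + list(labels), repeat=len(names)):
        labeling = dict(zip(names, assignment))
        selected = [r for r in names if labeling[r] != 0]

        # Coverage constraint: every super-pixel lies in exactly one
        # selected region (image fully explained, no overlaps).
        if any(sum(s in regions[r] for r in selected) != 1
               for s in superpixels):
            continue

        # Energy: unary terms of selected regions (assumed zero for label 0)
        # plus pairwise terms between selected neighbors.
        energy = sum(unary[r][labeling[r]] for r in selected)
        energy += sum(pairwise(labeling[r], labeling[q])
                      for r, q in neighbors
                      if labeling[r] != 0 and labeling[q] != 0)

        if energy < best_energy:
            best_energy, best_labeling = energy, labeling

    return best_energy, best_labeling

# Toy dictionary: two super-pixels, two small regions and one merged region.
regions = {"a": {0}, "b": {1}, "ab": {0, 1}}
unary = {
    "a": {1: 1.0, 2: 3.0},
    "b": {1: 3.0, 2: 1.0},
    "ab": {1: 3.5, 2: 3.5},
}
potts = lambda i, j: 0.0 if i == j else 0.5  # illustrative Potts penalty

energy, labeling = brute_force_select(
    regions, [1, 2], unary, potts, neighbors=[("a", "b")]
)
print(energy, labeling)  # 2.5 {'a': 1, 'b': 2, 'ab': 0}
```

In this toy instance, selecting the two small regions with different labels (energy 1.0 + 1.0 + 0.5) is cheaper than selecting the merged region (energy 3.5), so the merged region receives label 0. The enumeration grows as (h+1) raised to the number of regions, so it is useful only for checking the formulation on very small inputs.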