LEARNING TO RANK USING HIGH-ORDER INFORMATION
P. Dokania, A. Behl, C. V. Jawahar and M. Pawan Kumar
We consider the problem of using high-order information (for example, persons
in the same image tend to perform the same action) to improve the accuracy
of ranking (specifically, average precision). We develop two learning frameworks.
The high-order binary SVM (HOB-SVM) optimizes a convex upper bound on the surrogate
0-1 loss function. The high-order average precision SVM (HOAP-SVM) optimizes a
difference-of-convex upper bound on the average precision loss function.
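
For readers unfamiliar with the accuracy measure, the following is a minimal sketch of how average precision (AP) and the corresponding AP loss (1 - AP) can be computed for a scored set of samples. The function names and the 0/1 label encoding are assumptions made for illustration; this is not the authors' implementation, and it does not include the high-order terms or the learning procedure.

```python
def average_precision(scores, labels):
    """AP of the ranking obtained by sorting samples by decreasing score.

    labels[i] is 1 for a positive sample and 0 for a negative one
    (an encoding assumed here for illustration).
    """
    num_pos = sum(labels)
    if num_pos == 0:
        raise ValueError("AP is undefined without positive samples")
    # Rank sample indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision measured at the rank of this positive sample.
            precision_sum += hits / rank
    return precision_sum / num_pos


def ap_loss(scores, labels):
    """AP loss: zero only when every positive is ranked above every negative."""
    return 1.0 - average_precision(scores, labels)
```

Unlike the 0-1 loss, this quantity depends on the whole ranking rather than on each sample independently, which is why it calls for a different surrogate.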


OPTIMIZING AVERAGE PRECISION USING WEAKLY SUPERVISED DATA
A. Behl, C. V. Jawahar and M. Pawan Kumar
We develop a novel latent AP-SVM that minimizes a carefully designed
upper bound on the AP-based loss function over a weakly supervised
dataset. Our approach is based on the hypothesis that in the
challenging setting of weakly supervised learning, it becomes crucial
to optimize the right accuracy measure. Using publicly available
datasets, we demonstrate the advantage of our approach over
standard loss-based binary classifiers on challenging problems in computer vision.


SELF-PACED LEARNING FOR LATENT VARIABLE MODELS
M. Pawan Kumar, B. Packer and D. Koller
We develop an accurate iterative algorithm for learning the parameters of a
latent variable model such as latent structural SVM. Our approach uses the intuition that the learner
should be presented with the training samples in a meaningful order: easy
samples first, hard samples later. At each iteration, our approach simultaneously chooses
the easy samples and updates the parameters.
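
The easy-first idea can be illustrated with a toy sketch. It assumes a one-parameter model with squared loss, treats a sample as "easy" when its current loss is below a threshold, and raises that threshold at every iteration so that harder samples are admitted later. The function name, the threshold schedule and the alternating two-step update are simplifications chosen for illustration; they are not the latent structural SVM formulation developed in the paper.

```python
def self_paced_mean(samples, threshold, growth=1.5, iterations=10):
    """Toy self-paced estimate of a location parameter w.

    The loss of sample x under parameter w is (x - w) ** 2.
    """
    # Initialise at the median so the first notion of "easy" is reasonable.
    w = sorted(samples)[len(samples) // 2]
    for _ in range(iterations):
        # Step 1: select the samples that are easy under the current parameter.
        easy = [x for x in samples if (x - w) ** 2 < threshold]
        # Step 2: update the parameter using only the selected samples.
        if easy:
            w = sum(easy) / len(easy)
        # Anneal: raise the threshold so harder samples enter in later iterations.
        threshold *= growth
    return w
```

With samples `[1.0, 1.1, 0.9, 1.0, 10.0]` and an initial threshold of `0.5`, the sample at `10.0` is ignored during the early iterations and only enters the estimate once the threshold has grown past its loss.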


REGION SELECTION FOR SCENE UNDERSTANDING
M. Pawan Kumar and D. Koller
We consider the problem of simultaneously dividing an image into coherent regions
and assigning labels to regions using a global energy function. We form a large dictionary
of putative regions using bottom-up over-segmentation techniques and formulate the
problem of selecting the regions and their labels as an integer program. We provide
an efficient dual decomposition method to solve an accurate linear programming relaxation
of the integer program.


IMPROVED MOVES FOR TRUNCATED CONVEX MODELS
M. Pawan Kumar and P. Torr
We develop a new st-MINCUT based move-making method for MAP estimation of discrete MRFs
with arbitrary unary potentials and truncated convex pairwise potentials.
We prove that our method provides the best known multiplicative bounds
(same as the bounds obtained by solving the standard linear programming
relaxation followed by randomized rounding) for these problems in polynomial
time. We demonstrate the efficacy of our approach using synthetic
and real data experiments.
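
As a concrete picture of the model class, the sketch below defines two common truncated convex pairwise potentials (truncated linear and truncated quadratic) and evaluates the energy of a labeling. The function names, default weights and truncation values are assumptions made for illustration. The sketch only evaluates energies; it does not implement the move-making algorithm.

```python
def truncated_linear(la, lb, weight=1.0, trunc=2.0):
    """Pairwise cost that grows linearly with label distance, capped at trunc."""
    return weight * min(abs(la - lb), trunc)


def truncated_quadratic(la, lb, weight=1.0, trunc=4.0):
    """Pairwise cost that grows quadratically with label distance, capped at trunc."""
    return weight * min((la - lb) ** 2, trunc)


def energy(labels, unary, edges, pairwise):
    """Energy of a labeling of a discrete MRF.

    labels[a]   : integer label assigned to variable a
    unary[a][i] : arbitrary cost of giving label i to variable a
    edges       : list of (a, b) pairs of neighbouring variables
    pairwise    : function of the two labels, e.g. truncated_linear
    """
    total = sum(unary[a][labels[a]] for a in range(len(labels)))
    total += sum(pairwise(labels[a], labels[b]) for a, b in edges)
    return total
```

MAP estimation corresponds to finding the labeling with the lowest energy. The truncation is what keeps the penalty bounded across large label jumps, for instance at object boundaries.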


ANALYSIS OF CONVEX RELAXATIONS FOR MAP ESTIMATION
M. Pawan Kumar, V. Kolmogorov and P. Torr
We analyze several convex relaxations for MAP estimation of discrete MRFs.
We show that the standard linear programming relaxation dominates (provides
a tighter approximation than) a large class of quadratic programming and
second order cone programming relaxations. Our analysis leads to new
second order cone programming relaxations that are tighter than the linear
programming relaxation.
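
For reference, the standard linear programming relaxation is commonly written as follows. The notation is assumed here and may differ from the paper's: $\theta_{a;i}$ is the unary potential for assigning label $i$ to variable $a$, $\theta_{ab;ij}$ is the pairwise potential on edge $(a,b) \in E$, and $x$ collects the relaxed indicator variables.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{a}\sum_{i} \theta_{a;i}\, x_{a;i}
              + \sum_{(a,b)\in E}\sum_{i,j} \theta_{ab;ij}\, x_{ab;ij} \\
\text{s.t.}\quad & \sum_{i} x_{a;i} = 1 \quad \forall a, \\
& \sum_{j} x_{ab;ij} = x_{a;i} \quad \forall (a,b)\in E,\ \forall i, \\
& \sum_{i} x_{ab;ij} = x_{b;j} \quad \forall (a,b)\in E,\ \forall j, \\
& x_{a;i} \ge 0, \quad x_{ab;ij} \ge 0 .
\end{aligned}
```

Restricting every variable to $\{0,1\}$ recovers the original MAP estimation problem. One relaxation dominates another when its feasible set yields an optimal value at least as close to that of the integer problem.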


OBJECT CATEGORY SPECIFIC SEGMENTATION
M. Pawan Kumar, P. Torr and A. Zisserman
Given an image containing an instance of a known object category, we
obtain an accurate, object-like segmentation automatically. We match a parts-based
object category model to the image. Each sample of the
model provides cues about the shape of the particular instance, which are incorporated in
a global energy function. The segmentation is obtained by minimizing the energy using a single
st-MINCUT.


LEARNING LAYERED MOTION SEGMENTATIONS
M. Pawan Kumar, P. Torr and A. Zisserman
Given a video, we learn a layered representation of the scene for motion segmentation
in an unsupervised manner. The layered representation consists of the shape and
appearance of the various rigidly moving segments in the scene, their occlusion
ordering as well as their framewise transformations. These parameters are estimated
from a video by minimizing a global energy function using block coordinate descent.

