Wednesday, November 18, 2015

Week 5: Natural Image Completion

Goal

Captura.JPG

The goal of this week is to implement the scene matching algorithm described the Scene Completion Using Millions of Photographs by Heys et al.

To do so, we go through a large dataset of images and find the similar ones. Once we have the most similar images from our dataset, we do as follows:

  1. We calculate the image hole using the bounding box of our mask. This hole is the crop of the image we are going to substitute.

untitled.jpg

  1. We extend the mask by 40 pixels. We calculate the crop of the image we want to add in our original image using the size of the bounding box of the mask below (see the red rectangle). In the figure below we see an example of a chosen patch image on the left and the extended mask on the right with the corresponding bounding box.
patch.jpg
  1. We calculate the probability of each pixel of belonging to the mask or not. Wexplain this in the section “Fill the CreateGridUGMModel.m function”.

  1. Knowing the cost of every pixel of belonging to the mask, complete the hole of the original image using Poisson (UGM_Decode_LBP.m function).

  1. This are some of the results we obtain:

IMG_0681.jpg
IMG_6548.jpg
IMG_0681_25.jpg



Select number of states. In our case 2. 

Each pixel, or node, has two states: isMask = true or isMask = false

Fill the CreateGridUGMModel.m function.

Use image dimensions and number of states to create a structure containing information about the edges, that is, the cost function between each pixel p and q.

Build Unary (node) potentials.

We assign 2 different costs to each pixel. The cost of leaving the mask Cd (p, exists) and the cost of becoming part of the mask Cd (p, patch).

Pixels inside the mask

The cost of leaving the mask is infinite, and the cost of becoming part of the mask is 0, because they are already inside it.

Pixels outside the mask

The cost of leaving the mask is 0, because they are already outside the mask, and the cost of becoming part of the mask is depends on the distance to it. If they are far away from the mask, it will be infinite. It is computed using the following equation:

C.png

In our algorithm we have to use probabilities instead of costs to do the computations. So we have to translate those costs to probabilities. To do that we use the formula:

Screen Shot 2015-11-18 at 12.01.30.png


prob.PNG

(generar graph mejor y con axis labels)

These are the resulting probabilities:


P (p, exists)
P (p, patch)
p ϵ mask
0
1
p ϵ mask
1
1 / (k * dist(p, mask)) ^3 +1)


Build pairwise factors

First of all, we have to compute the SSD that will be used to build pairwise factors for pixels p,q. That’s the squared difference between the original image and the source image. As they are RGB images, we compute the differences of the channels separately, and then we apply L2.

If p,q are not 4-connected or if they belong to the same region (exists or patch), its edge potential will be 0. Otherwise, the potential will be the magnitude of the difference between the SSD for pixel p and the SSD for pixel q. That’s:

Potential = ssd(p, q);

The following table gathers the values that the potential takes depending on the pixels’ labels:

   
L(p)
L(q)
Cost
exist
exist
0
exist
patch
ssd(p, q)
patch
exist
ssd(p, q)
patch
patch
0

Call inference algorithms to solve the MAP problem

Using the unary potentials, the pairwise potentials and the graphical model structure, we call the MAP algorithm that solves our problem using Poisson editing.

Tuesday, November 10, 2015

Week 4: Poisson Editing

Goal



This week we wanted to solve the image editing problem of cloning areas from one images and adapting them to another one. We implement a method that uses the Poisson partial differential equation with Dirichlet boundary conditions to achieve a seamless clone.


girl.png
src (girl.png)
lena.png
dst (lena.png)


lenafea.jpg

dst with src eyes & mouth





Mandatory Tasks



Implementation



We study the method described in “Poisson Image Editing” by Patrick Pérez et al. and implement it in two different ways in order to fit src eyes and mouth into dst eyes and mouth seamlessly.


Np.PNG
Set of connected neighbors of pixel p (in our case, 4).
fp.PNG
Value of f at p within unknown area.
fq.PNG
Value of f at q (where q is an element of Np) within unknown area.
f_ast_q.PNG
Value of f at q within known area (src - dst, which includes the boundary)
vpq.PNG
Difference between value of g at pixel p and value of g at neighbor q


eq7.PNG


Rearranging the equation above taken from Pérez, we can see that in order to obtain the value for a pixel p, we need to compute the sum of all the nearest neighbours in the border, in the domain, and the difference between the pixels in g and their 4-neighbours.
rearranged_eq7.PNG




G4_PoissonEditing_GS
This function uses the Gauss Seidel scheme to solve the discrete Poisson pixel by pixel iteratively.  We crop the masks and images at their corresponding location,


Resulting image using Gauss-Seidel


Within this function, we call G4_gaussSeidel for each color channel.
   
G4_gaussSeidel
This function iteratively solves the equation
poissoneq.PNG
as such:
f_ij.PNG


G4_PoissonEditing_CG
This function instead uses the preconditioned gradients method to solve the Poisson equation in the form Ax = b.  A is the matrix created by G4_createA (explained below). b is the laplacian of the source image with the destination boundaries, and x if the resulting image f.


Resulting image using PCG


G4_createA
Creates the matrix A for solving the discrete Poisson equation with the pcg method. This is a sparse matrix of the form:
Amatrix.PNG


In G4_PoissonEditing_CG we create a matrix A of dimensions 990x990px (for the eyes). We modify in this matrix the corresponding values in order to conserve the lenna crop borders information. It looks like the following:



Full 990x990px Poisson matrix.
Close-up of pattern (white = 4, black = -1, gray =1, background = 0))

The rows having only one gray pixel (value = 1), like the first ones and the last ones, correspond to the boundaries, and will let us assign directly the value of B to f . The other rows apply the Laplacian operator to each pixel.

Week 3: Image Segmentation

Goal



The goal of the third block was to successfully implement the Chan-Vese algorithm in order to detect contours around binary and color images. This will be done by considering, as usual, a minimization problem, which we will formulate according to level set theory.



Mandatory Tasks



Implementation of explicit Chan-Vese algorithm



In 2D, the level set method amounts to representing a closed curve (such as the shape boundary) using an auxiliary function, phi (our level set function). It is initialized in the code as a circle in the center of the image of radius 50, and it is given Neumann boundary conditions.


The determination of the image boundaries is done by minimizing the functional below. We can see that the functional is composed by four main parts.
chanvese_formula.PNG
  • Length parameter (µ). It is the first term of the functional, and penalizes boundaries that have higher length using the constant. The higher the , the more the boundary stretches itself to fit around the foreground with a smaller perimeter. This way, we avoid unnecessary details in the boundary for the noisy images.
  • Area parameter (ν). It penalizes the boundaries that enclose bigger areas.
  • Discrepancy between f and u (λ1, λ2). The first term it is related to the discrepancy between the given grayscale image (f) and the foreground. And the second term to the discrepancy between the grayscale image and the background.


We initialized the region averages c1 and c2 to 1 and -1, which will be computed by discretizing the following equations:
c1.PNG
c2.PNG.


We based our numerical implementation on the semi-implicit scheme described in Gertreuer, but applied it explicitly. This consists in one Gauss-Seidel sweep from left to right, top to bottom:eq22.PNG



Results (Binary)
mu-0.2-it-850-eta0.0000008-dt-0.5-.jpg
circles: mu = 0.2, it = 850, dt = 0.5


mu-jpg-it-5000.jpg
circles: mu = 0.2, it = 5000, dt = 0.5


-mu-0.5-it-22000.jpg
phantom17: mu = 0.5, it = 22000



-mu-1-it-20000.jpg
phantom17: mu = 0.5, it = 22000


Results (Noise)
mu-0.05-23000.jpg
noisedCircles: mu = 0.05, it = 23000
mu-0.05-30000.jpg



noisedCircles: mu = 0.05, it = 30000


mu-0.1-it-22400.jpg
phantom18: mu = 0.3, it = 20000


mu-0.3-it-20000.jpg
phantom18: mu = 0.1, it = 22400




It might be worth noting that the closer the initialization of phi is to the shape of the contour we are looking for, the faster/better the convergence. For example, for the phantom images, if phi is initialized as a diagonal line of the form y = -ax instead of a circle, it converges significantly faster to the shape of the boundary.


As for the case of the phantom18, instead of initializing the phi function as a circle centered in the image, we chose a displaced circle for the same reason mentioned above.


Curve Evolution







Limitations of Chan-Vese

It can be seen that in image phantom19, the Chan-Vese algorithm fails to correctly find the boundary despite going through 20,000 iterations.


-mu-0.3-it.20000.jpg
phantom19: mu = 0.3, it = 20000



This happens when the region averages c1 and c2 are approximately equal. These are phi’s principal guidance for where to go to separate the foreground and the background, so it stands to reason that if they both have very similar values, phi will not be able to distinguish the regions.


For phantom19, c1 =  0.5408 and c2 = 0.5401. They differ by 0.0007, whereas for circles, for example, the region averages (0.2494 and 0.1845) differ by a significantly larger 0.0651.