Goal
The goal of this week is to implement the scene matching algorithm described the Scene Completion Using Millions of Photographs by Heys et al.
To do so, we go through a large dataset of images and find the similar ones. Once we have the most similar images from our dataset, we do as follows:
- We calculate the image hole using the bounding box of our mask. This hole is the crop of the image we are going to substitute.
- We extend the mask by 40 pixels. We calculate the crop of the image we want to add in our original image using the size of the bounding box of the mask below (see the red rectangle). In the figure below we see an example of a chosen patch image on the left and the extended mask on the right with the corresponding bounding box.
- We calculate the probability of each pixel of belonging to the mask or not. Wexplain this in the section “Fill the CreateGridUGMModel.m function”.
- Knowing the cost of every pixel of belonging to the mask, complete the hole of the original image using Poisson (UGM_Decode_LBP.m function).
- This are some of the results we obtain:
Select number of states. In our case 2.
Each pixel, or node, has two states: isMask = true or isMask = false
Fill the CreateGridUGMModel.m function.
Use image dimensions and number of states to create a structure containing information about the edges, that is, the cost function between each pixel p and q.
Build Unary (node) potentials.
We assign 2 different costs to each pixel. The cost of leaving the mask Cd (p, exists) and the cost of becoming part of the mask Cd (p, patch).
Pixels inside the mask
The cost of leaving the mask is infinite, and the cost of becoming part of the mask is 0, because they are already inside it.
Pixels outside the mask
The cost of leaving the mask is 0, because they are already outside the mask, and the cost of becoming part of the mask is depends on the distance to it. If they are far away from the mask, it will be infinite. It is computed using the following equation:
In our algorithm we have to use probabilities instead of costs to do the computations. So we have to translate those costs to probabilities. To do that we use the formula:
(generar graph mejor y con axis labels)
These are the resulting probabilities:
P (p, exists)
|
P (p, patch)
| |
p ϵ mask
|
0
|
1
|
p ϵ mask
|
1
|
1 / (k * dist(p, mask)) ^3 +1)
|
Build pairwise factors
First of all, we have to compute the SSD that will be used to build pairwise factors for pixels p,q. That’s the squared difference between the original image and the source image. As they are RGB images, we compute the differences of the channels separately, and then we apply L2.
If p,q are not 4-connected or if they belong to the same region (exists or patch), its edge potential will be 0. Otherwise, the potential will be the magnitude of the difference between the SSD for pixel p and the SSD for pixel q. That’s:
Potential = ∇ssd(p, q);
The following table gathers the values that the potential takes depending on the pixels’ labels:
L(p)
|
L(q)
|
Cost
|
exist
|
exist
|
0
|
exist
|
patch
|
∇ssd(p, q)
|
patch
|
exist
|
∇ssd(p, q)
|
patch
|
patch
|
0
|
Call inference algorithms to solve the MAP problem
Using the unary potentials, the pairwise potentials and the graphical model structure, we call the MAP algorithm that solves our problem using Poisson editing.