
Fun with Filters and Frequencies

Part 1.1: Finite Difference Operator

We use the finite difference operators as defined below:

D_x = np.array([[1, -1]])
D_y = np.array([[1], [-1]])

Convolving these with the cameraman image using scipy.signal.convolve2d with mode='same' and boundary='symm', we get the following images. We computed the pixel-wise gradient magnitude as np.sqrt(dx_deriv ** 2 + dy_deriv ** 2), the L2 norm of the gradient vector at each pixel, and binarized it with a threshold of 0.2.
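A minimal sketch of this pipeline (loading the cameraman image through skimage.data.camera() is our assumption for illustration):

import numpy as np
from scipy.signal import convolve2d
from skimage import data

# The classic cameraman test image, as floats in [0, 1].
img = data.camera() / 255.0

D_x = np.array([[1, -1]])
D_y = np.array([[1], [-1]])

# Partial derivatives via finite differences.
dx_deriv = convolve2d(img, D_x, mode='same', boundary='symm')
dy_deriv = convolve2d(img, D_y, mode='same', boundary='symm')

# L2 norm of the gradient vector at each pixel, then binarize.
grad_mag = np.sqrt(dx_deriv ** 2 + dy_deriv ** 2)
edges = (grad_mag > 0.2).astype(float)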

img

Partial x derivative

img

Partial y derivative

img

Gradient magnitude

img

Binarized gradient magnitude with threshold 0.2

Part 1.2: Derivative of Gaussian (DoG) Filter

We blur the image by convolving it with a 2D Gaussian filter of kernel size 7 and standard deviation 1. Then, we repeat the procedure from part 1.1 on the blurred cameraman image.
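As a sketch, reusing img and convolve2d from the snippet above (building the 2D kernel as an outer product via cv2.getGaussianKernel is our choice of implementation):

import cv2

# 2D Gaussian as the outer product of a 1D Gaussian with itself.
g1d = cv2.getGaussianKernel(7, 1)   # 7x1 column vector, sigma = 1
gauss = g1d @ g1d.T                 # 7x7 kernel that sums to 1

blurred = convolve2d(img, gauss, mode='same', boundary='symm')
# ...then apply D_x, D_y, gradient magnitude, and the threshold as in 1.1.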

img

Gradient magnitude

img

Binarized gradient magnitude with threshold 0.1

Comparing this to the results in part 1.1, we see that these images are much less noisy, and the edges appear clearer, thicker, and more rounded. The noisy edges at the bottom of the image are also gone. This is because blurring the initial image removes its high-frequency components: the Gaussian filter is a low-pass filter, so it suppresses noise and makes the edge detection more accurate.

We check that we get the same results by convolving the Gaussian with D_x and D_y first, producing “derivative of Gaussian” (DoG) filters.
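Continuing the sketches above, the check might look like this (variable names are ours):

# Differentiate the kernel instead of the image; the default mode='full'
# keeps the full support of the resulting DoG filters.
dog_x = convolve2d(gauss, D_x)
dog_y = convolve2d(gauss, D_y)

# A single convolution per direction on the unblurred image.
dx_dog = convolve2d(img, dog_x, mode='same', boundary='symm')
dy_dog = convolve2d(img, dog_y, mode='same', boundary='symm')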

img

We convolve these “derivative of Gaussian” filters with our original image (unblurred) to get the images below.

img

Gradient magnitude

img

Binarized gradient magnitude with threshold 0.1

The images look almost exactly the same as those we got by blurring the image and then applying D_x and D_y, so the two techniques have the same effect. This is expected, since convolution is associative: convolving the image with the Gaussian and then with D_x is the same as convolving the image once with the DoG filter.

Part 2.1: Image “Sharpening”

We “sharpen” an image with the unsharp masking procedure below (a code sketch follows the list):

  1. Convolve the image with a Gaussian kernel to get the low frequencies of the image
  2. Calculate the high frequencies using details = original - blurred
  3. Get the sharpened image using sharpened = original + alpha * details
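A minimal sketch of this procedure for a single-channel float image in [0, 1] (the clipping at the end is our addition; for color images this can be applied per channel):

import cv2
import numpy as np
from scipy.signal import convolve2d

def sharpen(img, ksize, sigma, alpha):
    # Low frequencies: Gaussian blur.
    g1d = cv2.getGaussianKernel(ksize, sigma)
    blurred = convolve2d(img, g1d @ g1d.T, mode='same', boundary='symm')
    # High frequencies: everything the blur removed.
    details = img - blurred
    # Add the details back, scaled by alpha.
    return np.clip(img + alpha * details, 0, 1)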

For the Taj Mahal image, we used a Gaussian kernel of size 7 with standard deviation 1, and alpha = 2.

For the dog image, we used a Gaussian kernel of size 9 with standard deviation 1.5, and alpha = 2.

img

Original Taj Mahal image

img

Sharpened Taj Mahal image

img

Original dog image

img

Sharpened dog image

We blur the sharpened dog image and then attempt to resharpen it. To blur, we used a Gaussian kernel of size 5 with standard deviation 1. To resharpen, we used a Gaussian kernel of size 7 with standard deviation 1, and alpha = 2.

img

Blurred sharpened dog image

img

Resharpened dog image

Some features in the resharpened image look sharpened compared to the original dog image, such as the cracks in the ground and the stairs in the background. However, there are many edges and details that still appear blurred, such as the dog’s fur. This is because blurring the sharpened image removes the high frequency content. When we try to sharpen the image after blurring, there is not as much high frequency content to add back to the image, so the standard sharpening process does not work properly.

Part 2.2: Hybrid Images

We create hybrid images by combining the low frequencies of one image with the high frequencies of another. The hybrid then shows the high-frequency image when the viewer is close and the low-frequency image when the viewer is farther away.

To get the low frequency image, we apply a Gaussian blur. To get the high frequency image, we apply a Gaussian blur and then calculate details = original - blurred, and we use details as the high frequencies. We then average the low and high frequency images pixel-wise to obtain the final hybrid image.
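A sketch of the whole pipeline, assuming the two images are already aligned, equal-sized, single-channel float arrays:

import cv2
from scipy.signal import convolve2d

def gaussian_blur(img, ksize, sigma):
    g1d = cv2.getGaussianKernel(ksize, sigma)
    return convolve2d(img, g1d @ g1d.T, mode='same', boundary='symm')

def hybrid(im_low, im_high, k_low, s_low, k_high, s_high):
    low = gaussian_blur(im_low, k_low, s_low)                # keep low frequencies
    high = im_high - gaussian_blur(im_high, k_high, s_high)  # keep details
    return (low + high) / 2                                  # pixel-wise average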

From left to right, the columns are: low frequency image, high frequency image, hybrid image.
img

Derek

img

Nutmeg

img

Low frequency: kernel size 41, stdev 6
High frequency: kernel size 55, stdev 7

img

Smiski researching

img

Smiski presenting

img

Low frequency: kernel size 41, stdev 6
High frequency: kernel size 15, stdev 2.5

img

Leafeon (Pokemon)

img

Sylveon (Pokemon)

img

Low frequency: kernel size 41, stdev 6
High frequency: kernel size 15, stdev 2

For the Smiskis, the eyes of the high-frequency Smiski still show up pretty clearly even when looking at the hybrid image from far away, since the eyes are very dark compared to the rest of the Smiski’s face and body. Otherwise, the hybrid images seem to work.

We perform Fourier analysis on the Leafeon/Sylveon hybrid.
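Each spectrum below is the standard log-magnitude display of the centered 2D FFT; a sketch of the visualization (the function name and epsilon are ours):

import numpy as np
import matplotlib.pyplot as plt

def show_fft(gray):
    # Center the spectrum, take magnitudes, and compress with a log;
    # the small epsilon guards against log(0).
    spectrum = np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
    plt.imshow(spectrum, cmap='gray')
    plt.axis('off')
    plt.show()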

img

Leafeon FFT

img

Sylveon FFT

img

Low frequency Leafeon FFT

img

High frequency Sylveon FFT

img

Hybrid image FFT

For a failure case, we try to make a hybrid of a paper crane and a real crane.

From left to right, the columns are: low frequency image, high frequency image, hybrid image.
img

Paper crane

img

Real crane

img

Low frequency: kernel size 41, stdev 6
High frequency: kernel size 35, stdev 5

The crane hybrid does not really work because the dark fold lines of the paper crane (the low-frequency image) are still clearly visible when viewing the hybrid up close. These lines are high-frequency components that the Gaussian filter does not blur out enough.

Bells and Whistles: Color

We try using color on the Leafeon/Sylveon hybrid to see if color will enhance the hybrid effect.

img

Both grayscale

img

Leafeon color, Sylveon grayscale

img

Leafeon grayscale, Sylveon color

img

Both color

Color does not seem to enhance the effect very much. In particular, using color for the high-frequency image makes little difference, since subtracting the blurred image from the original already removes most of the image’s color. Using color for the low-frequency image does not make the hybrid effect noticeably better than grayscale.

Part 2.3: Gaussian and Laplacian Stacks

We create Gaussian and Laplacian stacks for both the apple and orange images. At every level of the Gaussian stack, we use a Gaussian kernel to blur the previous level to get the current level’s output, which maintains the image’s size across all levels of the stack. Each level of the Laplacian stack except for the last level is calculated from the Gaussian stack using l_stack[i] = g_stack[i] - g_stack[i+1]. For the last level of the Laplacian stack, we directly use the result from the last level of the Gaussian stack. This means that both stacks end up with the same number of images.
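A sketch of both stacks, reusing the gaussian_blur helper from part 2.2 (applying the same kernel at every level is how we read the description above; note that summing the Laplacian stack telescopes back to the original image):

def gaussian_stack(img, levels, ksize, sigma):
    # Blur the previous level; no downsampling, so sizes match across levels.
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_blur(stack[-1], ksize, sigma))
    return stack

def laplacian_stack(g_stack):
    # Band-pass levels, plus the final Gaussian level so both stacks
    # have the same number of images.
    l_stack = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    l_stack.append(g_stack[-1])
    return l_stack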

Here are levels 0, 2, 4, 6, and 7 of our Laplacian stacks, where we use a total of 8 layers (so layer 7 is the last). These levels are shown from top to bottom. From left to right, the columns are: apple, orange, masked apple, masked orange, combined masked apple + masked orange.

img

Each Laplacian stack image shown here is normalized (over the entire image, not by channel). However, when doing the multiresolution blending, we use the un-normalized versions of the Laplacian stack outputs.

Part 2.4: Multiresolution Blending

To blend two images A and B together, we generate Laplacian stacks A_lstack and B_lstack for the two images, plus a Gaussian stack mask_gstack for the mask. To combine the images with a smooth blend, we compute (1 - mask_gstack[i]) * A_lstack[i] + mask_gstack[i] * B_lstack[i] for each level i and sum all of these contributions. The factor 1 - mask_gstack[i] inverts the mask, so A contributes in the region the original mask excludes and B contributes in the region it includes. The final output has a smooth overall blend because each band of frequencies is blended separately through the Laplacian stack. We normalize the result for the final output.
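A sketch using the stack helpers above (parameter names are ours):

def blend(A, B, mask, levels, ksize, sigma, mask_ksize, mask_sigma):
    A_lstack = laplacian_stack(gaussian_stack(A, levels, ksize, sigma))
    B_lstack = laplacian_stack(gaussian_stack(B, levels, ksize, sigma))
    mask_gstack = gaussian_stack(mask, levels, mask_ksize, mask_sigma)

    # Blend each frequency band separately, then sum the contributions.
    out = sum((1 - m) * a + m * b
              for a, b, m in zip(A_lstack, B_lstack, mask_gstack))

    # Normalize to [0, 1] for the final output.
    return (out - out.min()) / (out.max() - out.min())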

All of these blends are done with 8 stack layers. The parameters used for generating the Gaussian stack (and therefore the Laplacian stack) are stated in the caption under each image.

In the images below, we added a red border around the masks to better show where the white part of the mask is.

From left to right, the columns are: image 1, image 2, mask, blended image.
img

Apple
Gaussian stack kernel size 7
stdev 2

img

Orange
Gaussian stack kernel size 7
stdev 2

img

Vertical mask
Gaussian stack kernel size 31
stdev 15

img

Blend
Apple
Orange

img

New York City
Gaussian stack kernel size 7
stdev 2

img

Flower field
Gaussian stack kernel size 7
stdev 2

img

Horizontal mask
Gaussian stack kernel size 7
stdev 2

img

Blend
New York City
Flower field

Here are the Laplacian stack images for the city/flower field blend. These are levels 0, 2, 4, 6, and 7 of the Laplacian stacks, where we use a total of 8 layers (so layer 7 is the last). From left to right, the columns are: city, flower field, masked city, masked flower field, combined masked city + masked flower field.

img

For our irregular mask, we blended a rubber duck’s head onto a real duck.

img

Original real duck

img

Original rubber duck

img

Aligned real duck
Gaussian stack kernel size 7
stdev 2

img

Aligned rubber duck
Gaussian stack kernel size 7
stdev 2

img

Irregular mask
Gaussian stack kernel size 31
stdev 9

img

Blend
Real duck
Rubber duck

Reflection

The most important thing I learned during this project was how the different frequency components of an image shape our perception of it. It was fun playing around with frequencies to blend images together, and I especially liked creating the hybrid images, which showed me how we see high frequencies up close and low frequencies from far away.