Random Notes on Different Image Processing Algorithms

I decided to jot down some notes made from some algorithms that I previously worked on. These are mostly high level descriptions providing some intuition on how things work.

Exposure Fusion:

  • Is a method of producing a high quality LDR (low dynamic range) image from a series of LDR images taken at different exposures (bracketed exposure)
  • It does not require a conversion step to HDR, e.g. like tone mapping that is needed for a display device to show the HDR image properly
  • Global operators for tone mapping – Apply some scaling to the entire image to convert to LDR
  • Local operators – Do different things in different regions of the image to preserve detail
  • Exposure Fusion uses 3 quality measures — namely ‘contrast’, ‘saturation’ and ‘well-exposedness’
    • For contrast – Apply Laplacian Filter
    • For saturation – Compute the standard deviation within R,G,B channels
    • For well-exposedness – We want to keep pixels which are neither over-exposed (close to 1) nor under-exposed (close to 0) , so we want to keep values close to 0.5 using a Gaussian curve
  • We combine the above info into a weight map using multiplication
  • If we just apply the weight map to each of the images in the bracketed exposure there will very sharp transitions and undesirable halos etc.
  • To counter that, we blend images at multiple resolutions using a pyramid method
    • First create a Laplacian pyramid of the images
    • Then create a Gaussian pyramid of the weight maps
    • Combine at each level
    • Finally you end up with one pyramid – Now repeat the sequence of ‘Up sample, blur, combine’ until you get the final image

Panorama Stitching with Ransac Homography:

  • Use SURF/SIFT feature detector to get the keypoints for both the images
  • Use Flann Matcher to get the matches
  • Find the min and max distance between keypoints
  • Take matches whose distance is less than some threshold (3 * min distance)
  •  Find homography using these set of matched points
    • A matrix H which transforms vector a to vector b (aH = b, H = b * inverse(a))
    • vector a & b contain the 4 points respectively
    • use singular value decomposition to solve for this matrix
    • Use this matrix to find out how many good matched points are transformed successfully, check ‘inliers’ that are below a threshold value in terms of distance
    • Find the matrix that produces the best set of inliers, that is the homography matrix to use
    • warp the image using this matrix

An eigenvector v of a linear transformation T is a non-zero vector that, when T is applied to it, does not change direction

Sobel Filter:

  • Edge detection filter that accounts for gradients in both the x and y directions.
  • 2 filters are used, namely:


Laplace Filter:


Laplacian of Gaussian:

Laplace filter is extremely sensitive to noise, so smoothening it with a Gaussian filter first is good.

Reinhard Tonemapping Operator:

  • Summation of log luminance and average
  • operator = L ( 1 + L) — small luminance is scaled by 1 and large ones are penalized.
  • Has 2 operators – global and local

Bilateral Filter:

  • Using a simple Gaussian filter causes halos due to large intensity differences.
  • To counter this, we applied a penalty on the intensity difference.
  • So we start with a spatial Guassian, then a penalty Gaussian on the intensity difference.
  • This is done by creating patches of the image around a certain pixel.
  • The spatial Gaussian is taken of the patch, and the penalty Gaussian is calculated using the intensity difference between the rest of the patch and the current pixel.

Set of operations:

  • Input is the RGB values of radiance, possibly in radiance format
  • Compute the intensity (I) at each pixel. E.g. use a weighted average: 1/61*(R*20 + G*40 + B)
  • Compute the chrominance channels: R/I, G/I, B/I
  • Compute the log of the intensity: L = log2(I)
  • Apply bilateral filter: B = bf(L)
  • Compute the detail layer: D = L-B  – this should contain the edge information
  • Apply an offset and a scale to the base: B’ = (B-o)*s
  • Reconstruct the log intensity: O = exp(B’+D)
  • Put the colors back: R’,G’,B’ = O*(R/I, G/I, B/I)
  • Apply a gamma compression

Scale Invariant Feature Transform (SIFT):

  • Construct a scale space
    • each octave is for one size and then many levels of blur within each octave
  • LoG approximation
    • within octave subtract pairs to get LoG
  • Find keypoints
    • find maxima/minima
    • find subpixel maxima/minima
    • check if point is min or max among the neighbors around and on top and bottom layer in scale space
  • Filter keypoints
  • Keypoint orientation
    • flat both gradients are small, edge – one is big other is small, corner – both are big
    • gradient magnitudes and orientation – using a histogram – 360 deg into 36 bins
  • Feature generation
    • 16×16 window around a keypoint, 4*4 squares within this – calc grad and ori magnitudes within each 4×4 and add to a histogram

Flann Matching:
Fast library for Nearest neighbor search – k nearest neighbors



Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s