Disclaimer: The notes below are fully/partially NOT created by myself. They are from slides and/or wikipedia and/or textbook. The purpose of this post is simply to learn and review for the course. If you think something is inappropriate, please contact me at “ryan_yrs [at] hotmail [dot] com” immediately and I will remove it as soon as possible.
RGB
 BGR: Order of RGB in OpenCV
 0: Darkest
 255: Lightest
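Since OpenCV stores channels in BGR order, converting to RGB is just a channel reversal. A minimal NumPy sketch (the array values are made up for illustration):

```python
import numpy as np

# A 1x2 "image" with one pure-red and one pure-blue pixel, stored
# in OpenCV's BGR channel order (blue first, red last).
img_bgr = np.array([[[0, 0, 255],    # red pixel in BGR
                     [255, 0, 0]]],  # blue pixel in BGR
                   dtype=np.uint8)

# Reversing the last axis converts BGR -> RGB (what cv2.cvtColor
# with cv2.COLOR_BGR2RGB does for 3-channel images).
img_rgb = img_bgr[..., ::-1]
```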
LSD-SLAM
 Large-Scale Direct Monocular SLAM
Camera Obscura 暗箱
Pinhole Camera 針孔相機
Thin-Lens Law
 1/u + 1/d = 1/f
 u: Distance from object to lens
 d: Distance from lens to image plane (sensor)
 f: Focal length (焦距) of lens
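The thin-lens law can be solved for the image distance d given u and f. A small sketch (function name and example numbers are my own):

```python
def image_distance(u, f):
    """Solve 1/u + 1/d = 1/f for the image distance d.

    u: object-to-lens distance, f: focal length (same units).
    Assumes u > f, so a real image forms."""
    return 1.0 / (1.0 / f - 1.0 / u)

# Object 0.5 m from a 50 mm lens: image forms ~55.6 mm behind the lens.
d = image_distance(0.5, 0.05)
```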
Blur
 Defocus Blur: Out-of-Focus Blur
 Motion Blur
Perfect Focus
 No blur
Blur Circle (σ)
 Radius of scene point’s image on sensor plane
Focus Setting
 Sensor-to-Aperture Distance.
 v = df/(d - f)
DOF (Depth of Field) 景深
 Range of object distances where σ < 1 sensor pixel.
 The range in front of and behind the camera's focus point within which the image appears sharp.
Aperture Diameter (D)
 D usually expressed as f/value.
 e.g. A Canon 50mm (focal length) f/1.2 (max aperture) lens has max D = 50/1.2 = 41.6mm.
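The D = f / (f-number) relation above can be checked directly (function name is illustrative):

```python
def aperture_diameter(focal_length_mm, f_number):
    # D = f / N, e.g. the note's Canon 50mm f/1.2: 50 / 1.2 ≈ 41.7 mm
    return focal_length_mm / f_number

D = aperture_diameter(50, 1.2)
```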
ISO Setting (Sensor Sensitivity)
 Determines how bright image will appear for a given # collected photons.
CMOS (Complementary Metal–Oxide–Semiconductor 互補式金屬氧化物半導體)
CMOS Image Sensor
 Front-Illuminated Structure
 Back-Illuminated Structure
 Aka Back-Side Illuminated (BSI) CMOS sensor.
Demosaicing 去馬賽克
 Missing RGB values are filled in computationally through an interpolation process.
RAW Image
Developed Image
Intensity Mapping
Photon 光子
 Photons arriving at a pixel generate photoelectrons.
 Photoelectrons pass through an amplifier that converts accumulated charge to a measurable voltage.
 Voltage is converted to a digital number with an A-to-D (Analog-to-Digital) converter.
 This number may be further transformed before being output as a pixel intensity (or color) value.
Photoelectrons Collected at a Pixel
 Radiant power from scene Φ (Flux)
 PhotoElectrons per second
 Exposure time t
 Sensor Saturation
 Gain factor 1/g
 Quantized digital value (DN)
 DN: Digital Number, which has a linear relationship to photoelectrons.
HVS (Human Visual System)
 The HVS does not have a linear response.
 So linear images don't look good to humans.
 DNs are passed through a gamma function f(DN) = β(DN)^(1/γ) to approximately compensate for the HVS.
Gamma Correction
 A nonlinear operation (or its inverse) applied to the luminance or tristimulus values in a video or imaging system.
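A minimal sketch of the gamma function f(DN) = β(DN)^(1/γ) from above, assuming DN is normalized to [0, 1]; the β and γ defaults are illustrative, not values from the notes:

```python
import numpy as np

def gamma_encode(dn, beta=1.0, gamma=2.2):
    """f(DN) = beta * DN**(1/gamma).

    gamma = 2.2 is a common display value; beta = 1 keeps the
    output in [0, 1] when the input is in [0, 1]."""
    return beta * np.power(dn, 1.0 / gamma)

# Mid-gray (0.5 linear) maps to ~0.73: dark tones get more code values,
# roughly matching the HVS's nonlinear response.
encoded = gamma_encode(np.linspace(0.0, 1.0, 5))
```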
Noise
 Dark current noise D
 Free electrons due to thermal energy
 Depends on temperature, measured in electrons/sec independent of Φt
 Poisson ≈ Normal
 mean = Dt (D: Rate of creation of thermal electrons)
 sd = sqrt(Dt)
 Photon noise
 Photon arrivals are random
 Depends on total photon arrivals Φt
 Poisson ≈ Normal
 mean = Φt
 sd = sqrt(Φt)
 Prob(# Received photons = k)
 = ((Φt)^k / k!) * e^(-Φt)
 Readout noise
 Noise from readout electronics
 Independent of Φt
 Normal
 mean = 0
 sd = σ_r
 Prob(Read Noise = x)
 = (1 / (σ_r * (2π)^(1/2))) * e^(-x^2/(2σ_r^2))
 ADC noise
 Amplifier AtoD, quantization noise
 Independent of Φt
 Normal
 mean = 0
 sd = σ_ADC
 Pixel noise
 Depends on exposure level (# photons reaching sensor)
Mean & Var
 mean(e^) = min(I_0 + Φt + Dt, I_m)
 var(e^) = Φt + Dt + I_0 + σ_r^2 + σ_ADC^2 * g^2
 mean(DN) = min((I_0 + Φt + Dt)/g, I_m/g)
 var(DN)
 = Φt/g^2 // Photon
 + Dt/g^2 // Dark current
 + (I_0 + σ_r^2)/g^2 // Additive
 + σ_ADC^2 // ADC
SNR (Signal to Noise Ratio)
 SNR = 10 log_10 (mean(DN)^2 / var(DN))
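The noise model and SNR formula above can be sanity-checked by simulation. A sketch with made-up parameter values, ignoring quantization and saturation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (not from the notes): photon count Φt, dark
# electrons Dt, read noise σ_r, gain g.
phi_t, D_t, sigma_r, g = 1000.0, 10.0, 5.0, 1.0
N = 200_000  # number of simulated pixel measurements

photons = rng.poisson(phi_t, N)        # photon noise: Poisson(Φt)
dark = rng.poisson(D_t, N)             # dark current noise: Poisson(Dt)
read = rng.normal(0.0, sigma_r, N)     # readout noise: N(0, σ_r^2)
DN = (photons + dark + read) / g       # no quantization/saturation here

# Empirical moments should match mean(DN) ≈ (Φt + Dt)/g and
# var(DN) ≈ (Φt + Dt + σ_r^2)/g^2 for this simplified model.
snr_db = 10 * np.log10(DN.mean() ** 2 / DN.var())
```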
Rolling Shutter 捲簾快門
 Different rows of pixels are exposed at different times.
Global Shutter 全局快門
 All pixels are exposed simultaneously.
Alpha (α)
 Pixels are assumed to have 4 rather than 3 components.
 RGBα
 α: Alpha component
 Assumed to have fractional values between 0 & 1.
 When represented as 1 byte, carries values α = 0/255, 1/255, …, 254/255, 255/255
 Defines pixel “Transparency”
 [R G B α] def== [αR αG αB]
Blue-Screen Matting
 Matting images whose background color is blue.
Matting Problem: Mathematical Definition
 C_k = [R_k G_k B_k]
 C = [R G B]
 C_0 = [R_0 G_0 B_0] def== [R_0/α_0 G_0/α_0 B_0/α_0 α_0]
Matting Equation
 C = C_0 + (1 - α_0) * C_k
 More generally
 C = C_0 + C_k - α_0 * C_k
 Define C_Δ = C - C_k
 Then C_Δ = C_0 - α_0 * C_k
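A quick numeric check of the matting equation (all pixel values invented for illustration):

```python
import numpy as np

# Matting equation C = C_0 + (1 - alpha_0) * C_k, with a premultiplied
# foreground C_0 and a pure-blue background C_k.
C0 = np.array([0.4, 0.1, 0.05])   # premultiplied foreground RGB
alpha0 = 0.5                      # foreground coverage
Ck = np.array([0.0, 0.0, 1.0])    # known background color

C = C0 + (1.0 - alpha0) * Ck      # observed composite color
C_delta = C - Ck                  # should equal C_0 - alpha_0 * C_k
```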
In Matrix Form
 [R_Δ G_Δ B_Δ]^T = [[1 0 0 -R_k], [0 1 0 -G_k], [0 0 1 -B_k]] * [R_0 G_0 B_0 α_0]^T
 The 3*4 matrix is NOT invertible
 3 equations < 4 unknowns, thus infinitely many solutions.
Matting Solutions
 α = 0 (C = C_k), α = 1 (Otherwise)
 No blue (B_0 = 0)
 Gray (R_0 = G_0 = B_0) or Flesh (C_0 = [d 0.5d 0.5d])
 Triangulation Matting
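Triangulation matting can be sketched as a small least-squares problem: photographing the same foreground over two known backgrounds gives 6 equations in the 4 unknowns [R_0, G_0, B_0, α_0]. All values below are synthetic:

```python
import numpy as np

Ck1 = np.array([0.0, 0.0, 1.0])   # blue background
Ck2 = np.array([0.0, 1.0, 0.0])   # green background

# Ground truth used only to synthesize the two observed composites.
C0_true, a_true = np.array([0.4, 0.1, 0.05]), 0.5
C1 = C0_true + (1 - a_true) * Ck1
C2 = C0_true + (1 - a_true) * Ck2

# C_delta = C - C_k = [I | -C_k] [C_0; alpha_0]; stack both backgrounds
# into one 6x4 system and solve by least squares.
A = np.vstack([np.hstack([np.eye(3), -Ck1[:, None]]),
               np.hstack([np.eye(3), -Ck2[:, None]])])
b = np.concatenate([C1 - Ck1, C2 - Ck2])
x, *_ = np.linalg.lstsq(A, b, rcond=None)  # x = [R_0, G_0, B_0, alpha_0]
```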
HDR (High-Dynamic-Range) Photography
 8-bit values at the corresponding pixel in the captured photos are converted to a single HDR floating-point value.
 Most common approach: Change exposure time.
Relation btw exposure time & intensity
 Suppose we take 2 photos A & B with exposure times Δt & Δt/2. The intensity at pixel (x,y) will satisfy:
 A(x,y) = 2 B(x,y) (T if there is no noise)
 A(x,y) = B(x,y) (F)
 A(x,y) = B(x,y) / 2 (F)
 None of above unless image is RAW
From many 8-bit values to 1 floating-point intensity
 To do this, pixel intensities must have a linear relation to the rate of incident photons.
 If photos are not already RAW, we need to know the INVERSE camera response function.
Basic Merging Procedure
 for each pixel (x,y):
     for each photo j:
         if (x,y) is not saturated in j:
             convert pixel intensity z_j(x,y) to # photons E_j(x,y)
             // Requires knowledge of camera response function
     merge E_1(x,y), E_2(x,y), … into 1 floating-point value
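The merging procedure above can be sketched as follows (function and variable names are my own; the inverse response is the identity for RAW/linear photos):

```python
import numpy as np

def merge_hdr(photos, exposure_times, inv_response, sat=255):
    """Minimal HDR merge sketch.

    photos: list of uint8 arrays; inv_response maps an 8-bit intensity z
    to a linear exposure E*dt. Averages the per-photo photon-rate
    estimates E_j = inv_response(z_j) / dt_j over non-saturated photos."""
    H = np.zeros(photos[0].shape, dtype=np.float64)
    W = np.zeros(photos[0].shape, dtype=np.float64)
    for z, dt in zip(photos, exposure_times):
        ok = z < sat                       # skip saturated pixels
        H[ok] += inv_response(z[ok]) / dt  # estimated photon rate
        W[ok] += 1
    return H / np.maximum(W, 1)

# RAW-like (linear) synthetic photos: same scene at dt = 2 and dt = 1.
scene = np.array([[10.0, 100.0]])
a = np.clip(scene * 2.0, 0, 255).astype(np.uint8)
b = np.clip(scene * 1.0, 0, 255).astype(np.uint8)
hdr = merge_hdr([a, b], [2.0, 1.0], lambda z: z.astype(np.float64))
```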
Camera Response Function
 General Procedure
 Collect photos for many exposure intervals Δt_1, Δt_2, … w/o moving camera.
 Process photos to compute f
 Problem:
 For a given photo, we will know Δt_j and z but we have no way of measuring E.
 Ideas
 Log-Inverse Response Function g()
 Invert & use logs
 Z = f(E Δt)
 f^(-1)(Z) = E Δt
 log(f^(-1)(Z)) = log(E) + log(Δt)
 Compute Discrete Values of g(), not f()
 256 unknown quantities we must estimate to fully determine g().
 g(z_ij) = log E_i + log Δt_j
 i: ith pixel
 j: jth image
 Δt_j: Exposure interval of jth image
 Each pixel in each photo contributes 1 equation.
Log-Inverse Response Function g()
 g(Z) = log(E) + log(Δt)
Inverse Camera Response Function
Smoothness Constraints (Regularization)
 Idea: Camera response function changes smoothly in real cameras.
 We add more equations to enforce smoothness.
 Intuition: Force nearconstant rate of change.
 g_100 - g_99 ≃ g_101 - g_100
 <=> 2g_100 - g_99 - g_101 ≃ 0
 Add these equations to system:
 l = 1, …, 254
 2g_l - g_(l+1) - g_(l-1) ≃ 0
Laplacian
Local Image Patches
 Local level (Small groups)
 Global level (Entire image)
Image Patch
 Corner
 Edge
 Uniform
 Single Surface
 Feature
Surface Patch
z = I(x,y)
Sliding Window
 1. Define a pixel window centered at pixel (w,r)
 2. Fit an n-degree polynomial to the window's intensities (usually n = 1 or 2)
 3. Assign the polynomial's derivatives at x = 0 to the pixel at the window's center
 4. Slide the window one pixel over, so that it is centered at pixel (w+1,r)
 5. Repeat 1-4 until the window reaches the right image border
LeastSquares Fit
 Solving linear system in terms of d minimizes the “fit error”
 ‖I - Xd‖^2
 Solution d is called a leastsquares fit
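A minimal least-squares fit sketch for one window (the window size and polynomial values are illustrative):

```python
import numpy as np

# Fit a degree-2 polynomial to a 1D window of intensities and read off
# derivatives at the window center (x = 0).
x = np.arange(-3, 4)                       # 7-pixel window centered at 0
I = 2.0 + 3.0 * x + 0.5 * x**2             # noiseless quadratic intensities

X = np.vander(x, 3, increasing=True)       # columns: 1, x, x^2
d, *_ = np.linalg.lstsq(X, I, rcond=None)  # minimizes ||I - X d||^2

# d = [a0, a1, a2]; at x = 0: intensity a0, dI/dx = a1, d2I/dx2 = 2*a2
```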
Weight Function
 Ω(x)
Estimating Derivatives of 1D intensity patches:
 LeastSquares Fitting
 Weighted leastsquares fitting
 Robust polynomial fitting: RANSAC
Image Inpainting Algorithm
 Source Φ
 Target Ω
 P(p) = C(p) D(p)
 Confidence C(p)
 Measure of amount of reliable info surrounding pixel p
 Approximates desired fill order
 Data D(p)
 Strength of isophote hitting fill front at each iteration (strong edges)
 Encourages linear structures to be synthesized first
RANSAC Algorithm
 Random Sample Consensus
 n = degree of polynomial
 P = fraction of inliers
 t = fit threshold
 P_s = success probability
 Prob to choose an inlier pixel
 P
 Prob to choose (n+1) inlier pixels
 P^(n+1)
 Prob at least 1 outlier chosen
 1 - P^(n+1)
 Prob at least 1 outlier chosen in every one of K trials
 (1 - P^(n+1))^K
 Failure Prob
 (1 - P^(n+1))^K
 Success Prob
 P_s = 1 - (1 - P^(n+1))^K
 K
 ≥ (log(1 - P_s)) / (log(1 - P^(n+1)))
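The trial-count formula K = log(1 - P_s) / log(1 - P^(n+1)) in code (function name is my own):

```python
import math

def ransac_trials(P, n, P_s):
    """Smallest number of trials K so that the probability of at least
    one all-inlier sample reaches P_s.

    P: inlier fraction, n: polynomial degree (n+1 samples per trial)."""
    return math.ceil(math.log(1 - P_s) / math.log(1 - P ** (n + 1)))

# Fitting a line (n = 1) with 50% inliers and 99% success probability.
K = ransac_trials(P=0.5, n=1, P_s=0.99)
```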
Isophote
 A curve of pixels that have the same intensity
Parameterized 2D Curve
 A continuous mapping
 γ: (a,b) → R^2
 a: Value of param at beginning
 b: Value of param at end
 t → (x(t), y(t))
 t: Curve param (indicates position along curve)
 (x(t), y(t)): Point along curve at position t
Smooth Curve
 When all derivatives of the coordinate functions exist
 (d^n x / dt^n)(t), (d^n y / dt^n)(t)
Path
A sequence of adjacent pixels in image
Link
A pair of adjacent pixels along path
Intelligent Scissor
 Given a link defined by pixels p & q, its weight is defined to be
 l(p,q) = 0.43f_z(q) + 0.43f_D(p,q) + 0.14f_G(q)
 f_z(q) = 0 if the Laplacian has a 0crossing at q
 f_z(q) = 1 otherwise
 f_D(p,q): term that penalizes links not consistent with gradient direction at p and q
 f_G(q) = 1 - ‖∇I(q)‖ / max(‖∇I‖)
 max(): Largest gradient magnitude over image
Path Optimizer
 Input
 Seed: Start pixel
 Graph: Edgeweighted graph corresponding to local cost function l(q,r)
 Output
 A tree on graph, with each node pointing to its successor along min cost path from that node to root seed.
 Note
 Each node experiences 3 states in seq:
 INITIAL
 ACTIVE
 EXPANDED
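The path optimizer is essentially Dijkstra's algorithm on the link-weighted pixel graph. A sketch on a toy 3-node graph (node names and costs invented); nodes move INITIAL -> ACTIVE -> EXPANDED as in the notes:

```python
import heapq

def min_cost_tree(graph, seed):
    """Return, for each node, its successor on the minimum-cost path back
    to the seed, plus the path costs. graph[u] is a list of (v, link_cost)
    pairs. Unvisited nodes are INITIAL, heap entries are ACTIVE, and
    popped nodes are EXPANDED."""
    cost = {seed: 0.0}
    succ = {seed: None}
    heap = [(0.0, seed)]              # ACTIVE nodes
    expanded = set()
    while heap:
        c, u = heapq.heappop(heap)
        if u in expanded:             # stale entry; already EXPANDED
            continue
        expanded.add(u)
        for v, w in graph.get(u, []):
            if v not in expanded and c + w < cost.get(v, float("inf")):
                cost[v] = c + w
                succ[v] = u           # v points toward the seed through u
                heapq.heappush(heap, (cost[v], v))
    return succ, cost

# Toy graph: the cheap path A -> B -> C beats the direct A -> C link.
g = {"A": [("B", 1.0), ("C", 5.0)], "B": [("C", 1.0)], "C": []}
succ, cost = min_cost_tree(g, "A")
```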
Eigenvector & Eigenvalue
 Vector v != [0, …, 0] is an eigenvector of matrix H if
 Hv = λv
 Scalar λ is called eigenvalue of v
 Properties
 If v is an eigenvector and k != 0 is a constant, then kv is also an eigenvector => since v != [0, …, 0], we only need to consider UNIT eigenvectors (‖v‖ = 1)
 If v is an eigenvector and H is symmetric => v^T H = λ v^T
 If v_1, v_2 are eigenvectors with λ_1 != λ_2 and H is symmetric => v_1^T v_2 = 0 (v_1 and v_2 are orthogonal)
 Sum of eigenvalues of a matrix H is equal to its trace.
 For a 2*2 matrix [[a, b], [c, d]] => a+d = λ_1 + λ_2
 tr(H) = sum of diagonal elements of H.
 PRODUCT of eigenvalues of a matrix H is equal to its DETERMINANT.
 For a 2*2 matrix [[a, b], [c, d]] => ad - bc = λ_1 λ_2
 det(H) = determinant of matrix H
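The trace and determinant properties can be verified numerically (the example matrix is my own):

```python
import numpy as np

# Symmetric 2x2 matrix with eigenvalues 1 and 3.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eigh(H)   # eigh: for symmetric matrices

trace_ok = np.isclose(vals.sum(), np.trace(H))        # sum = a + d
det_ok = np.isclose(vals.prod(), np.linalg.det(H))    # product = ad - bc
orth_ok = np.isclose(vecs[:, 0] @ vecs[:, 1], 0.0)    # distinct λ -> orthogonal
```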
Lowe Feature Detector
 Goal: Find pixels that are very diff from their surroundings.
 Idea: Apply Hessian eigenvalue analysis to Laplacian of an img
Harris/Forstner Corner Detector
 Goal: Find pixels that are very diff from their neighborhood.