Computer Vision - CISC 489/689
ASSIGNMENT INFORMATION
- ASSIGNMENT 1
- Two useful code snippets (Harris.m and imTrans.m) can be obtained from Peter Kovesi's website.
- In the Matlab scripts file, there are four driver programs.
- The driver programs (MainDriver_*.m) are scripts that call the individual functions, which may or may not have to be implemented, depending on the extra credit.
- This should give you an idea of how to format your code and which functions have to be called.
- The images were created using the POV-ray software by modifying the lookat direction. I used orthographic.pov and sunsethf.pov.
- DATA files - The data is organized as follows. Unzipping data.zip produces three directories.
- Orthographic
- Images obtained under orthographic projection. There are 4 images in the folder: orthographic[1|2|3|4].png (orthographic1.png has a rotational transformation w.r.t. orthographic2.png, and orthographic4.png has an affine transformation w.r.t. orthographic3.png).
- Using ginput, compute the affine transformation between the image pairs orthographic1.png & orthographic2.png and orthographic3.png & orthographic4.png.
- Use the estimated parameters to "unwarp" the affine deformation from the images.
- Use the "rectified" image to make the mosaic
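The affine estimate from clicked points can be sketched as below. This is only a sketch under stated assumptions, not the required implementation: the points are hard-coded stand-ins for what ginput would return, A_true is a made-up transform used to synthesize the second point set, and the names src, dst, X and A are hypothetical.

```matlab
% Corresponding points, one row per point: [x y].
% In practice these would come from ginput; fixed values are used here
% so the sketch runs without user interaction.
src = [10 10; 200 30; 50 180; 220 210];

% Made-up affine map, used only to synthesize dst for this sketch.
A_true = [0.9 -0.2 15; 0.3 1.1 -8];
dst = [src ones(size(src, 1), 1)] * A_true.';

% Solve [x y 1] * A.' = [x' y'] in the least-squares sense.
% Three non-collinear pairs fix the 6 parameters; extra pairs
% average out clicking error.
X = [src ones(size(src, 1), 1)];
A = (X \ dst).';          % 2x3 matrix [a11 a12 tx; a21 a22 ty]

disp(A);
```

The 3x3 matrix [A; 0 0 1], or its inverse depending on which image you take as the reference, is the kind of transform a warping routine expects. Check the help text of the routine you use for its convention.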
- Weakly Perspective
- Images having small perspective distortion. There are three sets of images: mh[1|2|3].png, sunsethf[1|2|3].png and ud[1|2|3|4].png (mh1.png and mh3.png have a rotational transformation w.r.t. mh2.png; the other two sets have only translational components).
- Using ginput, compute the rotational transformation between the image pairs mh1.png & mh2.png and mh2.png & mh3.png.
- Use the estimated theta(s) to "unwarp" the images.
- Use the "rectified" image to make the mosaic
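One possible way to get theta from clicked point pairs is a least-squares fit of a rotation plus translation, sketched below. The point values, theta_true and the shift are made up so that the script runs without ginput, and the variable names are hypothetical.

```matlab
% Corresponding points, one row per point: [x y] (stand-ins for ginput output).
p = [20 30; 180 40; 60 200; 210 170];

% Synthesize the second point set with a made-up rotation and shift.
theta_true = 10 * pi / 180;
R_true = [cos(theta_true) -sin(theta_true); sin(theta_true) cos(theta_true)];
q = p * R_true.' + repmat([5 -12], size(p, 1), 1);

% Remove the translation by centering both point sets.
pc = p - repmat(mean(p, 1), size(p, 1), 1);
qc = q - repmat(mean(q, 1), size(q, 1), 1);

% Least-squares rotation angle between the centered sets.
num = sum(pc(:, 1) .* qc(:, 2) - pc(:, 2) .* qc(:, 1));  % sum of cross products
den = sum(pc(:, 1) .* qc(:, 1) + pc(:, 2) .* qc(:, 2));  % sum of dot products
theta = atan2(num, den);

% Translation that maps the rotated p onto q.
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
t = mean(q, 1) - mean(p, 1) * R.';

fprintf('theta = %.4f rad\n', theta);
```

With real clicks the fit will not be exact, so it helps to pick more than two point pairs and to spread them over the image.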
- Perspective
- This might be useful for homography estimation (estimation of the projective transformation). The images in this folder are balcony[1|2|3].png (there is significant perspectivity evident in these images).
- To mosaic the images, the 8 parameters of the projective transformation have to be solved for.
- This can be done along similar lines, but you need at least 4 point correspondences from ginput.
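The 8-parameter solve can be sketched as a linear system with two equations per point pair. This is one common formulation (fixing the last entry of the 3x3 matrix to 1), not necessarily the one expected in the assignment. The points and H_true are made up, and the names M, b, h and H are hypothetical.

```matlab
% Four (or more) corresponding points, one row per point: [x y].
src = [0 0; 100 0; 100 100; 0 100];

% Made-up homography, used only to synthesize dst for this sketch.
H_true = [1.2 0.1 10; -0.05 0.9 20; 0.001 0.0005 1];
w = [src ones(size(src, 1), 1)] * H_true.';
dst = w(:, 1:2) ./ [w(:, 3) w(:, 3)];

% Fix h33 = 1, which leaves 8 unknowns; each pair gives 2 linear equations.
n = size(src, 1);
M = zeros(2 * n, 8);
b = zeros(2 * n, 1);
for i = 1:n
    x = src(i, 1); y = src(i, 2);
    u = dst(i, 1); v = dst(i, 2);
    M(2*i - 1, :) = [x y 1 0 0 0 -u*x -u*y];
    M(2*i, :)     = [0 0 0 x y 1 -v*x -v*y];
    b(2*i - 1) = u;
    b(2*i)     = v;
end

% Exact solve for 4 pairs, least squares for more.
h = M \ b;
H = [h(1) h(2) h(3); h(4) h(5) h(6); h(7) h(8) 1];
```

No three of the chosen points should be collinear, otherwise the system becomes singular.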
- USEFUL ASSIGNMENT TIPS
- Page 222 of the paper describes the conditions for constructing the image pyramid, i.e. N, M_r and M_c.
- Select N such that the image size can be decomposed as M_r*2^N and M_c*2^N, where M_r and M_c are integers.
- Thus the number of pyramid levels depends on the image size. For example, if the image is 512x384 (columns x rows), then M_r = 3, M_c = 4 and N = 7, since 384 = 3*2^7 and 512 = 4*2^7.
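A small sketch of that decomposition, assuming you want the largest N that divides both dimensions. The variable sz is a hypothetical stand-in for size(I) of your image.

```matlab
% Image size as [rows cols]; 384 rows by 512 columns matches the example above.
sz = [384 512];

% Largest N such that both dimensions are divisible by 2^N.
N = 0;
while all(mod(sz, 2^(N + 1)) == 0)
    N = N + 1;
end
M_r = sz(1) / 2^N;
M_c = sz(2) / 2^N;

fprintf('N = %d, M_r = %d, M_c = %d\n', N, M_r, M_c);
```

For the 512x384 image this prints N = 7, M_r = 3, M_c = 4.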
- Apple.jpg and Orange.jpg are available for testing the vertical bitmask functions (Thanks, Michael Haggerty!).
- They have also been added to data.zip in case you want to download the entire dataset.
- USEFUL MATLAB TIPS
- Use uint8 when writing back images into image files.
- You can do the processing as doubles but when you write the image back as a .bmp use uint8 to type cast the data.
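A minimal sketch of the cast, assuming the doubles are already on a 0 to 255 scale (if yours are in [0, 1], multiply by 255 first). The sample values and the temporary file name are made up for illustration.

```matlab
% Result of some processing, held as doubles; values may leave [0, 255].
I = [-5.2 0.4 127.5; 254.6 300 64];

% uint8() rounds to the nearest integer and saturates at 0 and 255.
I8 = uint8(I);

% Write the cast image; the file name here is arbitrary.
fname = [tempname '.bmp'];
imwrite(I8, fname);
delete(fname);   % clean up the temporary file created by this sketch
```

Note that the cast clips silently, so anything below 0 or above 255 is lost. Rescale first if those values matter.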
- When displaying the images, especially the Laplacian, use imagesc(I) or imshow(I, []) to display the images in a scaled environment
- If you want to show multiple plots in the same image, the subplot function can do it for you
- Use the cell structure to store data of different types. When you subsample the image, a 3D array of fixed dimensions may prove painful, since every pyramid level has a different size.
- When displaying images using image/imagesc, use colormap() to change the display colormap
- This would not be a problem if you are using imshow()
- The separable Gaussian can be created by substituting (-2, -1, 0, 1, 2) for x in the equation of the Gaussian (assume sigma = 1)
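The kernel construction can be sketched as follows. Normalizing the taps so they sum to 1 is an addition of this sketch (it keeps the overall image brightness unchanged), and the random test image and the names g, sep and full2d are hypothetical.

```matlab
sigma = 1;
x = -2:2;

% Sample the 1-D Gaussian at x and normalize so the taps sum to 1.
g = exp(-x.^2 / (2 * sigma^2));
g = g / sum(g);

% Separability: filtering along rows and then along columns with g gives
% the same result as filtering once with the 5x5 outer product g.' * g.
I = rand(32, 32);
sep    = conv2(conv2(I, g, 'same'), g.', 'same');
full2d = conv2(I, g.' * g, 'same');

fprintf('max difference = %g\n', max(abs(sep(:) - full2d(:))));
```

The two-pass version needs 10 multiplications per pixel instead of 25, which is the reason to prefer the separable form.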
- ASSIGNMENT 2
- DATA files
- USEFUL ASSIGNMENT TIPS
- Slides 55 and 56 of the lecture (MultipleViews1.ppt) on 3/23/2006 give SSD and cross-correlation, respectively
- This assignment requires you to compute the disparity between the two images of a stereo pair. The inputs to your program are the two images and the number of levels for the multi-scale option. The output of your code is the disparity as estimated by your algorithm.
- Since the two images are rectified, for every point in the left image you can limit your search to the corresponding scanline in the right image. So if you are currently searching for the block located at (x, y) = (5, 10) in the left image, you only need to search for its correspondence on the 10th row of the right image.
- You are thus computing, at every pixel in the left image, a number that identifies the best correspondence on the scanline in the right image. So the disparity can be represented as a matrix of numbers with the same number of rows and columns as the image (you need to handle boundary conditions).
- You can also check how good your estimate is by computing the difference between the estimated disparity and the ground truth.
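A single-scale version of the scanline search with SSD can be sketched as below. Everything specific here is an assumption of the sketch: the synthetic image pair (the right image is the left one shifted by 3 pixels), the convention that left column c matches right column c - d, the window size, the search range, and the choice to leave border pixels at 0.

```matlab
% Synthetic rectified pair: true disparity is 3 wherever the window fits.
L = rand(40, 60);
d_true = 3;
R = zeros(size(L));
R(:, 1:end - d_true) = L(:, 1 + d_true:end);

half  = 2;               % window is (2*half + 1) x (2*half + 1)
max_d = 8;               % largest disparity searched
[rows, cols] = size(L);
D = zeros(rows, cols);   % pixels too close to the border stay 0

for r = 1 + half : rows - half
    for c = 1 + half + max_d : cols - half
        block = L(r - half : r + half, c - half : c + half);
        best = inf;
        for d = 0:max_d
            % Candidate block on the same scanline of the right image.
            cand = R(r - half : r + half, c - d - half : c - d + half);
            ssd = sum((block(:) - cand(:)).^2);
            if ssd < best
                best = ssd;
                D(r, c) = d;
            end
        end
    end
end

% Compare against the known ground truth on the pixels that were searched.
inner = D(1 + half : rows - half, 1 + half + max_d : cols - half);
err = mean(abs(inner(:) - d_true));
fprintf('mean absolute disparity error = %g\n', err);
```

The triple loop is slow on real images. It is written this way only to make the search explicit.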
- How does Left-to-right and right-to-left comparison work?
- Once you have a left-to-right match, switch the left and right labels and do the stereo analysis again. The question is whether the correspondence remains the same. If it does, within a threshold, the match is good; otherwise it is a hole. - Chandra
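The check can be sketched for a single scanline as follows. The two disparity vectors are made-up numbers, and the sketch assumes the convention that left pixel x matches right pixel x - D_lr(x) while D_rl stores the disparity found from the right image's point of view. Marking holes with NaN is just one choice.

```matlab
% Made-up disparities for one scanline.
D_lr = [0 0 2 2 2 2 5 2];   % left-to-right result
D_rl = [2 2 2 2 0 2 0 0];   % right-to-left result
thresh = 1;

cols = numel(D_lr);
valid = false(1, cols);
for x = 1:cols
    xr = x - D_lr(x);       % matched column in the right image
    if xr >= 1 && xr <= cols
        % Keep the match if the right-to-left disparity agrees within thresh.
        valid(x) = abs(D_lr(x) - D_rl(xr)) <= thresh;
    end
end

D_checked = D_lr;
D_checked(~valid) = NaN;    % NaN marks a hole
disp(D_checked);
```

For a full image the same test is applied on every row.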
- I am not sure how we are going to display the stereo analysis results
- Disparity is a matrix of floating point numbers having the same dimensions as the left/right image (at each level). You can display this as a surface or as intensity. - Chandra
- Just think of the matrix as a collection of numbers that you would like MATLAB to display.
- The disparity could vary between, say, 20 and 50, i.e. the maximum disparity over the entire disparity image is 50 and the minimum is 20.
- An image requires the intensity to lie between 0 and 255, so linearly scale the numbers from 20 to 50 so that they lie between 0 and 255: (x - 20)/(50 - 20)*255.
- imshow does this for you: imshow(D, [min(min(D)), max(max(D))]). Read the description of imshow
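The linear scaling can be sketched as follows, with a tiny made-up disparity matrix D. The display calls are left commented out so the script also runs without a figure window.

```matlab
% Made-up disparity map with values between 20 and 50.
D = [20 25; 35 50];

% Linear map of [min, max] onto [0, 255]: subtract the minimum first,
% then divide by the range.
lo = min(D(:));
hi = max(D(:));
S = (D - lo) / (hi - lo) * 255;

disp(S);

% Display (uncomment when a figure window is available):
% imshow(uint8(S));
% imshow(D, [lo hi]);   % lets imshow do the scaling
```

If the disparity is constant over the image, hi equals lo and the division fails, so guard against that case in real code.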
- What should be the output of the assignment?
- The output image is the disparity. You can display the floating point image or render it as a surface. If you display it as a surface, the reference image can be used as the texture map. - Chandra
- I have attached the satellite data in zip format, in case you are having difficulty in viewing the raw format. The images in the zip files are in the png format. The images are contrast enhanced to map the greyscale (single channel) values from unsigned short (16 bits/channel) to char (8 bits/channel)
- USEFUL MATLAB TIPS
- Avoid Looping constructs
- Use ":" operator to perform the same kind of operation
- Check on the functions colfilt, blkproc and nlfilter in MATLAB help.
- These functions can be very useful when you are performing similar operations across the entire image
- When performing multiscale disparity estimation, using these functions might be slightly convoluted
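As one illustration of replacing pixel loops with ":" ranges, the SSD cost for a given disparity can be computed for all pixels at once and summed over the window with conv2. This is a sketch under assumptions: the synthetic pair (right image is the left one shifted by 3 pixels), the convention that left column c matches right column c - d, the 5x5 window and the search range are all made up. Only the loop over candidate disparities remains.

```matlab
% Synthetic rectified pair with a known shift of 3 pixels.
L = rand(40, 60);
d_true = 3;
R = zeros(size(L));
R(:, 1:end - d_true) = L(:, 1 + d_true:end);

win   = ones(5);         % 5x5 summation window
max_d = 8;
[rows, cols] = size(L);
cost = inf(rows, cols, max_d + 1);

for d = 0:max_d
    % Squared difference for every pixel at once.
    sq = (L(:, 1 + d:end) - R(:, 1:end - d)).^2;
    % Summing over the window is a convolution with a box of ones.
    cost(:, 1 + d:end, d + 1) = conv2(sq, win, 'same');
end

% Best disparity per pixel is the index of the smallest cost.
[minCost, idx] = min(cost, [], 3);
D = idx - 1;
```

Near the image border conv2 pads with zeros, so the window sums there cover fewer pixels. That is one way of handling the boundary, not the only one.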
- Calling C code in Matlab
- Typically, when a particular section of the matlab code tends to become a performance bottleneck, an option to improve efficiency is to rewrite that section of code in C, create a dll using mex and then call the library in matlab
- Assumption: There is some C compiler installed on your machine
- Run mex -setup at the matlab command prompt
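- Run mex myfile.c to generate myfile.dll (newer MATLAB releases produce a platform-specific MEX extension such as .mexw64 instead of .dll)
- Set the path (available from the File menu) so that MATLAB finds the library when the function is called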
- Read the help on mex for further details
- prcorr2 (written by Peter Rydesäter) is available at Matlab Central. The dll is much faster than the MATLAB version of corr2.
- Dr. E. Simoncelli has a collection of MATLAB code that is very useful for generating image pyramids and performing various operations on them. The code is very efficient and you might find it useful
- Please acknowledge the authors if you are using their code
- If you have a matrix of size [n, m] and another matrix of size [u, v], then the output of conv2 is a matrix of size [n + u - 1, m + v - 1].
- By default, functions like conv2 output the full matrix of convolution values.
- But they also give you the option of extracting a subset of the result, such as the valid data, via a shape parameter that is passed in.
- If you use 'same' as an input parameter, conv2 outputs the central part of the convolution, a matrix of size [n, m], i.e. the size of the first input.
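The three output sizes can be checked quickly with a small sketch. The matrix sizes are arbitrary, and the names s_full, s_same and s_valid are hypothetical.

```matlab
A = rand(6, 8);      % [n, m] = [6, 8]
B = ones(3, 5);      % [u, v] = [3, 5]

s_full  = size(conv2(A, B));            % default 'full': [n+u-1, m+v-1]
s_same  = size(conv2(A, B, 'same'));    % central part, same size as A
s_valid = size(conv2(A, B, 'valid'));   % no zero padding: [n-u+1, m-v+1]

disp([s_full; s_same; s_valid]);
```

Here the sizes come out as [8 12], [6 8] and [4 4].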