This document provides an introduction to computer vision. It summarizes the state of the field, including popular challenges like PASCAL VOC and SRVC. It describes commonly used algorithms like SIFT for feature extraction and bag-of-words models. It also discusses machine learning methods applied to computer vision like support vector machines, randomized forests, boosting, and Viola-Jones face detection. Examples of results from applying these techniques to object classification problems are also provided.

What is computer vision?
Step 1: Image analysis/feature extraction
Step 2: Machine learning/statistical analysis
Step 3: Profit?

Scale-invariant feature transform (SIFT)
The standard image feature / descriptor
David Lowe, 2004
The algorithm is patented, but free for non-commercial use
Binaries only for original implementation, but GPL'd versions exist

SIFT Image Matching
1) Extract SIFT features
2) Match using nearest-neighbor search
3) Apply semi-local constraints: orientation, scale, location
4) Huzzah!
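
Step 2 above can be sketched in a few lines of plain Python. This is a minimal illustration, not the original SIFT implementation: it assumes descriptors are already extracted and uses toy 4-D vectors (real SIFT descriptors have 128 dimensions). The ratio test and its 0.8 threshold are an addition not mentioned on the slide, and the semi-local constraints of step 3 are left out.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(query, train, ratio=0.8):
    """Return (query_index, train_index) pairs that pass the ratio test.

    A query descriptor is matched to its nearest neighbour in `train` only
    if that neighbour is clearly closer than the second-nearest one.
    """
    matches = []
    for qi, q in enumerate(query):
        # Brute-force search: distance from q to every train descriptor.
        ranked = sorted((euclidean(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches

# Toy descriptors standing in for the features of two images (assumed data).
image_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
image_b = [[0.0, 0.9, 0.1, 0.0], [0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

matches = match_descriptors(image_a, image_b)
print(matches)
```

The third descriptor in `image_a` is equally far from two candidates, so it is rejected as ambiguous rather than matched. A brute-force search like this is quadratic in the number of features, so larger feature sets usually call for an approximate nearest-neighbor index.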

Goal: Object Classifier
Most object classes won't share exact SIFT features
Need to abstract properties of the class into a form that we can reason with

Bag of Words
Comes from computational linguistics, document matching
Cluster features into codebook words (using k-means, usually)
Image descriptor is a histogram vector counting how many times each word is seen
[5 8 14 2 12 4 3 5 11 26 1 3 ...]
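
The codebook and histogram steps can be sketched as follows. This is a rough illustration under stated assumptions: features are toy 2-D points rather than SIFT descriptors, the k-means is a plain Lloyd's loop seeded with the first k points (chosen only to keep the run deterministic), and the names `kmeans` and `bow_histogram` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means, seeded with the first k points."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bow_histogram(features, codebook):
    """Count how many features fall on each codebook word."""
    hist = [0] * len(codebook)
    for f in features:
        hist[nearest(f, codebook)] += 1
    return hist

# Toy features pooled from all training images (assumed data).
training_features = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
                     [1.0, 0.0], [10.0, 9.0], [9.0, 10.0]]
codebook = kmeans(training_features, k=2)

# Features from one image, turned into its bag-of-words descriptor.
image_features = [[0.2, 0.1], [9.5, 9.8], [0.0, 0.5], [0.4, 0.4]]
hist = bow_histogram(image_features, codebook)
print(hist)
```

The histogram always has one entry per codebook word, so every image maps to a vector of the same length no matter how many features it contains. That fixed length is what lets a standard classifier consume it.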

Support Vector Machine (SVM)
Very popular and reasonably fast
Given a set of training vectors and their labels, it builds a classifier that assigns a label to any other vector
(It tries to find a hyperplane that maximizes the margin between the classes)
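
To make the hyperplane idea concrete, here is a small sketch of a linear, two-class SVM trained by stochastic subgradient descent on the regularised hinge loss. It is a simplification for illustration, not what an SVM library does internally: there are no kernels, the labels must be +1 or -1, and the values of `lam`, `lr` and `epochs` as well as the toy data are assumptions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Fit weights w and bias b for the hyperplane w.x + b = 0."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # Regularisation step: shrinking w favours a wider margin.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # The point is inside the margin or misclassified, so move
                # the hyperplane away from it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Label a vector by the side of the hyperplane it falls on."""
    return 1 if dot(w, x) + b >= 0 else -1

# Toy, linearly separable training vectors and labels (assumed data).
X = [[-2.0, -1.0], [-1.0, -2.0], [-2.0, -2.0],
     [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

Only points with a margin below 1 ever change the weights, which mirrors the idea that the hyperplane is determined by the training vectors closest to the class boundary. For real work an established SVM library is the safer choice.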

Training Data
But where do we get training data?
Dangers of data sets
Background class?
Well framed?
Sufficient variation?

Bag of Words Classifier
Putting it all together
For each training image, extract SIFT features
Cluster into codebook words
Generate the vectors
Train the SVM on these vectors
To test an image:
  Extract SIFT features
  Generate the vector
  Use the SVM to predict the class
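
The order of these steps can be sketched as a small pipeline in which each stage is passed in as a function. Everything below the pipeline is a hypothetical stand-in so the sketch runs without a vision library: `extract` returns ready-made 1-D features instead of SIFT descriptors, `build_codebook` picks two words instead of running k-means, and `fit`/`predict` implement a nearest-mean classifier in place of the SVM.

```python
def histogram(features, codebook, quantize):
    """Count how many features fall on each codebook word."""
    counts = [0] * len(codebook)
    for f in features:
        counts[quantize(f, codebook)] += 1
    return counts

def train(images, labels, extract, build_codebook, quantize, fit):
    """Training side: features -> codebook -> vectors -> classifier."""
    per_image = [extract(img) for img in images]
    pooled = [f for feats in per_image for f in feats]
    codebook = build_codebook(pooled)
    vectors = [histogram(feats, codebook, quantize) for feats in per_image]
    return codebook, fit(vectors, labels)

def classify(image, codebook, model, extract, quantize, predict):
    """Test side: reuse the training codebook, then ask the classifier."""
    vector = histogram(extract(image), codebook, quantize)
    return predict(model, vector)

# --- Hypothetical stand-ins for the real stages ---
def extract(image):
    return image  # each toy "image" is already a list of 1-D features

def build_codebook(features):
    return [min(features), max(features)]  # two words, no k-means

def quantize(feature, codebook):
    return min(range(len(codebook)), key=lambda i: abs(feature - codebook[i]))

def fit(vectors, labels):
    # Nearest-mean classifier standing in for the SVM.
    sums, counts = {}, {}
    for v, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(v))
        sums[lab] = [a + x for a, x in zip(acc, v)]
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(model, vector):
    return min(model, key=lambda lab: sum((m - x) ** 2
                                          for m, x in zip(model[lab], vector)))

# Toy training set (assumed data).
images = [[0.1, 0.2, 0.0, 0.9], [0.0, 0.1, 0.3, 1.0],
          [0.9, 1.0, 0.8, 0.1], [1.0, 0.9, 0.7, 0.2]]
labels = ["low", "low", "high", "high"]

codebook, model = train(images, labels, extract, build_codebook, quantize, fit)
result = classify([0.05, 0.15, 0.95, 0.1], codebook, model,
                  extract, quantize, predict)
print(result)
```

The point of the structure is that the codebook is built once from the training features and then reused unchanged for every test image, so training and test vectors count the same words in the same order.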

Classifier Results
68.9% accurate! (Not bad for a couple hours of programming...)
Confusion matrix:
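
For reference, an accuracy figure and a confusion matrix of this kind can be computed from true and predicted labels as sketched below. The class names and labels are made-up toy data, not the results from this talk, and the helper names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy labels for illustration only.
classes = ["car", "face", "bike"]
true_labels = ["car", "car", "car", "face", "face", "bike", "bike", "bike"]
predicted = ["car", "car", "face", "face", "face", "bike", "car", "bike"]

matrix = confusion_matrix(true_labels, predicted, classes)
for name, row in zip(classes, matrix):
    print(name, row)
print(accuracy(matrix))
```

Off-diagonal entries show which classes get mistaken for which, which a single accuracy number hides.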