An Introduction to Computer Vision. Matthew Dockrey, June 18, 2009
Introduction: State of the art in computer vision. PASCAL Visual Object Classes (VOC) Challenge. Semantic Robot Vision Challenge (SRVC).
PASCAL VOC Data (table by Mark Everingham)
PASCAL VOC Results (table by Mark Everingham)
SRVC
SRVC Results (table by Paul E. Rybski; Alexei Efros)
What is computer vision? Step 1: Image analysis/feature extraction. Step 2: Machine learning/statistical analysis. Step 3: Profit?
Scale-invariant feature transform (SIFT): The standard image feature/descriptor (David Lowe, 2004). The algorithm is patented, but free for non-commercial use. The original implementation is available as binaries only, but GPL'd versions exist.
SIFT – Point Detection (images by David Lowe)
SIFT – Point Descriptor (image by David Lowe)
SIFT Feature: A SIFT feature = row location, column location, orientation, scale, and a 128-dimensional vector.
SIFT Image Matching: 1) Extract SIFT features. 2) Match using nearest-neighbor search. 3) Apply semi-local constraints (orientation, scale, location). 4) Huzzah!
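Step 2 of the matching recipe can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the `Feature` class, the `match_features` function, and the toy 3-value descriptors are assumptions made for the example (real SIFT descriptors have 128 values). The ratio test used to discard ambiguous matches, and its 0.8 threshold, are also an added assumption here, and step 3 (the semi-local constraints) is not implemented.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass
class Feature:
    """One SIFT-style feature, mirroring the fields listed on the slide."""
    row: float
    col: float
    orientation: float               # radians
    scale: float
    descriptor: Tuple[float, ...]    # 128 values in real SIFT; shortened here


def match_features(query: List[Feature], train: List[Feature],
                   max_ratio: float = 0.8) -> List[Tuple[int, int]]:
    """Return (query_index, train_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Brute-force nearest-neighbor search over all train descriptors.
        ranked = sorted((dist(q.descriptor, t.descriptor), ti)
                        for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue
        (d1, best), (d2, _) = ranked[0], ranked[1]
        # Keep the match only if the best neighbor is clearly closer
        # than the runner-up; otherwise the match is ambiguous.
        if d1 < max_ratio * d2:
            matches.append((qi, best))
    return matches


# Toy data (made up for illustration).
query = [Feature(0, 0, 0.0, 1.0, (1.0, 0.0, 0.0)),
         Feature(5, 5, 0.0, 1.0, (0.45, 0.55, 0.0))]
train = [Feature(1, 1, 0.0, 1.0, (0.9, 0.1, 0.0)),
         Feature(9, 9, 0.0, 1.0, (0.0, 1.0, 0.0)),
         Feature(3, 3, 0.0, 1.0, (0.0, 0.0, 1.0))]
matches = match_features(query, train)
```

The first query feature has one clear nearest neighbor and is kept; the second sits about equally far from two train features, so it is dropped as ambiguous. A brute-force search like this is quadratic in the number of features, so larger problems would likely need an approximate nearest-neighbor index instead.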
Goal: Object Classifier. Most object classes won't share exact SIFT features. We need to abstract the properties of the class into a form that we can reason with.
...do we really need context?
No. Image parts: Thomas Hawk
Image: Thomas Hawk
Bag of Words: Comes from computational linguistics (document matching). Cluster features into codebook words (usually using k-means). The image descriptor is a histogram vector counting how many times each word is seen: [5  8  14  2  12  4  3  5  11  26  1  3  ...]
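The codebook and histogram steps can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the function names (`nearest`, `kmeans`, `bag_of_words`) are invented for the example, the 2-value points stand in for 128-dimensional SIFT descriptors, and the k-means starts from hand-picked centers so the result is reproducible.

```python
from math import dist


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: dist(point, centers[k]))


def kmeans(points, initial_centers, iterations=10):
    """Plain Lloyd's algorithm: assign points, then move centers to the mean."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(axis) / len(members) for axis in zip(*members)]
    return centers


def bag_of_words(descriptors, codebook):
    """Histogram counting how many descriptors fall on each codebook word."""
    histogram = [0] * len(codebook)
    for d in descriptors:
        histogram[nearest(d, codebook)] += 1
    return histogram


# Toy descriptors pooled from all training images (made up for illustration).
training_descriptors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = kmeans(training_descriptors, initial_centers=[(0, 0), (10, 10)])

# Descriptors of one new image, turned into its fixed-length histogram vector.
image_vector = bag_of_words([(0.2, 0.1), (9, 9), (12, 10), (10, 10)], codebook)
```

Here the codebook has two words, so every image becomes a 2-element vector regardless of how many features it contains. That fixed length is what lets a standard classifier consume it.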
Support Vector Machine (SVM): Very popular and reasonably fast. Given a set of training vectors and their labels, it builds a classifier that assigns a label to any other vector. (It tries to find a hyperplane that maximizes the margin between the classes.)
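To make the hyperplane idea concrete, here is a rough sketch of a two-class linear SVM trained by subgradient descent on the regularized hinge loss. It is an assumption-laden toy, not the solver used in the talk or in any particular SVM library: the function names, the learning rate, the regularization strength, and the 2-value training vectors are all chosen for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Minimize lam/2 * |w|^2 + hinge loss; labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # The regularizer shrinks w, which is what widens the margin.
            w = [wj * (1 - lr * lam) for wj in w]
            # Points inside the margin (or misclassified) push the hyperplane.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Label a vector by which side of the hyperplane w.x + b = 0 it lies on."""
    return 1 if dot(w, x) + b >= 0 else -1


# Toy, linearly separable training set (made up for illustration).
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

This handles only two classes and a linear boundary. Multi-class problems are commonly reduced to several two-class problems, and kernels allow non-linear boundaries; both are left out of the sketch.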
Training Data: But where do we get training data? Dangers of data sets: Background class? Well framed? Sufficient variation?
Bag of Words Classifier: Putting it all together. For each training image, extract SIFT features; cluster into codebook words; generate the vectors; train the SVM on these vectors. To test an image: extract SIFT features; generate the vector; use the SVM to predict the class.
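The control flow of the steps above can be sketched as one function that takes each stage as a parameter. Everything here is hypothetical scaffolding: `train_bow_classifier` and its parameter names are invented, and the toy stages in the second half are deliberately trivial stand-ins (numbers instead of SIFT descriptors, min/max instead of k-means, nearest training vector instead of an SVM) so the example runs without any vision library.

```python
from math import dist


def train_bow_classifier(images, labels, extract, build_codebook, encode, fit):
    """Wire the training steps together and return a classify(image) function."""
    # 1) Extract local features from every training image.
    per_image = [extract(image) for image in images]
    # 2) Cluster the pooled features into codebook words.
    codebook = build_codebook([f for feats in per_image for f in feats])
    # 3) Generate one histogram vector per training image.
    vectors = [encode(feats, codebook) for feats in per_image]
    # 4) Train the classifier on these vectors.
    predict_vector = fit(vectors, labels)

    def classify(image):
        # Testing reuses the same extractor and the same codebook.
        return predict_vector(encode(extract(image), codebook))

    return classify


# --- Toy stand-ins for each stage (all hypothetical) ---
def extract(image):              # stand-in for SIFT: the "image" is its feature list
    return image


def build_codebook(features):    # stand-in for k-means: two words at the extremes
    return [min(features), max(features)]


def encode(features, codebook):  # histogram of nearest codebook words
    histogram = [0] * len(codebook)
    for f in features:
        word = min(range(len(codebook)), key=lambda k: abs(f - codebook[k]))
        histogram[word] += 1
    return histogram


def fit(vectors, labels):        # stand-in for SVM training: nearest training vector
    def predict_vector(vector):
        best = min(range(len(vectors)), key=lambda i: dist(vector, vectors[i]))
        return labels[best]
    return predict_vector


classify = train_bow_classifier([[0, 0, 1], [9, 10, 10]], ["low", "high"],
                                extract, build_codebook, encode, fit)
result = classify([1, 1, 9])
```

The point of the structure is that the codebook is built once from training data and then frozen, so training and test images are described in the same vocabulary.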
Classifier Results: 68.9% accurate! (Not bad for a couple hours of programming...) Confusion matrix: (figure not reproduced here)
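For readers unfamiliar with the term, a confusion matrix and the accuracy figure derived from it can be computed as in the sketch below. The class names and labels are made up purely for illustration and are not the data behind the 68.9% result; the function names are likewise assumptions.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] = number of items of class i that were predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of items on the diagonal, i.e. predicted as their true class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels for a three-class toy problem.
classes = ["car", "face", "tree"]
true_labels = ["car", "car", "face", "face", "tree", "tree"]
predicted = ["car", "face", "face", "face", "tree", "car"]

matrix = confusion_matrix(true_labels, predicted, classes)
score = accuracy(matrix)
```

The off-diagonal cells are the useful part: they show which classes get mistaken for which, something a single accuracy number hides.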
Any questions?
Other Machine Learning Systems: Randomized trees/forests (image: Wikipedia)
Other Machine Learning Systems: Boosting (image: Kihwan Kim)
Viola-Jones Face Detector (image: Kihwan Kim)
Summed Area Table / Integral Image (image: Nvidia)
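A summed area table stores, for each position, the sum of all pixels above and to the left of it, so the sum over any rectangle takes four lookups. The sketch below is a minimal pure-Python version; the function names and the 3x3 test image are assumptions for illustration. The final lines show the kind of rectangle-difference feature that the Viola-Jones detector evaluates with such a table, again only as a toy.

```python
def integral_image(image):
    """table[r][c] = sum of image[0:r][0:c]; has one extra zero row and column."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (image[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table


def rect_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Toy 3x3 grayscale image (made up for illustration).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)

whole = rect_sum(table, 0, 0, 3, 3)         # all nine pixels
lower_right = rect_sum(table, 1, 1, 3, 3)   # 5 + 6 + 8 + 9

# Two-rectangle feature: left column minus right column.
left_minus_right = rect_sum(table, 0, 0, 3, 1) - rect_sum(table, 0, 2, 3, 3)
```

Building the table costs one pass over the image; after that every rectangle sum is constant time, no matter how large the rectangle is.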
